Dip

04/17/2023, 5:27 AM
Hi all, I'm facing a problem with the pandas 2.0.0 release: calling toPandas() on a DataFrame fails. Here is a link 🔗 that reproduces the issue: https://stackoverflow.com/questions/75804781/how-to-create-pyspark-dataframes-from-pandas-dataframes-with-pandas-2-0-0 . I have now moved to PySpark 3.4.0, but that fails as well because the latest PySpark is not compatible with Delta 2.3.0. Is there a workaround? I just saw that Databricks Runtime 13.0 uses PySpark 3.4.0 with Delta 2.3.0, so I'm wondering how we can achieve the same locally. https://docs.databricks.com/release-notes/runtime/13.0.html

Zach

04/17/2023, 3:21 PM
IIRC Databricks Delta is not the same as OSS Delta Lake (unfortunately!)
🤕 1

Dip

04/17/2023, 3:22 PM
Just wondering 🤔 am I the only person who's having this issue? 😑

Zach

04/17/2023, 3:26 PM
It's also worth noting that pandas 2.0.0 is still not supported in DBR 13. From the page you linked:
pandas from 1.4.2 to 1.4.4
You can of course try to install pandas 2.x, but a major version change like that may cause issues (and appears to be doing so already) and is unsupported.

Zach

04/17/2023, 3:33 PM
Spark 3.4 supporting pandas 2.x is not the same as Databricks 13.x supporting pandas 2.x
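For a local setup, a quick sanity check along these lines might help catch the mismatch before a job fails. This is only a minimal sketch: the helper names are hypothetical, and the two rules it encodes (pandas 2.x needs PySpark 3.4 or newer; Delta 2.3.x does not work with PySpark 3.4) are assumptions taken from this thread, not an official compatibility matrix.

```python
from importlib import metadata
from typing import Optional, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Return the leading (major, minor) of a version string such as '3.4.0'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def pandas_ok_for_pyspark(pandas_version: str, pyspark_version: str) -> bool:
    """Assumption from this thread: pandas 2.x needs PySpark 3.4 or newer."""
    if major_minor(pandas_version)[0] < 2:
        return True
    return major_minor(pyspark_version) >= (3, 4)


def delta_ok_for_pyspark(delta_version: str, pyspark_version: str) -> bool:
    """Assumption from this thread: Delta 2.3.x fails with PySpark 3.4+.

    Only this one known-bad pair is flagged; everything else passes unchecked.
    """
    is_delta_2_3 = major_minor(delta_version) == (2, 3)
    return not (is_delta_2_3 and major_minor(pyspark_version) >= (3, 4))


def installed(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    pandas_v = installed("pandas")
    pyspark_v = installed("pyspark")
    delta_v = installed("delta-spark")
    print("pandas:", pandas_v, "pyspark:", pyspark_v, "delta-spark:", delta_v)
    if pandas_v and pyspark_v:
        print("pandas/pyspark ok:", pandas_ok_for_pyspark(pandas_v, pyspark_v))
    if delta_v and pyspark_v:
        print("delta/pyspark ok:", delta_ok_for_pyspark(delta_v, pyspark_v))
```

Run it inside the virtual environment you use for Spark. It only reads installed package metadata and does not start a Spark session.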

Dip

04/17/2023, 3:34 PM
Yeah 😁 I had essentially figured that out by now while doing my local setup

Zach

04/17/2023, 3:36 PM
Yeah, well I understand the pain! Some of these new releases can be quite exciting. I know I'm very interested in seeing how pandas 2.x can change the data landscape!

Dip

04/17/2023, 3:37 PM
Well, essentially I didn't want to migrate to pandas 2.0.0. The BlackDuckHub scan reported security vulnerabilities for pandas 1.5.3 and I had to bump the version ...