A few community updates:
• I gave a presentation to the Dask community showcasing deltadask. Here’s the 10-minute recording if you’re interested. The Dask community is very excited about Delta Lake.
• I will chat with a core member of the cuDF team on Monday. They have a bunch of users who are also asking for Delta Lake support.
• I will be presenting on why Delta Lake is a great file format for pandas at the Data + AI Summit next month. I will show how file skipping allows certain queries to run much faster. Thanks to this PR from @shingo, we will be able to showcase a more beautiful syntax. It would be ideal if someone could implement this Z Order feature so we could highlight it in the talk as well. Even a naive Z Order implementation would create huge data skipping opportunities for the pandas community and should make them quite excited.
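For anyone new to file skipping, here is a rough sketch of the idea in plain Python. It assumes each data file carries min/max statistics for a column; the `FileStats` class, the `files_to_read` helper, and the file names are hypothetical and only illustrate the concept, they are not the deltalake or deltadask API.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    """Hypothetical per-file metadata: min/max of one column in that file."""
    path: str
    min_value: int
    max_value: int


def files_to_read(files, lower, upper):
    """Keep only files whose [min, max] range overlaps the query range [lower, upper]."""
    return [
        f.path
        for f in files
        if f.max_value >= lower and f.min_value <= upper
    ]


# Three made-up files with non-overlapping value ranges.
files = [
    FileStats("part-0.parquet", 0, 99),
    FileStats("part-1.parquet", 100, 199),
    FileStats("part-2.parquet", 200, 299),
]

# A query for values between 120 and 150 only needs one of the three files.
selected = files_to_read(files, 120, 150)
print(selected)
```

The tighter and less overlapping the per-file ranges are, the more files a query can skip, which is why the physical layout of the data matters so much.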
05/21/2023, 7:56 PM
I think I'll look at adding a Sort-based optimize as a first step towards z-order.
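To make the difference between a plain sort and a Z Order concrete, here is a naive sketch in plain Python. It assumes two small non-negative integer columns and interleaves their bits into a single sort key; `z_order_key` is a hypothetical helper for illustration, not the implementation planned for the optimize command.

```python
def z_order_key(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Morton (Z Order) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # bits of x go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)  # bits of y go to odd positions
    return key


# A tiny 4x4 grid of (x, y) rows.
rows = [(x, y) for x in range(4) for y in range(4)]

# Sort-based layout: ordered by x first, then y, so only x is well clustered.
linear_sorted = sorted(rows)

# Z Order layout: rows that are close in both x and y end up near each other.
z_sorted = sorted(rows, key=lambda r: z_order_key(r[0], r[1]))

print(linear_sorted[:4])
print(z_sorted[:4])
```

With the sorted layout, files get tight min/max ranges on the first column only. With the Z Order layout, both columns get reasonably tight ranges, so filters on either one can skip files.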