Matthew Powers

05/20/2023, 3:58 PM
A few community updates:
• I gave a presentation to the Dask community showcasing deltadask. Here's the 10 minute recording if you're interested. The Dask community is very excited about Delta Lake.
• I will chat with a core member of the cuDF team on Monday. They have a bunch of users who are also asking for Delta Lake support.
• I will be presenting on why Delta Lake is a great file format for pandas at the Data + AI Summit next month. I will be showing how file skipping allows certain queries to run much faster. Thanks to this PR from @shingo, we will be able to showcase a more beautiful syntax. It would be ideal if someone could implement this Z Order feature, so we could highlight it at the talk as well. Even a naive Z Order implementation will open up huge data skipping opportunities for the pandas community and should make them quite excited.
Will Jones

05/21/2023, 7:56 PM
I think I'll look at adding a Sort-based optimize as a first step towards z-order.
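
For anyone following along, here is a minimal sketch of why a sort-based optimize is a natural first step and what Z-order adds on top. It is plain Python, not the delta-rs implementation: the helper names (`interleave_bits`, `split_into_files`, `files_to_read`) and the toy 16x16 dataset are assumptions made up for illustration. It lays the same rows out two ways, then counts how many "files" a reader would have to open using per-file min/max stats.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Naive Z-order (Morton) key: interleave the low `bits` bits of x and y."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)  # y bits go to odd positions
    return key


def split_into_files(rows, rows_per_file):
    """Chunk already-ordered rows into fixed-size 'files' (hypothetical helper)."""
    return [rows[i:i + rows_per_file] for i in range(0, len(rows), rows_per_file)]


def files_to_read(files, col, value):
    """Count files whose min/max range for column `col` could contain `value`."""
    count = 0
    for f in files:
        vals = [row[col] for row in f]
        if min(vals) <= value <= max(vals):
            count += 1
    return count


# Toy table: every (x, y) pair on a 16x16 grid, 256 rows in total.
rows = [(x, y) for x in range(16) for y in range(16)]

# Sort-based layout: order by x, then y, and write 16 rows per file.
sorted_files = split_into_files(sorted(rows), 16)

# Z-order layout: order by the interleaved key, same file size.
z_files = split_into_files(sorted(rows, key=lambda r: interleave_bits(*r)), 16)

print("sorted  x == 5:", files_to_read(sorted_files, 0, 5))
print("sorted  y == 5:", files_to_read(sorted_files, 1, 5))
print("z-order x == 5:", files_to_read(z_files, 0, 5))
print("z-order y == 5:", files_to_read(z_files, 1, 5))
```

In this toy setup the plain sort skips almost everything for a filter on the leading column but nothing for a filter on the second column, while the Z-order layout skips a good share of files for either column. Real tables have skewed data and non-integer columns, so actual gains will vary.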
