guru moorthy

07/03/2023, 10:12 PM
Hi Team, how can I store data in sorted order in Delta Lake?

chris fish

07/03/2023, 10:48 PM
Delta currently doesn't support bucketing, which is the Spark feature that allows for pre-sorted files.
If you just want to order the data on some columns, you can use `sort` or `orderBy` to ensure your data has a specific ordered layout. But this won't have the effect that bucketing has on downstream operations that read that data (faster joins/aggregations). Delta supports min/max-based file skipping, which can speed up filters on sorted or co-located data. Z-order uses a clustering algorithm to produce files that benefit more from this file skipping.
What's the reason you want to sort it?
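
To illustrate why sorted data helps min/max file skipping, here is a rough pure-Python sketch. It is a toy model, not Delta's implementation: the "files" are just list chunks, and the helper names (`split_into_files`, `files_to_read`) and the chunk size are made up for this example.

```python
import random

def split_into_files(values, rows_per_file):
    """Chunk values into toy 'files' and record each file's min/max stats."""
    files = []
    for i in range(0, len(values), rows_per_file):
        chunk = values[i:i + rows_per_file]
        files.append({"min": min(chunk), "max": max(chunk)})
    return files

def files_to_read(files, lo, hi):
    """Count files whose [min, max] range overlaps the filter lo <= value <= hi."""
    return sum(1 for f in files if f["max"] >= lo and f["min"] <= hi)

random.seed(0)
values = list(range(1000))
random.shuffle(values)  # simulate data written in arbitrary order

unsorted_files = split_into_files(values, rows_per_file=100)
sorted_files = split_into_files(sorted(values), rows_per_file=100)

# Filter: 250 <= value <= 260
unsorted_count = files_to_read(unsorted_files, 250, 260)
sorted_count = files_to_read(sorted_files, 250, 260)
print("files read without sorting:", unsorted_count)
print("files read with sorting:", sorted_count)
```

With sorted input each file covers a narrow, non-overlapping range, so the filter touches a single file. With shuffled input nearly every file's min/max range spans the filter, so few or no files can be skipped.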

Sherlock Beard

07/04/2023, 3:13 AM
Why not Z-order?

Jacek

07/04/2023, 8:55 AM
Z-order is not about sorting, but about placing similar data points in the same data files.
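
A small sketch of that idea, under loud assumptions: this uses a textbook bit-interleaving (Morton) key on two small integer columns, and toy "files" made of 16-row chunks. It is not how Delta implements Z-ordering internally; the helper names are hypothetical.

```python
def interleave_bits(x, y, bits=4):
    """Build a Morton (Z-order) key by interleaving the bits of x and y."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z

def chunk_stats(rows, rows_per_file):
    """Chunk (x, y) rows into toy 'files' and record min/max per column."""
    files = []
    for i in range(0, len(rows), rows_per_file):
        chunk = rows[i:i + rows_per_file]
        xs = [p[0] for p in chunk]
        ys = [p[1] for p in chunk]
        files.append({"x": (min(xs), max(xs)), "y": (min(ys), max(ys))})
    return files

def files_to_read(files, column, value):
    """Count files whose min/max range for `column` contains `value`."""
    return sum(1 for f in files if f[column][0] <= value <= f[column][1])

# 16 x 16 grid of points, written as 16 files of 16 rows each
points = [(x, y) for x in range(16) for y in range(16)]

by_x = chunk_stats(sorted(points), 16)  # plain sort on (x, y)
by_z = chunk_stats(sorted(points, key=lambda p: interleave_bits(*p)), 16)

print("sort by x:  x == 5 reads", files_to_read(by_x, "x", 5), "files")
print("sort by x:  y == 5 reads", files_to_read(by_x, "y", 5), "files")
print("Z-order:    x == 5 reads", files_to_read(by_z, "x", 5), "files")
print("Z-order:    y == 5 reads", files_to_read(by_z, "y", 5), "files")
```

The plain sort is ideal for filters on `x` but cannot skip anything for filters on `y`. The Z-order layout is not globally sorted on either column, yet each file holds a compact 4 x 4 block of nearby points, so filters on either column can skip most files.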
👍 2