03/23/2023, 9:04 PM
Hi all, could you please clarify the scenario below? I have been going through various videos and documents, and answers to these questions would help me a lot. If I create a partitioned Delta table and run a MERGE into it where the condition includes both the partition column and the primary key column, will the lookup go through all the files in the partition? Will data skipping stats help here? Will running "OPTIMIZE deltatable ZORDER BY (X, Y)" speed up this lookup? Do I need to run this command after every MERGE?
👀 1
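
To make the data skipping question above concrete, here is a minimal pure-Python sketch. It is not Delta Lake code. It only models the general idea that the engine first prunes by partition and then skips files whose per-file min/max range for the key cannot contain the value being looked up. The names `DataFile`, `write_files`, `files_to_scan`, and the partition values are all hypothetical, and sorting on a single key is used here as a simplified stand-in for the clustering that ZORDER performs across several columns.

```python
from dataclasses import dataclass
from typing import List
import random


@dataclass
class DataFile:
    """Hypothetical stand-in for one data file plus its min/max key stats."""
    partition: str
    keys: List[int]

    @property
    def min_key(self) -> int:
        return min(self.keys)

    @property
    def max_key(self) -> int:
        return max(self.keys)


def write_files(partition: str, keys: List[int], rows_per_file: int) -> List[DataFile]:
    """Split keys into files in the order given (no re-sorting)."""
    return [
        DataFile(partition, keys[i:i + rows_per_file])
        for i in range(0, len(keys), rows_per_file)
    ]


def files_to_scan(files: List[DataFile], partition: str, key: int) -> List[DataFile]:
    """Partition pruning first, then min/max skipping on the key column."""
    return [
        f for f in files
        if f.partition == partition and f.min_key <= key <= f.max_key
    ]


random.seed(0)
keys = list(range(1000))
shuffled = keys[:]
random.shuffle(shuffled)

# Same partition, same rows, two physical layouts.
unclustered = write_files("2023-03-23", shuffled, 100)  # keys scattered across files
clustered = write_files("2023-03-23", keys, 100)        # keys sorted, as after clustering
other = write_files("2023-03-22", keys, 100)            # a different partition

scanned_unclustered = files_to_scan(unclustered + other, "2023-03-23", 500)
scanned_clustered = files_to_scan(clustered + other, "2023-03-23", 500)

print("files scanned without clustering:", len(scanned_unclustered))
print("files scanned with clustering:", len(scanned_clustered))
```

In this toy model, the partition predicate removes the other partition's files in both layouts, but the min/max stats only narrow things down further when the key values are clustered into tight, non-overlapping ranges. When keys are scattered, nearly every file's range covers the lookup value, so the stats prune very little.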

Dhruvil Shah

03/24/2023, 12:15 AM
I have the same question.

Calili Santos

03/24/2023, 1:07 AM
Hi Chandra, I run OPTIMIZE on some tables (via a Databricks job) once a week and get awesome results. Some of these tables have ingestion patterns like your scenario, and we're happy with this schedule.
🙌 1


03/24/2023, 6:57 AM
@Calili Santos @Dhruvil Shah Thank you, Santos. May I ask for a few stats on one big table? 1. Total rows. 2. Number of rows in one of the partitions. 3. How long updating or finding a record in that partition takes. Basically, I want to understand whether, beyond its UPSERT capability, Delta Lake also gives faster lookup of a given record.