
Rajesh Tomar

04/29/2023, 7:54 AM
Hi Team, I have a question. I have a production setup where 4 geo-distributed Spark clusters, each running a Spark Structured Streaming application, write into a single Delta Lake table. The table is partitioned by geo, and within each geo there are partitions for year, month, date, hour and min. So the table layout looks like:
geo1/
 year1/
  month1/
   date1/
    hour1/
     min1
geo2/
geo3/
geo4/
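Each geo's writer does roughly the following (a minimal sketch only: the "rate" source, paths, trigger interval and column derivation below are placeholders, not the actual job; assumes a `spark` session as in spark-shell):

import org.apache.spark.sql.functions._
import org.apache.spark.sql.streaming.Trigger

// Placeholder source; the real job reads the geo's event stream.
val events = spark.readStream
  .format("rate")
  .load()
  .withColumn("geo",   lit("geo4"))
  .withColumn("year",  date_format(col("timestamp"), "yyyy"))
  .withColumn("month", date_format(col("timestamp"), "MM"))
  .withColumn("date",  date_format(col("timestamp"), "dd"))
  .withColumn("hour",  date_format(col("timestamp"), "HH"))
  .withColumn("min",   date_format(col("timestamp"), "mm"))

// All four geos write into the same Delta table, partitioned as shown above.
events.writeStream
  .format("delta")
  .partitionBy("geo", "year", "month", "date", "hour", "min")
  .option("checkpointLocation", "s3://bucket/checkpoints/geo4")  // illustrative path
  .trigger(Trigger.ProcessingTime("1 minute"))                   // illustrative trigger
  .start("s3://bucket/events")                                   // illustrative shared table path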
In one of the applications (let's say geo4), I'm observing an issue: way too many consecutive jobs titled "Delta: Compute snapshot for version: <snapshot version>". There are usually 20-30 such snapshot computation jobs, each taking about a minute. This ends up making a single write take roughly 30 ± 10 minutes, while the actual processing and saving within the job takes ~60s.

The reason there are multiple snapshot computation jobs is that while the geo4 application is computing the snapshot for version v1, the other applications write more versions (v2, v3, v4, ...). geo4 then has to catch up by computing snapshots for v2, v3, v4, during which time the other applications may write even more new versions.

Interestingly, the snapshot computation used to take much less time. I tried running a multi-cluster simulation where such snapshot computation jobs only take about a second. I'm curious why the SnapshotState computation takes so much time here. Has anyone seen something similar before? My suspicion is that the Delta log for my table is now too big, with more than 330k entries.
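For reference, a rough way to check what is sitting in _delta_log and ground that suspicion (the table path below is a placeholder; assumes a `spark` session as in spark-shell):

import org.apache.hadoop.fs.Path

// List the transaction log directory and count commit vs. checkpoint files.
val logDir = new Path("s3://bucket/events/_delta_log")   // illustrative table path
val fs     = logDir.getFileSystem(spark.sparkContext.hadoopConfiguration)
val names  = fs.listStatus(logDir).map(_.getPath.getName)

val commitFiles     = names.count(_.endsWith(".json"))
val checkpointFiles = names.count(n => n.contains(".checkpoint") && n.endsWith(".parquet"))
println(s"commit JSON files: $commitFiles, checkpoint parquet files: $checkpointFiles")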