Rajesh Tomar
04/29/2023, 7:54 AMgeo1/
year1/
month1/
date1/
hour1/
min1
geo2/
geo3/
geo4/
In one of the applications(let's say for geo4), I'm observing an issue. I'm seeing way too many consecutive jobs titled:
Delta: Compute snapshot for version: <snapshot version>.
There are usually 20-30 such snapshot computation jobs each taking about a minute. This ends up making a single write essentially take about 30mins +- 10mins while the actual processing and saving within the job takes ~60s. The reason there are multiple snapshot computation jobs is because: while geo4 application is computing snapshot for version1, other applications write more versions(v2, v3, v4...). Now, geo4 has to catchup with them by computing snapshots for versions v2,v3,v4 in which time, other applications may write more new versions.
Interestingly, the snapshot computation used to take much lesser earlier.
I tried running a multi-cluster simulation where such snapshot computation jobs only take about a second.
I'm curious why the SnapshotState computation takes so much time. Has anyone seen something similar earlier? My suscpicion is that the deltalog is now too big for my table with more than 330k entries.