Also, are there any known limitations in the "vacuum" dry run support? Running it on one of our large tables shows 44K files to remove, while the Spark-based version shows 24.4M files to remove.
rtyler
05/25/2023, 6:46 PM
well, that's quite interesting. Are there any special characteristics of the table worth mentioning, other than bigness?
Michael Nacey
05/25/2023, 8:07 PM
It's in S3, it is partitioned by two columns, minReaderVersion 1, minWriterVersion 2, it's an append table
It's worth noting that when we did not specify "retention = 0 and do not enforce" (i.e. let it use the default retention settings), it showed zero files to be deleted
Robert
05/25/2023, 8:30 PM
would it be possible to do a dry run on both while explicitly setting the retention? Maybe something is off with the defaults, or parsing the table config.
Michael Nacey
05/25/2023, 8:55 PM
we did that and got the numbers in the original post 😉
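
For anyone trying to reproduce the comparison, here is a minimal sketch of the two dry runs discussed above: one with the default retention and one with "retention = 0 and do not enforce". It assumes the `deltalake` Python bindings and their `DeltaTable.vacuum(retention_hours, dry_run, enforce_retention_duration)` method; the thread does not say which binding was actually used. The helper names `dry_run_count` and `compare_dry_runs` and the table URI are hypothetical.

```python
from typing import Dict, Optional


def dry_run_count(table, retention_hours: Optional[int] = None, enforce: bool = True) -> int:
    """Count the files a vacuum dry run reports. Nothing is deleted.

    `table` is assumed to expose a deltalake-style `vacuum` method that
    returns a list of file paths when `dry_run=True`.
    """
    files = table.vacuum(
        retention_hours=retention_hours,     # None lets the table's default retention apply
        dry_run=True,                        # only list candidates, never delete
        enforce_retention_duration=enforce,  # must be False to go below the configured minimum
    )
    return len(files)


def compare_dry_runs(table_uri: str) -> Dict[str, int]:
    """Run both dry runs from the thread against one table (hypothetical helper)."""
    # Imported lazily so the counting helper above can be used without deltalake installed.
    from deltalake import DeltaTable

    table = DeltaTable(table_uri)
    return {
        "default_retention": dry_run_count(table),
        "retention_0_not_enforced": dry_run_count(table, retention_hours=0, enforce=False),
    }
```

Calling `compare_dry_runs("s3://your-bucket/path/to/table")` (placeholder URI) would give the two counts to set against Spark's `VACUUM ... RETAIN 0 HOURS DRY RUN` output, with Spark's retention duration check disabled for the zero-hour case.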