05/25/2023, 6:42 PM
Also, are there any known limitations in the "vacuum" dry run support? Running it on one of our large tables shows 44K files to remove, while the Spark-based version shows 24.4M files to remove.
05/25/2023, 6:46 PM
well, that's quite interesting. Are there any special characteristics of the table worth mentioning, other than its bigness?
05/25/2023, 8:07 PM
It's in S3, partitioned by two columns, with minReaderVersion 1 and minWriterVersion 2, and it's an append-only table.
It's worth noting that when we did not specify "retention = 0 and do not enforce" (i.e. when we let it use the default retention settings), it showed zero files to be deleted.
05/25/2023, 8:30 PM
would it be possible to do a dry run on both while explicitly setting the retention? Maybe something is off with the defaults, or parsing the table config.
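If both dry runs can dump their file lists, something like the rough sketch below could show where the two diverge. It assumes one tool reports absolute paths and the other reports paths relative to the table root, which may not match your setup; `normalize`, `compare_dry_runs`, and the `table_root` argument are made-up names for illustration, not part of either tool.

```python
def normalize(path, table_root):
    """Strip the table root so absolute and relative paths compare equal."""
    root = table_root.rstrip("/") + "/"
    if path.startswith(root):
        return path[len(root):]
    return path.lstrip("/")


def compare_dry_runs(files_a, files_b, table_root):
    """Return the files reported by only one dry run, plus the overlap count."""
    a = {normalize(p, table_root) for p in files_a}
    b = {normalize(p, table_root) for p in files_b}
    return {
        "only_a": sorted(a - b),  # reported by the first dry run only
        "only_b": sorted(b - a),  # reported by the second dry run only
        "both": len(a & b),       # reported by both
    }
```

Looking at which partitions or file name patterns dominate `only_a` might hint at whether the gap comes from retention handling or from how files are listed.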
05/25/2023, 8:55 PM
we did that and got the numbers in the original post 😉