Maheshwar Reddy

08/07/2023, 6:11 AM
I have a Delta table. While browsing its directories, I noticed a _delta_index folder created in each one, and it's bloating our Delta table. A bloom filter was enabled on the table; I dropped it, but the directories still persist. Does anyone know how to drop the delta index and its associated directories?

Tom van Bussel

08/07/2023, 7:48 AM
Have you tried running the VACUUM command? This command should remove all files that are no longer referenced by the Delta log.
👍 1
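
For reference, a minimal sketch of the VACUUM suggestion above. It only builds the SQL text, so it runs without a cluster; the table name `my_db.events` and the helper `vacuum_sql` are hypothetical, and executing the statements assumes an active Spark session with Delta Lake configured. The `RETAIN ... HOURS` and `DRY RUN` clauses are standard VACUUM options; 168 hours matches the usual 7-day default retention.

```python
from typing import Optional


def vacuum_sql(table: str, retain_hours: Optional[int] = None, dry_run: bool = False) -> str:
    """Build a Delta Lake VACUUM statement for the given table (hypothetical helper)."""
    if retain_hours is not None and retain_hours < 0:
        raise ValueError("retain_hours must be non-negative")
    parts = ["VACUUM", table]
    if retain_hours is not None:
        parts += ["RETAIN", str(retain_hours), "HOURS"]
    if dry_run:
        # DRY RUN lists the files that would be deleted without deleting them.
        parts += ["DRY", "RUN"]
    return " ".join(parts)


# Hypothetical table name; replace with your own.
preview = vacuum_sql("my_db.events", dry_run=True)
cleanup = vacuum_sql("my_db.events", retain_hours=168)
print(preview)
print(cleanup)

# With an active Spark session (assumed, not created here):
# spark.sql(preview).show(truncate=False)
# spark.sql(cleanup)
```

Running the dry run first is a cautious way to check whether the unreferenced files are actually picked up before anything is deleted.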

Maheshwar Reddy

08/08/2023, 12:14 AM
The issue was with the Bloom filter index. For every file inserted into the Delta table, a Bloom filter index file was created, and those index files were large. Since we had 80,000 partitions and hundreds of thousands of sub-files, this bloated the table. When I dropped the Bloom filter index (since we didn't really need it) and recreated the same table, the data size went down from 100 TB to 8 TB.
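
A rough sketch of the drop step described above, assuming a Databricks runtime (Bloom filter indexes are a Databricks feature rather than part of open-source Delta Lake). The helper `drop_bloom_filter_sql`, the table name, and the column name are hypothetical; the statement follows the Databricks `DROP BLOOMFILTER INDEX` syntax as far as I know it. The code only builds the SQL text, so it runs without a cluster.

```python
from typing import Sequence


def drop_bloom_filter_sql(table: str, columns: Sequence[str] = ()) -> str:
    """Build a DROP BLOOMFILTER INDEX statement (hypothetical helper).

    With no columns given, the statement targets every Bloom filter on the table.
    """
    stmt = f"DROP BLOOMFILTER INDEX ON TABLE {table}"
    if columns:
        stmt += f" FOR COLUMNS({', '.join(columns)})"
    return stmt


# Hypothetical table and column names; replace with your own.
drop_all = drop_bloom_filter_sql("my_db.events")
drop_one = drop_bloom_filter_sql("my_db.events", ["user_id"])
print(drop_all)
print(drop_one)

# With an active Spark session on Databricks (assumed, not created here):
# spark.sql(drop_all)
```

My understanding of the Databricks documentation is that once a table has no Bloom filters left, the underlying index files are cleaned up when the table is vacuumed, so dropping the index alone may not shrink the directories right away.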