I wrote about 60 million records to S3 and created a Delta table on top of them.
Initially, I could see that all the files were created at about 128 MB each.
When I ran a MERGE on day 2, which contained both updates and inserts, all the files were merged into a single file of 786 MB.
If I run OPTIMIZE tableName ZORDER BY columnName, I can see the files are rewritten at about 128 MB again.
These are some of the Spark configs I use.
I am unable to find what triggers this merging of all the files into a single file. Can you please give me an idea?
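
In case it helps to reproduce the observation, here is a minimal sketch of how the file sizes written by each commit could be checked from the table's `_delta_log` JSON files. It assumes the commit file has already been read into a list of lines (for example after downloading it from S3). The helper name `summarize_commit` and the sample lines are hypothetical, not taken from my actual table; only the `commitInfo.operation` and `add.size` fields of the Delta log are relied on.

```python
import json


def summarize_commit(commit_lines):
    """Return (operation, sizes_mb) for one Delta commit file.

    commit_lines: lines of a _delta_log/<version>.json file,
    one JSON action per line.
    """
    operation = None
    sizes_mb = []
    for line in commit_lines:
        line = line.strip()
        if not line:
            continue
        action = json.loads(line)
        if "commitInfo" in action:
            # Name of the operation that produced this commit, e.g. MERGE
            operation = action["commitInfo"].get("operation")
        elif "add" in action:
            # "size" is the file size in bytes; convert to MB
            sizes_mb.append(round(action["add"]["size"] / (1024 * 1024), 1))
    return operation, sizes_mb


# Hypothetical sample resembling the day-2 commit described above
sample = [
    '{"commitInfo": {"operation": "MERGE"}}',
    '{"add": {"path": "part-00000.parquet", "size": 824180736}}',
]
print(summarize_commit(sample))  # ('MERGE', [786.0])
```

Running this over each commit file would show which operation wrote the single large file and how many files it added.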