Ritesh Malav

08/03/2023, 5:00 AM
Team, we are writing to a Delta Lake table with:
mode = overwrite
replaceWhere = (hour)
Multiple Spark jobs now write to this Delta Lake table in parallel, each replacing data for a different metering_hour. Before we used Delta Lake these writes were very fast, but with Delta Lake they have become slow. Has anyone seen this behaviour? Each write modifies a different partition, but since there are 100 writes happening in parallel, maybe they are waiting on each other to finish writing to the Delta log, or on something else.
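For reference, each job's write looks roughly like the sketch below. This is a simplified version, not our exact code: the column name `metering_hour`, the helper names `hour_predicate` and `overwrite_hour`, and the string format of the hour value are assumptions for illustration. `df` is expected to be a Spark DataFrame.

```python
def hour_predicate(metering_hour: str, column: str = "metering_hour") -> str:
    """Build the replaceWhere predicate for a single hour.

    Assumes the hour is a trusted string value and that `column`
    (hypothetical name) is the column the jobs overwrite by.
    """
    return f"{column} = '{metering_hour}'"


def overwrite_hour(df, table_path: str, metering_hour: str) -> None:
    """Overwrite only the rows matching one hour in a Delta table.

    Each parallel job would call this with its own metering_hour.
    """
    (
        df.write.format("delta")
        .mode("overwrite")
        .option("replaceWhere", hour_predicate(metering_hour))
        .save(table_path)
    )
```

Every one of the parallel jobs runs a call like `overwrite_hour(df, table_path, hour)` against the same table path, just with a different hour.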