06/06/2023, 4:13 AM
Hi everyone, I am facing an issue with two jobs that write to the same Delta location. Here are the details:
• Job 1: This job runs hourly and uses the overwrite mode to write data. After writing, it executes the vacuum operation. The purpose of this job is to update recently dated data.
• Job 2: This is a backfill job that processes some past dates. It also uses the overwrite mode, but it does not execute the vacuum operation since the hourly job already takes care of it.
The problem occurs when both jobs run simultaneously. It leads to data conflicts, and as a result, the data overwritten by Job 2 becomes inaccessible.
Error message:
"File does not exist: /delta_path/part-00068-23256f70-a297-416f-8c5d-3650343929b3.c000.zstd.parquet"
Please let me know if you have any suggestions for resolving this issue, i.e. running both jobs at the same time without conflicts.
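For context, both jobs follow roughly the pattern below. This is a simplified sketch rather than the actual job code: the function names are made up for illustration, `/delta_path` is the placeholder path from the error message, `spark` and `df` stand for an existing SparkSession and the DataFrame each job has prepared, and any date or partition filtering is left out.

```python
DELTA_PATH = "/delta_path"  # placeholder path, taken from the error message


def vacuum_statement(path):
    """Build the SQL statement that vacuums a path-based Delta table."""
    return f"VACUUM delta.`{path}`"


def run_hourly_job(spark, df, path=DELTA_PATH):
    """Job 1 (hypothetical name): overwrite recent data, then vacuum."""
    # Write in overwrite mode, as described above.
    df.write.format("delta").mode("overwrite").save(path)
    # Vacuum right after the write; no retention period is given here,
    # so the table's default retention applies.
    spark.sql(vacuum_statement(path))


def run_backfill_job(df, path=DELTA_PATH):
    """Job 2 (hypothetical name): overwrite past dates, no vacuum."""
    df.write.format("delta").mode("overwrite").save(path)
```

The "File does not exist" error shows up when `run_backfill_job` runs while `run_hourly_job` is writing and vacuuming the same path.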