Hi Team, will I get duplicate records if I re-run a partially failed MERGE command (some records were already updated or inserted) against the same target Delta table, given that the source Delta table gets updated before each run?
Assume the following scenario:
Note: Each record contains a primary key, and the merge is performed on that key.
In Run-1, we have source S1 with 10 records (5 updates, 5 inserts) and a target with 100 records. After the merge, the target Delta table contains 105 records.
In Run-2, we have source S2 with 5 records (2 updates, 3 inserts) and a target with 105 records. The merge failed after writing partially (2 records were inserted and 1 record was updated).
Because Run-2 failed, we re-ran the job. By this time the source Delta table had been updated, so S2 now contains 7 records (3 updates, 4 inserts): the previous 5 records (2 updates, 3 inserts) plus 2 new records (1 update, 1 insert). If I run the merge now on the target Delta table, will it insert the already inserted records again and create duplicates?
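To make the scenario concrete, here is a small pure-Python sketch of what I would expect from a merge keyed on the primary key. It is only a dictionary-based simulation, not Delta Lake code: the helper `merge_by_key`, the key ranges, and the values are all made up for illustration, and it assumes the partially written rows from the failed run really are visible in the target.

```python
def merge_by_key(target, source):
    """Simulated upsert: match rows on the primary key (the dict key)."""
    merged = dict(target)
    for key, row in source.items():
        # Matched key -> row is overwritten (update).
        # Unmatched key -> row is added (insert).
        merged[key] = row
    return merged


# Target after Run-1: 105 records with hypothetical keys 1..105.
target = {k: f"v1-{k}" for k in range(1, 106)}

# Run-2 source S2: 2 updates (keys 1, 2) and 3 inserts (keys 106..108).
s2 = {1: "v2-1", 2: "v2-2", 106: "v2-106", 107: "v2-107", 108: "v2-108"}

# Assumed partial failure: 1 update and 2 inserts were written.
partial_target = dict(target)
partial_target.update({1: "v2-1", 106: "v2-106", 107: "v2-107"})

# Rerun source: the previous 5 records plus 1 new update and 1 new insert.
s2_rerun = dict(s2)
s2_rerun.update({3: "v2-3", 109: "v2-109"})

result = merge_by_key(partial_target, s2_rerun)
clean_result = merge_by_key(target, s2_rerun)

print(len(result))             # 109
print(result == clean_result)  # True
```

If the real MERGE behaves like this simulation, the rerun would end with 109 records and no duplicates, because the keys that were already inserted would match and be updated instead of being inserted a second time.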
Any help on this would be appreciated.