
Thanhtan Le

09/18/2023, 6:27 AM
Hello team, I have a question about a Delta Lake table. I ran a MERGE query with Spark SQL; the query contains only INSERT and UPDATE clauses, no DELETE clause. The operation metrics are in the image. When I compared the table before and after this Delta version with a COUNT() query, I found that some records had been removed (4314 records), which matches the "numSourceRows" metric. Why does this happen?
The total record count in the new version (6459) is lower than in the older version (6458). The only difference between the two versions is a MERGE (INSERT and UPDATE).
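The before/after comparison described above can be reproduced with Delta time travel. This is a minimal sketch, not the query used in the thread: the table name `my_table` and the caller-supplied `run_scalar` function (something that runs a query and returns its single value) are assumptions for illustration.

```python
from typing import Callable


def count_query(table: str, version: int) -> str:
    """Build a time-travel COUNT query for one Delta table version."""
    return f"SELECT COUNT(*) FROM {table} VERSION AS OF {int(version)}"


def row_count_change(
    run_scalar: Callable[[str], int],
    table: str,
    old_version: int,
    new_version: int,
) -> int:
    """Return rows(new_version) - rows(old_version).

    A negative result means the newer version holds fewer rows.
    `run_scalar` is a hypothetical helper supplied by the caller.
    """
    before = run_scalar(count_query(table, old_version))
    after = run_scalar(count_query(table, new_version))
    return after - before
```

With PySpark, `run_scalar` could be `lambda q: spark.sql(q).collect()[0][0]`, and `row_count_change(run_scalar, "my_table", 6458, 6459)` would give the net row change across the MERGE commit.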

Paddy

09/18/2023, 9:46 AM
Could you paste the query you ran, and a screenshot of the entire row of history of version 6459?

Thanhtan Le

09/18/2023, 9:52 AM
This is the query I ran.

Paddy

09/18/2023, 9:56 AM
And the entire row of version 6459?
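The history row requested here can be fetched with `DESCRIBE HISTORY`, and its `operationMetrics` map can then be checked against the observed count change. This is a rough sketch, assuming the MERGE metric keys `numTargetRowsInserted` and `numTargetRowsDeleted` are present (metric values arrive as strings) and using a hypothetical table name; the metrics reported on DBR may differ.

```python
from typing import Mapping


def history_query(table: str, limit: int = 5) -> str:
    """Build a query returning the most recent commits of a Delta table."""
    return f"DESCRIBE HISTORY {table} LIMIT {int(limit)}"


def expected_row_change(metrics: Mapping[str, str]) -> int:
    """Net row change implied by a MERGE commit's operationMetrics.

    Updated and copied rows do not change the row count, so only
    inserts and deletes are considered. Missing keys are treated as 0
    (an assumption, not documented behavior).
    """
    inserted = int(metrics.get("numTargetRowsInserted", 0))
    deleted = int(metrics.get("numTargetRowsDeleted", 0))
    return inserted - deleted


def merge_is_consistent(metrics: Mapping[str, str], observed_change: int) -> bool:
    """True when the observed COUNT(*) difference matches the metrics."""
    return expected_row_change(metrics) == observed_change
```

If a MERGE without a DELETE clause reports zero deleted rows while the observed count still drops, `merge_is_consistent` returns `False`, which is a concrete discrepancy to include in a support request.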

Thanhtan Le

09/18/2023, 9:58 AM
[Attachments: IMG_9444.jpg, IMG_9445.jpg, IMG_9443.jpg]

Paddy

09/18/2023, 10:01 AM
This is suspicious… I see you are using DBR (Databricks Runtime), not open-source Delta Lake (delta.io). Could you contact Databricks support to investigate? DBR has its own MERGE algorithm.

Thanhtan Le

09/18/2023, 10:09 AM
Perfect, I will contact the Databricks team to check that.