Rahul Sharma
01/24/2023, 11:10 AM
An error was encountered:
An error occurred while calling o86.sql.
: org.apache.spark.sql.delta.DeltaUnsupportedOperationException: Cannot perform Merge as multiple source rows matched and attempted to modify the same
target row in the Delta table in possibly conflicting ways. By SQL semantics of Merge,
when multiple source rows match on the same target row, the result may be ambiguous
as it is unclear which source row should be used to update or delete the matching
target row. You can preprocess the source table to eliminate the possibility of
I have the data below for raw and refine and am performing a merge. I have manually verified that the data is not duplicated or identical in raw and refine.
raw-data
+--------+--------------+-------------+------------+--------------------+-------------------+------------------+--------------+-----------------------+-----------------------+------------------+-----------+---------------------+----+---------+--------
|10|10000.0000 |10000 |0 |1000.0000 |0.0000 |0 |156 |2021-12-08 17:12:59.457|2023-01-23 17:44:13.6 |null |null |true |u |1674476053600 |false
|11|200000.0000 |10002 |0 |9990.0000 |0.0000 |0 |1 |2021-12-16 18:40:12.16 |2023-01-24 13:18:44.047|null |null |true |u |1674546524047 |false
+--------+--------------+-------------+------------+--------------------+-------------------+------------------+--------------+-----------------------+-----------------------+------------------+-----------+---------------------+----+---------+--------
refine-data
+--------+--------------+-------------+------------+--------------------+-------------------+------------------+--------------+-----------------------+-----------------------+------------------+-----------+---------------------+----+---------+----------
|10|10000.0000 |10000 |0 |1000.0000 |0.0000 |0 |156 |2021-12-08 17:12:59.457|2023-01-23 15:55:10.71 |null |null |true |u |1674469510710 |false
|11|200000.0000 |10000 |0 |9990.0000 |0.0000 |0 |1 |2021-12-16 18:40:12.16 |2023-01-23 11:07:57.187|null |null |true |u |1674452277187 |false
+--------+--------------+-------------+------------+--------------------+-------------------+------------------+--------------+-----------------------+-----------------------+------------------+-----------+---------------------+----+---------+-----------
please look into it pro-actively
Kashyap Bhatt
01/24/2023, 2:57 PM
Can you post a reproducible example? code snippet..
Rahul Sharma
01/24/2023, 4:07 PM
i have a cdc platform so i can’t provide reproducible example
Nick Karpov
01/24/2023, 4:21 PM
raw and refine-data that cause the problem? can you share the exact query then
Rahul Sharma
01/24/2023, 4:29 PM
%%sql
MERGE INTO delta_test.test_refine v
USING raw u
ON v.userID=u.userID
WHEN MATCHED AND (u.__op = "d" and u.__deleted='false')
THEN DELETE
WHEN MATCHED AND u.__op = "u"
THEN UPDATE SET *
WHEN NOT MATCHED AND (u.__op = "c" or u.__op = "r")
THEN INSERT *
Kashyap Bhatt
01/25/2023, 4:53 PM
Rahul Sharma [5:10 AM]
please look into it pro-actively
Kashyap Bhatt [8:57 AM]
Can you post a reproducible example? code snippet..
Rahul Sharma [10:07 AM]
i have a cdc platform so i can’t provide reproducible example
Perhaps someone with a vested interest, like Databricks folks, have time and are willing to spend it on this without the info.. I don't unfortunately.
Nick Karpov
01/25/2023, 5:21 PM
i found the issue if we have multiple data of same userid in one batch then this will give an error.
@Rahul Sharma awesome glad you figured it out... to my eyes this is exactly what the error message indicated since the start, but is there something we can change that it would have been more clear initially?
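For anyone hitting the same error: the root cause found in this thread (several rows for the same userID in one source batch) can usually be handled by reducing the batch to one row per key before the merge. The following is only a sketch under assumptions, not the fix Rahul actually applied. It reuses the table names and the MERGE clauses from the query posted above. The column `__ts_ms` is a hypothetical name for whatever column orders the CDC events; it does not appear in the thread.

```sql
-- Diagnostic: list keys that occur more than once in the source batch.
-- Any row returned here can trigger the "multiple source rows matched" error.
SELECT userID, COUNT(*) AS row_count
FROM raw
GROUP BY userID
HAVING COUNT(*) > 1;

-- Keep only the latest change per userID, then run the same MERGE as above.
-- ASSUMPTION: __ts_ms is a hypothetical name for the change-timestamp column;
-- substitute the column that orders events in your own table.
MERGE INTO delta_test.test_refine v
USING (
  SELECT *
  FROM (
    SELECT r.*,
           ROW_NUMBER() OVER (PARTITION BY r.userID ORDER BY r.__ts_ms DESC) AS rn
    FROM raw r
  ) ranked
  WHERE rn = 1  -- one surviving row per userID
) u
ON v.userID = u.userID
WHEN MATCHED AND (u.__op = "d" AND u.__deleted = 'false')
  THEN DELETE
WHEN MATCHED AND u.__op = "u"
  THEN UPDATE SET *
WHEN NOT MATCHED AND (u.__op = "c" OR u.__op = "r")
  THEN INSERT *
```

Two caveats with this sketch. The helper column `rn` travels with the source rows; with automatic schema evolution disabled, `UPDATE SET *` and `INSERT *` ignore source columns that the target lacks, but with it enabled `rn` would have to be dropped first. Also, keeping only the latest event changes which clause fires: if a batch holds a create followed by an update for a new userID, the surviving row has `__op = "u"` and the `NOT MATCHED` condition as written would skip it, so the conditions may need adjusting for your CDC feed.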