
Vishal Kadam

02/03/2023, 6:58 PM
Hi all, is the Delta merge function async? That is, will the next line of code execute without waiting for the merge operation to complete?

JosephK (exDatabricks)

02/03/2023, 7:02 PM
Typically, things are done in order. If you’re using Spark and submit an application, it doesn’t run everything at once; it runs from the top down.

Dominique Brezinski

02/03/2023, 7:23 PM
Definitely not async. Why do you ask?

Vishal Kadam

02/04/2023, 3:09 AM
@Dominique Brezinski I observed this while running Spark on AWS Lambda with the Delta library. In our scenario, we run a merge from a DataFrame into a Delta table on S3, and in the next step we put records into a DynamoDB table using the same DataFrame. I observed that the data was not in the Delta table but was present in the DynamoDB table.
Also, I could not deep dive into it because it does not give us the Spark UI.

Dominique Brezinski

02/04/2023, 3:11 AM
That is likely because your merge conditions did not append or update any data,
so the merge ran to completion and then the DDB write occurred.

Vishal Kadam

02/04/2023, 3:14 AM
But it should add the data to the S3 Delta table and then write to DDB. However, I was not able to deep dive into it.

Dominique Brezinski

02/04/2023, 3:16 AM
It won’t add the data to the S3 Delta table if the merge logic does not trigger an insert, update, or delete. I am saying the merge completed before the DDB write, but I suspect the merge did not do what you were expecting.
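Without the Spark UI, one way to check this guess is to read the metrics Delta records for the table's latest commit. The following is a minimal sketch, assuming delta-spark is installed and that the latest commit is the merge in question; `last_operation_metrics` and `merge_changed_rows` are hypothetical helper names, and `table_path` is a placeholder for the real S3 location.

```python
def last_operation_metrics(spark, table_path):
    """Return (operation, metrics) for the latest commit of a Delta table.

    Hypothetical helper: `spark` is an active SparkSession and `table_path`
    is a placeholder such as an s3a:// URI.
    """
    from delta.tables import DeltaTable  # lazy import; requires delta-spark

    row = DeltaTable.forPath(spark, table_path).history(1).collect()[0]
    # operationMetrics is a map of string -> string.
    return row["operation"], dict(row["operationMetrics"])


def merge_changed_rows(metrics):
    """True if a MERGE commit inserted, updated, or deleted at least one row."""
    keys = (
        "numTargetRowsInserted",
        "numTargetRowsUpdated",
        "numTargetRowsDeleted",
    )
    # Metric values arrive as strings, so convert before summing.
    return sum(int(metrics.get(key, 0)) for key in keys) > 0
```

If the latest operation is `MERGE` and `merge_changed_rows` returns `False`, the merge ran but its conditions matched nothing to insert, update, or delete. If the latest operation is not a merge at all, no merge commit was written.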

Vishal Kadam

02/04/2023, 3:18 AM
Okay, understood.

Dominique Brezinski

02/04/2023, 3:20 AM
That is my best guess. Spark code runs sequentially.

Gerhard Brueckl

02/06/2023, 8:14 AM
Maybe you simply forgot the .execute() at the end of your merge call? (Happened to me once and took me ages to find it.)
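For reference, the pattern being described looks roughly like this. It is a sketch, not the code from this thread: `upsert` is a hypothetical wrapper, `target` is assumed to be a `delta.tables.DeltaTable`, `source` a Spark DataFrame, and the aliases `t` and `s` and the update-all/insert-all clauses are illustrative choices.

```python
def upsert(target, source, condition):
    """Merge `source` into `target` and return only after the merge finishes.

    Hypothetical wrapper: `target` is assumed to be a delta.tables.DeltaTable,
    `source` a Spark DataFrame, and `condition` a SQL string such as
    "t.id = s.id" (the column name is a placeholder).
    """
    builder = (
        target.alias("t")
        .merge(source.alias("s"), condition)
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
    )
    # Up to this point only a merge builder exists; nothing has been written.
    # execute() runs the merge and blocks until it commits or raises.
    builder.execute()
```

Any code placed after the call to `upsert`, such as a DynamoDB write, therefore runs only once the merge has finished. If the `.execute()` line is dropped, no error is raised and no merge happens.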

Vishal Kadam

02/07/2023, 5:22 AM
I used .execute()