jacktby jacktby

08/09/2023, 6:20 AM
Hi, for Delta Lake's MERGE INTO syntax, the documentation says we need to specify the source. Does it have to be a normal table, or can it also be an external table or something else, like a data interface?

Tom van Bussel

08/09/2023, 6:44 AM
The source can be anything that you can turn into a DataFrame (when using the Scala or Python API) or anything that you can turn into a SQL query (when using the SQL API). It could even be the result of joining 100 tables.
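
To make that concrete, here is a minimal sketch of a merge whose source is an arbitrary DataFrame, using the delta-spark Python API. The function name `upsert_from_dataframe`, the aliases `t` and `s`, the key column `id`, and the table names in the usage comment are all assumptions made up for illustration; they do not come from this thread.

```python
def upsert_from_dataframe(target, source_df, key="id"):
    """Upsert source_df into target on a single key column.

    target    -- a delta.tables.DeltaTable (assumed to be created by the caller)
    source_df -- any Spark DataFrame: a table, a view, or the result of a join
    key       -- hypothetical join column present on both sides
    """
    (
        target.alias("t")
        .merge(source_df.alias("s"), f"t.{key} = s.{key}")
        .whenMatchedUpdateAll()      # matched rows take the source values
        .whenNotMatchedInsertAll()   # unmatched source rows are inserted
        .execute()
    )


# Hypothetical usage with a live SparkSession named `spark`:
#   from delta.tables import DeltaTable
#   source = spark.table("orders").join(spark.table("customers"), "id")
#   upsert_from_dataframe(DeltaTable.forName(spark, "target_table"), source)
```

The function never inspects how `source_df` was built, which is the point made above: a plain table and a many-way join are handled the same way.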

jacktby jacktby

08/09/2023, 9:12 AM
So the source will always be treated as one complete component and never split up?

Tom van Bussel

08/09/2023, 9:29 AM
I’m not sure if I understand the question. Could you rephrase?

jacktby jacktby

08/09/2023, 9:33 AM
I mean, if the source is a query, do you treat the whole query as one opaque source, or do you join it with the target and then pass that to the optimizer?

Dominique Brezinski

08/09/2023, 4:08 PM
Your question is still not clear. Internally, merge does two joins between the source and the target table, and those are indeed passed through the optimizer. The source is a DataFrame, along with its corresponding lineage and properties.
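
As a rough illustration of the two joins, here is a toy model in plain Python. It rests on an assumption not stated in this thread: that the first join is an inner join used only to find which target files contain matching rows, and that the second is an outer join over just those files to produce the rewritten rows. The function `merge_two_pass`, the file names, and the output name `part-new` are hypothetical, and keys are assumed unique and sortable.

```python
def merge_two_pass(target_files, source_rows, key="id"):
    """Toy upsert over 'files' (dict: file name -> list of row dicts).

    Returns (new_files, touched_file_names). Simplified model only,
    not Delta Lake's actual implementation.
    """
    source_by_key = {row[key]: row for row in source_rows}

    # Join 1 (inner join on key): find files holding at least one matching row.
    touched = {
        name
        for name, rows in target_files.items()
        if any(row[key] in source_by_key for row in rows)
    }

    # Join 2 (full outer join): source rows against rows of touched files only.
    old_rows = {row[key]: row for name in touched for row in target_files[name]}
    rewritten = []
    for k in sorted(old_rows.keys() | source_by_key.keys()):
        if k in source_by_key:
            rewritten.append(source_by_key[k])  # matched: update; source-only: insert
        else:
            rewritten.append(old_rows[k])       # target-only: copied unchanged

    # Untouched files are kept as they are; touched files are replaced.
    new_files = {n: rows for n, rows in target_files.items() if n not in touched}
    if rewritten:
        new_files["part-new"] = rewritten
    return new_files, touched


target = {
    "f1": [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}],
    "f2": [{"id": 3, "v": "c"}],
}
source = [{"id": 2, "v": "B"}, {"id": 4, "v": "D"}]
new_files, touched = merge_two_pass(target, source)
```

In this model only `f1` is rewritten and `f2` is never read in the second pass, which is one plausible motivation for splitting the work into two joins.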

jacktby jacktby

08/10/2023, 12:42 PM
Why use two joins instead of just one? Are there special reasons?