rtyler

02/14/2023, 7:09 PM
This feels like a relatively simple question, but when I ALTER a table, what is actually happening at the protocol level?

Scott Sandre (Delta Lake)

02/14/2023, 7:19 PM
a new Metadata action is committed to the table. commit --> a new N.json delta log commit file is appended to the _delta_log. this new metadata action will contain the new schema or latest table properties
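
To make the commit step concrete, here is a minimal sketch of appending a commit file that carries a single metaData action. The helper names (`schema_string`, `commit_metadata`) and the sample columns are hypothetical. It assumes the usual log layout of zero-padded `N.json` files with one JSON action per line, and it leaves out the protocol and commitInfo actions that real writers also emit.

```python
import json
import os
import tempfile
import uuid


def schema_string(fields):
    """Serialize (name, type) pairs into a Delta-style struct schema string."""
    return json.dumps({
        "type": "struct",
        "fields": [
            {"name": n, "type": t, "nullable": True, "metadata": {}}
            for n, t in fields
        ],
    })


def commit_metadata(log_dir, fields, properties, table_id):
    """Append the next N.json commit holding a single metaData action."""
    os.makedirs(log_dir, exist_ok=True)
    versions = [
        int(f[:-5]) for f in os.listdir(log_dir)
        if f.endswith(".json") and f[:-5].isdigit()
    ]
    version = max(versions, default=-1) + 1
    action = {
        "metaData": {
            "id": table_id,
            "format": {"provider": "parquet", "options": {}},
            "schemaString": schema_string(fields),
            "partitionColumns": [],
            "configuration": properties,
        }
    }
    path = os.path.join(log_dir, f"{version:020d}.json")
    # Mode "x" fails if the file already exists. This is only a local
    # stand-in for the atomic put-if-absent a real writer needs.
    with open(path, "x") as fh:
        fh.write(json.dumps(action) + "\n")
    return version


# Hypothetical table: create it, then "ALTER" it by adding a column and a property.
log_dir = os.path.join(tempfile.mkdtemp(), "_delta_log")
table_id = str(uuid.uuid4())
v0 = commit_metadata(log_dir, [("id", "long")], {}, table_id)
v1 = commit_metadata(
    log_dir,
    [("id", "long"), ("name", "string")],
    {"delta.appendOnly": "true"},
    table_id,
)
print(sorted(os.listdir(log_dir)))
```

Note that the second commit restates the whole schema and configuration rather than a diff against the first, and no data file is touched.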

rtyler

02/14/2023, 7:23 PM
ah, so the whole new schema gets committed as a new action, and then the readers are expected to basically just use the latest metadata action in the log right?

Scott Sandre (Delta Lake)

02/14/2023, 7:24 PM
yup!
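
A rough sketch of the reader side, for illustration only: walk the JSON commits from newest to oldest and stop at the first metaData action found. The function name `latest_metadata`, the helper `_write_commit`, and the sample actions are hypothetical. The sketch ignores checkpoint files, which a real reader would consult before replaying individual commits.

```python
import json
import os
import tempfile


def latest_metadata(log_dir):
    """Return (version, metaData) for the newest metaData action, or None."""
    commits = sorted(
        f for f in os.listdir(log_dir)
        if f.endswith(".json") and f[:-5].isdigit()
    )
    # Zero-padded names sort in version order, so walk them newest first.
    for name in reversed(commits):
        with open(os.path.join(log_dir, name)) as fh:
            for line in fh:
                if not line.strip():
                    continue
                action = json.loads(line)
                if "metaData" in action:
                    return int(name[:-5]), action["metaData"]
    return None


def _write_commit(log_dir, version, actions):
    """Write one commit file with one JSON action per line (demo helper)."""
    with open(os.path.join(log_dir, f"{version:020d}.json"), "w") as fh:
        for a in actions:
            fh.write(json.dumps(a) + "\n")


def _meta(names):
    """Build a pared-down metaData action for the demo."""
    fields = [{"name": n, "type": "string", "nullable": True, "metadata": {}}
              for n in names]
    return {"metaData": {"id": "demo-table",
                         "schemaString": json.dumps({"type": "struct", "fields": fields}),
                         "configuration": {}}}


log_dir = os.path.join(tempfile.mkdtemp(), "_delta_log")
os.makedirs(log_dir)
_write_commit(log_dir, 0, [_meta(["id"])])           # table creation
_write_commit(log_dir, 1, [_meta(["id", "name"])])   # ALTER: full new schema
_write_commit(log_dir, 2, [{"add": {"path": "part-0000.parquet", "size": 1,
                                    "modificationTime": 0, "dataChange": True,
                                    "partitionValues": {}}}])  # data-only commit

version, meta = latest_metadata(log_dir)
columns = [f["name"] for f in json.loads(meta["schemaString"])["fields"]]
print(version, columns)
```

The newest commit here only adds a data file, so the schema in effect still comes from the earlier commit that carried the last metaData action.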

rtyler

02/14/2023, 7:24 PM
in the spark implementation is there any form of ALTERs which would cause parquet file modifications? I know the schemas of the parquet file and the delta log needn't be in lock step

chris fish

02/14/2023, 7:48 PM
nope
only modifies Metadata
i think the approach is the opposite, it treats the parquet files as immutable, and if you try to make a schema modification that doesn’t work, it fails

rtyler

02/14/2023, 7:49 PM
without reading all the parquet files I don't know how it would possibly do that

chris fish

02/14/2023, 7:51 PM
i believe because it's actively tracking the super schema in the metadata
so it doesn’t need to constantly verify with the data files