Athi

01/06/2023, 2:36 PM
Hi folks, I am trying to run multiple concurrent stream writes to the same location but to different partitions, and I am getting ProtocolChangedException: The protocol version of the Delta table has been changed by a concurrent update. I am running in append mode, with a different checkpoint location and a different partition per stream, but the same table location. Any idea how to handle this scenario?

Quentin Ambard

01/06/2023, 3:23 PM
Hmm, that's strange. Maybe compactions running in the background? Are you on Databricks?

Athi

01/06/2023, 3:36 PM
Yeah.

Quentin Ambard

01/06/2023, 3:38 PM
Is auto compaction on?

Athi

01/06/2023, 9:02 PM
I'm using the defaults.
df.writeStream.format('delta').outputMode('append').option('checkpointLocation', <distinctfordifferentStreams>).partitionBy('snapshotTimestamp').start('<same location>')

Ryan Zhu

01/06/2023, 10:44 PM
This is a known issue. If you have multiple concurrent streaming writers, it's better to create the table before starting them. Otherwise, all of the writers will try to create the table concurrently, and you will see errors like ProtocolChangedException.
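
To illustrate the suggestion above, here is a minimal sketch of creating the table up front, assuming a path-based Delta table and a Spark session with Delta Lake configured. The helper names `create_table_ddl` and `ensure_table`, the example path, and the `payload` column are hypothetical. The thread only mentions the partition column `snapshotTimestamp`, and its `TIMESTAMP` type is an assumption.

```python
# Sketch only: the column list is hypothetical; the thread names only the
# partition column snapshotTimestamp, and its TIMESTAMP type is assumed.

def create_table_ddl(path, columns, partition_cols):
    """Build an idempotent CREATE TABLE statement for a path-based Delta table."""
    cols = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    parts = ", ".join(partition_cols)
    return (
        f"CREATE TABLE IF NOT EXISTS delta.`{path}` ({cols}) "
        f"USING DELTA PARTITIONED BY ({parts})"
    )


def ensure_table(spark, path, columns, partition_cols):
    """Run the DDL once on the driver, before any stream is started."""
    spark.sql(create_table_ddl(path, columns, partition_cols))


# Usage, assuming an existing SparkSession named `spark`:
# ensure_table(
#     spark,
#     "/tmp/delta/events",  # hypothetical table path
#     [("payload", "STRING"), ("snapshotTimestamp", "TIMESTAMP")],
#     ["snapshotTimestamp"],
# )
# After this returns, start each stream with its own checkpointLocation.
```

Because the table and its protocol version already exist when the streams start, no writer should need to create the table in its first commit, which is the step the writers were racing on.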

Athi

01/08/2023, 3:03 PM
Just confirming my understanding: we need to create the table before starting multiple concurrent streams, and that table will then support concurrent writes?

Ryan Zhu

01/09/2023, 1:42 PM
Once you create the table before starting the concurrent streams, it should support append-only concurrent writes in all cases. See https://docs.delta.io/latest/concurrency-control.html for writes that are not append-only.

Athi

01/09/2023, 4:56 PM
Will check it out. Thank you!