Jeremy Jordan
04/20/2023, 12:53 PM
I'm getting a ConcurrentAppendException error. Is this expected? Do I need to implement retries in my write logic? I figured that after enabling the multi-cluster setup, a cluster would wait for the lock to be released and then write a new transaction; I didn't think I would have to worry about retries. Do I have something misconfigured?

ConcurrentAppendException: Files were added to partition [event_date=2023-04-14] by a concurrent update. Please try the operation again.
Conflicting commit:
{
  "timestamp": 1681939596227,
  "operation": "OPTIMIZE",
  "operationParameters": {
    "predicate": [
      "(event_date = '2023-04-14')"
    ],
    "zOrderBy": [
      "customer_id"
    ]
  },
  "readVersion": 5,
  "isolationLevel": "SnapshotIsolation",
  "isBlindAppend": false,
  "operationMetrics": {
    "numRemovedFiles": "...",
    "numRemovedBytes": "...",
    "p25FileSize": "...",
    "minFileSize": "...",
    "numAddedFiles": "...",
    "maxFileSize": "...",
    "p75FileSize": "...",
    "p50FileSize": "...",
    "numAddedBytes": "..."
  },
  "engineInfo": "Apache-Spark/3.2.0-amzn-0 Delta-Lake/2.0.0",
  "txnId": "..."
}
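For context, the conflicting commit above is what an OPTIMIZE job with a Z-Order clause writes to the Delta log when it compacts that partition. A sketch of the kind of job that would produce it, using Delta Lake 2.0 SQL from PySpark (the table name events is hypothetical, not from the thread):

# Sketch of an OPTIMIZE job matching the conflicting commit above
# (operation = OPTIMIZE, predicate on event_date, zOrderBy customer_id).
# "events" is a hypothetical table name.
spark.sql("""
    OPTIMIZE events
    WHERE event_date = '2023-04-14'
    ZORDER BY (customer_id)
""")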
Nick Karpov
04/20/2023, 9:27 PM
> Is this expected? Do I need to implement retries in my write logic?
Yup, check the matrix here: https://docs.delta.io/latest/concurrency-control.html#write-conflicts. The multi-cluster setup provides atomicity for the actual commit operation (not a lock over the entire transaction) for cases when two writers attempt the same commit (00x.json) at the exact same time. Without this setup, one commit would overwrite the other, and the table would be corrupted.
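Per the conflict matrix Nick links, a concurrent OPTIMIZE can conflict with a write that touches the same partition, so the writer is expected to catch the exception and retry. A minimal sketch of that retry loop, assuming the delta-spark Python package; the function name, retry count, and backoff values are illustrative, not from the thread:

# A sketch of retry-on-conflict around an append, assuming the
# delta-spark Python package. Names (append_with_retry, MAX_RETRIES,
# table_path) are illustrative assumptions.
import random
import time

from delta.exceptions import ConcurrentAppendException

MAX_RETRIES = 5  # assumption: tune for your workload


def append_with_retry(df, table_path):
    """Append df to a Delta table, retrying when a concurrent commit
    (e.g. an OPTIMIZE on the same partition) wins the race."""
    for attempt in range(MAX_RETRIES):
        try:
            df.write.format("delta").mode("append").save(table_path)
            return
        except ConcurrentAppendException:
            # Another transaction added files to a partition this write
            # read; back off with jitter, then retry the commit.
            time.sleep(random.uniform(0, 2 ** attempt))
    raise RuntimeError(f"Append failed after {MAX_RETRIES} retries")

Exponential backoff with jitter keeps several writers from repeatedly colliding in the same window after an OPTIMIZE commit lands.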
Jeremy Jordan
04/20/2023, 9:55 PM