Yan Zhao

04/01/2023, 3:50 PM
Hello, I have a question about the ConflictChecker. I see the code:
def checkConflicts(): Unit = {
  checkProtocolCompatibility()
  checkNoMetadataUpdates()
  checkForAddedFilesThatShouldHaveBeenReadByCurrentTxn()
  checkForDeletedFilesAgainstCurrentTxnReadFiles()
  checkForDeletedFilesAgainstCurrentTxnDeletedFiles()
  checkForUpdatedApplicationTransactionIdsThatCurrentTxnDependsOn()
  reportMetrics()
}
I want to know how checkNoMetadataUpdates() works. Here is a case to describe it. I have two writers, and both of them will update the metadata. Before the update, the snapshot version is 1. Both of them commit an updateSchema action to the table, and the target metadata structure is the same. The first writer's commit of 2.json succeeds and the second writer's commit of 2.json fails, so the second writer retries the commit and checks for conflicts. In this case, will the second writer's checkNoMetadataUpdates() throw MetadataChangedException?

Scott Sandre (Delta Lake)

04/08/2023, 3:13 AM
If the 2nd writer sees that the first writer made any metadata updates (that the 2nd writer didn’t see), then the 2nd writer will fail. You can check out the code yourself too! It’s pretty clearly documented! Great question! https://github.com/delta-io/delta/blob/master/core/src/main/scala/org/apache/spark/sql/delta/ConflictChecker.scala
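
For readers following along, here is a minimal sketch of the behavior described in this reply. It assumes the check only asks whether the winning commit contains any metadata action and does not compare that metadata with what the retrying writer wants to write. Metadata, WinningCommit, ConflictSketch and the MetadataChangedException class defined here are simplified stand-ins made up for illustration, not the real Delta Lake classes; see the linked ConflictChecker.scala for the actual implementation.

```scala
import scala.util.Try

// Hypothetical, simplified stand-ins (NOT the real Delta Lake types).
final case class Metadata(schemaString: String)
final case class WinningCommit(version: Long, metadataUpdates: Seq[Metadata])
final class MetadataChangedException(msg: String) extends RuntimeException(msg)

object ConflictSketch {
  // Assumption: the check fails as soon as the winning commit carries any
  // metadata action; the contents of that metadata are never inspected.
  def checkNoMetadataUpdates(winning: WinningCommit): Unit =
    if (winning.metadataUpdates.nonEmpty)
      throw new MetadataChangedException(
        s"Metadata changed by concurrent commit at version ${winning.version}")

  // The scenario from the question: writer 1 won 2.json with the same target
  // schema that writer 2 is retrying with.
  def retryOfSecondWriter(): Try[Unit] = {
    val target = Metadata("""{"type":"struct","fields":[]}""")
    val winning = WinningCommit(version = 2L, metadataUpdates = Seq(target))
    Try(checkNoMetadataUpdates(winning))
  }
}
```

Under that assumption, retryOfSecondWriter() ends in a failure holding MetadataChangedException even though both writers targeted identical metadata, while a winning commit with no metadata actions passes the check.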

Yan Zhao

04/09/2023, 3:09 AM
Thanks, I have checked the code.