Matthew Powers
02/21/2023, 4:12 PM
• delta-rs: DeltaTable("../rust/tests/data/simple_table", version=2)
• delta-spark: DeltaTable.forPath(spark, "/path/to/table")
  - no version argument available
Are there any implications of this difference we should think about?
Will Jones
02/21/2023, 4:17 PM
[...] DeltaTable class we have has methods that make sense to call on past versions. [...] DeltaTable, and instead keep as separate functions (or maybe another class)
Matthew Powers
02/21/2023, 6:23 PM
DeltaTable("../rust/tests/data/simple_table", version=2).optimize() would be weird. I’m just wondering if DeltaTable("../rust/tests/data/simple_table", version=2) would ever back us into a corner. Perhaps with something like deletion vectors if they don’t count as a “new version” (I’m not saying that’s the case, just brainstorming out loud).
Will Jones
02/21/2023, 6:26 PM
Yeah I think the distinction to make is that DeltaTable represents a table at some particular time, and not the table in general. The only implication I can think of right now is that we shouldn’t implement operations on the table as methods; the methods should just be for extracting information from the log.
> Perhaps with something like deletion vectors if they don’t count as a “new version” (I’m not saying that’s the case, just brainstorming out loud).
It should always be sound. Any change to the table, no matter how small, creates a new transaction / log file / version.
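A minimal sketch of the model described above, in plain Python with no Delta library involved. The names TableSnapshot, load_snapshot, and compact are hypothetical and are not part of delta-rs or delta-spark, and the "log" here is just a list of commits, assumed only for illustration. It shows a read-only object pinned to one version, plus a table operation written as a free function that appends a new commit instead of mutating a snapshot.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple

# Hypothetical stand-in for a transaction log: log[i] is the list of
# actions committed as version i. This is NOT the real Delta log format.
Action = Tuple[str, str]  # ("add" | "remove", file name)


@dataclass(frozen=True)
class TableSnapshot:
    """Read-only view of the table at one version (illustrative only)."""
    version: int
    files: FrozenSet[str]


def load_snapshot(log: List[List[Action]], version: Optional[int] = None) -> TableSnapshot:
    """Replay commits 0..version; default to the latest version."""
    latest = len(log) - 1
    if version is None:
        version = latest
    if not 0 <= version <= latest:
        raise ValueError(f"version {version} is outside 0..{latest}")
    files = set()
    for commit in log[: version + 1]:
        for op, name in commit:
            if op == "add":
                files.add(name)
            elif op == "remove":
                files.discard(name)
            else:
                raise ValueError(f"unknown action {op!r}")
    return TableSnapshot(version, frozenset(files))


def compact(log: List[List[Action]], new_file: str) -> int:
    """Toy 'optimize' as a free function: always works on the latest
    state and records the change as a new commit (a new version)."""
    current = load_snapshot(log)
    commit = [("remove", name) for name in sorted(current.files)]
    commit.append(("add", new_file))
    log.append(commit)
    return len(log) - 1


# Three commits give versions 0, 1, and 2.
log = [
    [("add", "part-0.parquet")],
    [("add", "part-1.parquet")],
    [("add", "part-2.parquet")],
]
pinned = load_snapshot(log, version=2)
new_version = compact(log, "compacted.parquet")
latest = load_snapshot(log)
```

Because the snapshot is frozen and the operation lives outside it, the question of what an operation on an old version would mean never comes up: pinned still describes version 2 after compact runs, and the change is visible only as a new version.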
Matthew Powers
02/21/2023, 6:28 PM
rtyler
02/21/2023, 7:33 PM
> Yeah I think the distinction to make is that DeltaTable represents a table at some particular time, and not the table in general.
FWIW I think this is a good decision we made in delta-rs and think it's a missed opportunity in delta-spark. (I like OOP but sometimes it encourages what might otherwise be silly decisions)