Hugo Saavedra
07/28/2023, 8:09 PM
delta-rs
or is there more of a tendency to "greenfield" things to the extent that it makes sense, and only look to satisfy the letter of the spec rather than parity with other implementations?
3. do these tuesday meets still happen? https://github.com/delta-io/delta-rs#development-meeting
Will Jones
07/28/2023, 8:22 PM
> 1. or are there issues blocking it/design issues that need to be worked out still?
For Rust, no blockers right now. For Python, we need to refactor the API a bit first. I'll create an issue describing that in a little bit.
> 2. do folks typically look to the scala/spark delta implementation to maintain parity and guide implementation
We don't try to imitate their APIs or implementation, but they are good inspiration for test cases, since they've seen quite a few real world bug reports. There is a project called Data Acceptance Tests (DAT) where we are trying to collect test cases that all connectors can use. But it's still a bit early.
> 3. do these tuesday meets still happen?
I think we recently stopped these for now in favor of a different meeting for delta-kernel-rs.
Hugo Saavedra
07/28/2023, 9:00 PM
delta-rs?
Will Jones
07/28/2023, 9:36 PM
> I was thinking Rust but I'm motivated mainly by wanting to learn more about the delta format and delta internals
Then I'd say go right ahead. There are no blockers there at the moment.
> my understanding was that Python was bindings-only
Python uses the Rust implementation for interacting with the delta log, but uses scanners and writers from PyArrow for interacting with the data files. This is because when we started, the Parquet scanners and writers in Rust weren't very mature (they didn't have support for things like predicate pushdown or partitioning written data). We'll be refactoring it soon to switch over to wrapping Rust for data file interaction too.
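For anyone following along to learn the format, here is a rough, standard-library-only sketch of what "interacting with the delta log" means at its simplest: replaying the JSON commit files under `_delta_log` to work out which data files are currently part of the table. It is not how delta-rs or the Python bindings are implemented. The function names and the demo file names are hypothetical, the `add` and `remove` actions are stripped down to just `path`, and checkpoints, protocol and metadata actions are ignored.

```python
import json
import tempfile
from pathlib import Path


def active_data_files(table_path):
    """Replay the JSON commits in _delta_log and return the live data files.

    Simplified sketch: only `add` and `remove` actions are handled, and
    checkpoint files are ignored entirely.
    """
    log_dir = Path(table_path) / "_delta_log"
    live = set()
    # Commit file names are zero-padded version numbers, so sorting the
    # names lexically also sorts them in commit order.
    for commit in sorted(log_dir.glob("*.json")):
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)  # one JSON action per line
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
    return sorted(live)


def build_demo_table(root):
    """Write two hand-made commits (hypothetical file names) for the demo."""
    log_dir = Path(root) / "_delta_log"
    log_dir.mkdir(parents=True)
    commits = [
        # Version 0 adds two files.
        [{"add": {"path": "part-0000.parquet"}},
         {"add": {"path": "part-0001.parquet"}}],
        # Version 1 rewrites the first file into a new one.
        [{"remove": {"path": "part-0000.parquet"}},
         {"add": {"path": "part-0002.parquet"}}],
    ]
    for version, actions in enumerate(commits):
        body = "\n".join(json.dumps(a) for a in actions) + "\n"
        (log_dir / f"{version:020d}.json").write_text(body)
    return root


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_table(tmp)
        print(active_data_files(tmp))
```

The resulting list of paths is what a Parquet scanner (PyArrow today, per the message above) would then be pointed at to read the actual data.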
Hugo Saavedra
07/28/2023, 9:42 PM
Ion
07/29/2023, 12:09 AM
Will Jones
07/29/2023, 4:46 PM