rtyler

03/22/2023, 1:01 AM
The more time I spend with the arrow crate, the more convinced I am that we should switch to arrow2 by default as soon as it has the support we need. It's insanely complicated and has a fair bit of unsafe floating around that I am running into more and more. 🤦 🙀
hah, found delta.rs in the arrow repo. Not what I expected though 😆
😆 2

Robert

03/22/2023, 6:44 AM
I think arrow-rs and arrow2 are in the process of merging or at least becoming compatible: https://github.com/jorgecarleitao/arrow2/issues/1429. From what I understand, the main idea for now is to make arrow2 and arrow-rs arrays compatible, with arrow-rs adopting the low-level, safe implementations from arrow2.

rtyler

03/22/2023, 3:00 PM
ohno 😆
🤣 2

Will Jones

03/22/2023, 5:56 PM
There are definitely lots of APIs that need improving, but in my experience the arrow-rs maintainers are very responsive to improvement ideas. So it's probably worth creating issues on the repo if you find friction, especially if there is a solution prototyped in arrow2 🙂
🦀 2

rtyler

03/22/2023, 7:41 PM
Unfortunately, because there's no writer support in arrow2 as far as I can tell, there is no parallel. The underlying problem I am running into is that the RecordBatch API and the underlying arrow::array::* stuff is nightmarishly complex if you need to do any data manipulation
I was just sitting down this evening to work on implementing ... what @Will Jones already implemented! take for MapArray in arrow 🥳
🎉 1