rtyler

05/11/2023, 5:01 PM
FYI this pull request is a prerequisite for the next datafusion and arrow upgrades since the old Decoder is removed. I have a follow-up change to make both those upgrades possible
👍 1
🙌 4
Thanks for the feedback @Will Jones, I'll clean this pull request up a little bit, and then I'm hoping to close the week with another release for the rust bindings if everybody is on board
So Decoder was deprecated, and then replaced with a different API in 38 that isn't deprecated anymore. arrow is wacky y'all

Will Jones

05/12/2023, 4:11 PM
Yeah I wish they had put in some guidance in the release notes on which API to use instead
Otherwise just have to follow the breadcrumbs in the PR and issue comments

rtyler

05/12/2023, 4:15 PM
heh, indeed
breadcrumbs followed though, I think we're in good shape now
@Will Jones for the checkpoints comment, I might be missing the documentation but I cannot see how Decoder takes a defined batch_size
oh guffaw, ReaderBuilder can build a decoder and a reader

Will Jones

05/12/2023, 4:23 PM
jsons is an iterator, right? so you could use Itertools::chunks() on it to handle batch size maybe? Unless you just found a better way

rtyler

05/12/2023, 4:25 PM
no I didn't, I was just having the same thoughts. I have no idea what with_batch_size would do with serialize since it's not clear what would happen if jsons still has elements but the batch size has been met. I think chunking jsons is a reasonable approach to keep these batches smallish
👍 1
In the breadcrumbs you were reading, did you see anything that could indicate we can just call Decoder.flush in a loop?
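For reference, a minimal sketch of the "chunk jsons to keep batches smallish" idea from this thread. It assumes jsons is an ordinary iterator and uses only the standard library as a stand-in for Itertools::chunks(); chunk_batches is a hypothetical helper name, not something from delta-rs or arrow.

```rust
/// Hypothetical helper: groups the items of any iterator into `Vec`s of at
/// most `batch_size` elements. The last batch may be shorter.
/// This is a std-only stand-in for `Itertools::chunks()`.
fn chunk_batches<I>(items: I, batch_size: usize) -> Vec<Vec<I::Item>>
where
    I: IntoIterator,
{
    assert!(batch_size > 0, "batch_size must be non-zero");

    // `fuse()` guarantees the iterator keeps returning `None` once exhausted,
    // so polling it again after the final batch is safe.
    let mut iter = items.into_iter().fuse();
    let mut batches = Vec::new();

    loop {
        // Take up to `batch_size` items while leaving the rest in `iter`.
        let batch: Vec<I::Item> = iter.by_ref().take(batch_size).collect();
        if batch.is_empty() {
            break;
        }
        batches.push(batch);
    }

    batches
}
```

Each batch could then be handed to the decoder and flushed before moving on to the next one. That step is left out here because it depends on the arrow API details still being worked out in this thread.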

Will Jones

05/12/2023, 4:27 PM
Not sure; I'd have to look deeper into the API

rtyler

05/12/2023, 6:00 PM
👀 1