David Blajda

04/23/2023, 2:20 AM
With DataFusion, is it possible to determine which file a particular record came from? Something similar to input_file_name from Spark would be beneficial, since it would help with determining the origin of a record, which can be used to support delete, update, and merge operations.
💡 2
👀 2

Will Jones

04/23/2023, 2:45 AM
I don't think so. I think we just had a very similar conversation in GitHub here: https://github.com/delta-io/delta-rs/issues/850#issuecomment-1513634486
👍 1

David Blajda

04/23/2023, 3:02 AM
Ah, thanks for the insight. Yeah, we would need to do it on a per-file basis for now then.
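A rough sketch of what "per-file basis" could look like, for illustration only. This is plain Python over CSV files standing in for the table's data files, not DataFusion or delta-rs code, and the helper names scan_with_file_name and files_matching are made up for this example. The idea is that if each file is scanned separately, the origin of every row is already known and can be attached as an extra column, roughly like Spark's input_file_name.

```python
import csv


def scan_with_file_name(paths, column="input_file_name"):
    """Scan each file on its own and tag every row with the path it came from.

    Hypothetical helper: CSV is an assumption used to keep the sketch
    self-contained; a real table would be scanned one data file at a time.
    """
    for path in paths:
        with open(path, newline="") as handle:
            for row in csv.DictReader(handle):
                # The scan is per file, so the origin is known at this point.
                row[column] = path
                yield row


def files_matching(paths, predicate, column="input_file_name"):
    """Return the set of files holding at least one row that satisfies predicate.

    These are the files a delete, update, or merge would have to rewrite.
    """
    return {
        row[column]
        for row in scan_with_file_name(paths, column)
        if predicate(row)
    }
```

For example, files_matching(paths, lambda row: row["id"] == "3") would return only the files that contain a record with id 3, so the rest of the files could be left untouched.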

Jim Hibbard

04/24/2023, 5:00 PM
This would be such a nice feature! Do we have a rough idea of how difficult this would be to add? Like, are all the necessary pieces already present in the DataFusion library, or do we need to wait for other features first?

Nick Karpov

04/24/2023, 8:17 PM
@David Blajda please add a thumbs up on https://github.com/apache/arrow-datafusion/issues/6051 😄
👍 2

Jim Hibbard

04/24/2023, 8:19 PM
Sweet, thanks Nick!