John Darrington
06/09/2023, 4:53 PM
Will Jones
06/09/2023, 4:56 PM
> but for DataFusion to run a query against a parquet/csv file, does it have to load entire file into memory?
No, it will try to only load the parts of the files it needs to. There's a good blog post diving into how it can do this for Parquet: https://arrow.apache.org/blog/2022/12/26/querying-parquet-with-millisecond-latency/
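To make the "only load the parts it needs" idea concrete, here is a minimal sketch in plain Python. It is not DataFusion code and not the real Parquet layout: the file format, `write_toy_columnar`, `read_column`, and `CountingFile` are all hypothetical names made up for illustration. It only borrows the general idea that Parquet keeps metadata in a footer recording where each column chunk lives, so a reader can seek straight to the columns a query touches.

```python
import io
import json
import struct


def write_toy_columnar(columns):
    """Write each column as a contiguous chunk of int64 values, followed by
    a JSON footer mapping column name -> (offset, length), followed by a
    4-byte footer length. A toy layout, NOT the real Parquet format."""
    body = io.BytesIO()
    index = {}
    for name, values in columns.items():
        chunk = struct.pack(f"<{len(values)}q", *values)
        index[name] = {"offset": body.tell(), "length": len(chunk)}
        body.write(chunk)
    footer = json.dumps(index).encode("utf-8")
    body.write(footer)
    body.write(struct.pack("<I", len(footer)))
    return body.getvalue()


class CountingFile(io.BytesIO):
    """In-memory file that records how many bytes were actually read."""

    def __init__(self, data):
        super().__init__(data)
        self.bytes_read = 0

    def read(self, size=-1):
        data = super().read(size)
        self.bytes_read += len(data)
        return data


def read_column(f, name):
    """Read one column by consulting the footer, then seeking to its chunk."""
    # Step 1: the last 4 bytes give the footer length.
    f.seek(-4, io.SEEK_END)
    (footer_len,) = struct.unpack("<I", f.read(4))
    # Step 2: read only the footer to learn where each column lives.
    f.seek(-(4 + footer_len), io.SEEK_END)
    index = json.loads(f.read(footer_len))
    # Step 3: seek directly to the requested column and read just that chunk.
    entry = index[name]
    f.seek(entry["offset"])
    raw = f.read(entry["length"])
    return list(struct.unpack(f"<{entry['length'] // 8}q", raw))


data = write_toy_columnar({
    "id": list(range(1000)),
    "price": list(range(1000, 2000)),
    "qty": [1] * 1000,
})
f = CountingFile(data)
prices = read_column(f, "price")
print(f"file size: {len(data)} bytes, bytes read: {f.bytes_read}")
```

Reading one of three equally sized columns touches roughly a third of the file plus the small footer. Real Parquet readers apply the same principle at finer granularity (row groups and pages) and can also use footer statistics to skip data that cannot match a filter, as the linked blog post describes.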
John Darrington
06/09/2023, 5:30 PM