
Matthew Powers

07/20/2023, 4:12 PM
@Will Jones - here’s a notebook that shows how to query a Delta table with DuckDB. From what I recall, there is a bad way of running this query and an efficient way of running the query. I’d like to add the bad example & good example in this notebook. Can you send me some pseudocode of the bad/good, so I can understand this better?

Will Jones

07/20/2023, 4:18 PM
It would be something like:
Bad
# Loads the entire Delta table into memory before DuckDB filters it
table = dt.to_pyarrow_table()
quack = duckdb.arrow(table)
quack.filter("id1 = 'id016' and v2 > 10")
Better
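import pyarrow.dataset as ds
# The filter is applied while reading, but the result is still materialized as an in-memory table
table = dt.to_pyarrow_table(filters=(ds.field("id1") == "id016") & (ds.field("v2") > 10))
quack = duckdb.arrow(table)
quack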
Best
# The dataset is lazy, so DuckDB can push the filter down into the scan
dataset = dt.to_pyarrow_dataset()
quack = duckdb.arrow(dataset)
quack.filter("id1 = 'id016' and v2 > 10")