Christina

04/11/2023, 10:45 PM
Is there a way to write a small dataset to Delta without Spark or a Spark cluster? If yes, which languages support that? I know pandas can insert records directly into a Delta table, but it still needs to connect to a Spark cluster.

rtyler

04/11/2023, 10:48 PM
Yes, we can support that through the native deltalake package available in Python.
🎉 1
👀 1

Jim Hibbard

04/11/2023, 10:57 PM
The deltalake package is currently the best way to integrate with Delta Lake from any library when you don't want to use Spark 🙂
👍 1

Christina

04/11/2023, 11:57 PM
The docs say: "Write to a Delta Lake table. Note that this function does NOT register this table in a data catalog." Does that mean a separate step is required for the table to show up in the Hive metastore? Will it work if we use Unity Catalog, or would we need to migrate the table later?

Jim Hibbard

04/12/2023, 12:06 AM
There is currently some catalog support for Glue in the deltalake package. I'd definitely expect to see more catalog support in the future though; delta-rs is a very active open source project! You should definitely open a feature request, or comment on an existing one, to share what type of Unity/catalog support you'd like to see. Example Glue catalog code from the docs:
from deltalake import DeltaTable
from deltalake import DataCatalog

database_name = "simple_database"
table_name = "simple_table"
data_catalog = DataCatalog.AWS
dt = DeltaTable.from_data_catalog(data_catalog=data_catalog, database_name=database_name, table_name=table_name)
# Load the table contents into a Python dict (output as shown in the docs):
dt.to_pyarrow_table().to_pydict()
# {'id': [5, 7, 9, 5, 6, 7, 8, 9]}