#deltalake-questions
Ajex
03/10/2023, 8:04 AM
Hello everyone. We recently started receiving a data stream from Kafka that contains several very long text fields. Does anyone have suggestions for optimizing storage for this data?
JosephK (exDatabricks)
03/10/2023, 11:46 AM
Parquet, which Delta uses, supports compression. Snappy is the default, but you can change this when you save the files or at the cluster level.
https://spark.apache.org/docs/latest/sql-data-sources-parquet.html#data-source-option
✅ 1
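A minimal sketch of what that could look like in PySpark, assuming the Delta Lake Spark connector is already available in the session and using placeholder paths; `zstd` is used here only as an example codec, and the session config key is the standard `spark.sql.parquet.compression.codec` option described in the page linked above.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("kafka-to-delta-compression")
    # Session-level default codec for all Parquet files Spark writes,
    # including the Parquet files backing a Delta table.
    # Valid values include snappy (default), gzip, lz4, zstd, none.
    .config("spark.sql.parquet.compression.codec", "zstd")
    .getOrCreate()
)

# Placeholder source: in practice this would be the DataFrame read from Kafka.
df = spark.read.json("/tmp/kafka_dump")

# Write to a Delta table; the large text columns are compressed with the
# codec configured above inside the underlying Parquet files.
(
    df.write
      .format("delta")
      .mode("append")
      .save("/tmp/delta/events")
)
```

For plain Parquet writes (not Delta), the codec can also be set per write with `df.write.option("compression", "zstd").parquet(path)`; columns with long, repetitive text tend to compress noticeably better with gzip or zstd than with the default snappy.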
Ajex
03/13/2023, 6:39 AM
Thank you