
Matt Moen

06/20/2023, 10:24 PM
When I run optimize on a delta lake table, I can specify a target file size like this:
spark.conf.set("spark.databricks.delta.optimize.maxFileSize", 256*1024*1024)
Is there an equivalent setting for the target file size when writing an initial dataset, for example from
df.write
?

rtyler

06/20/2023, 10:46 PM
My understanding is that this can be configured by
delta.targetFileSize
on the table properties https://docs.databricks.com/delta/tune-file-size.html#set-a-target-file-size
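A minimal sketch of passing that property at write time, assuming a SparkSession and assuming the writer honors `delta.targetFileSize` as described in the Databricks doc above (OSS Delta may not support it, as discussed below):

```python
def delta_write_options(target_file_size_bytes: int) -> dict:
    # delta.targetFileSize is the table property named in the Databricks
    # tune-file-size docs; whether a given Delta writer accepts it is an
    # assumption here, not something the docs guarantee for OSS Delta.
    return {"delta.targetFileSize": str(target_file_size_bytes)}

opts = delta_write_options(256 * 1024 * 1024)  # 256 MiB -> "268435456"
# With a real SparkSession and DataFrame `df` (not run here):
# df.write.format("delta").options(**opts).save("/path/to/table")
```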

Matt Moen

06/20/2023, 10:55 PM
Thanks for the reply. I'm pretty sure I tried that one and got an error that it's not a supported table option. Databricks-specific, I think

rtyler

06/20/2023, 10:56 PM
hrm, that may be the case; I don't see it at https://docs.delta.io/latest/table-properties.html. I was under the impression they had landed some of those proprietary table properties more recently, sorry

abhijeet_naib

08/30/2023, 12:47 PM
spark.conf.set("spark.databricks.delta.optimize.maxFileSize", 256*1024*1024)
this works post Z-ORDER but not post OPTIMIZE