Lennart Skogmo

03/04/2023, 9:35 AM
I am wondering whether including a column that contains nothing but empty strings in a Delta table carries a large penalty, or whether compression makes it negligible. I'm weighing the merit of always including a standard set of meta columns, even though they might sometimes go unused, versus adding columns case by case.

JosephK (exDatabricks)

03/04/2023, 4:00 PM
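Parquet keeps empty strings distinct from nulls, but a column holding a single repeated value (empty string or null) is dictionary- and run-length-encoded, so it costs next to nothing per row. That said, adding the columns later through schema evolution or an overwrite might be simpler.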
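
A toy sketch of why a constant column is cheap, assuming the writer dictionary-encodes the column and run-length encodes the dictionary indices (both are standard Parquet encodings). The function `dictionary_rle_encode` is a hypothetical illustration, not Parquet's actual on-disk format or any library's API:

```python
from itertools import groupby


def dictionary_rle_encode(values):
    """Toy dictionary + run-length encoding of one column (illustration only)."""
    dictionary = []   # distinct values, in order of first appearance
    index_of = {}     # value -> position in the dictionary
    indices = []      # one dictionary index per row
    for value in values:
        if value not in index_of:
            index_of[value] = len(dictionary)
            dictionary.append(value)
        indices.append(index_of[value])
    # Collapse consecutive identical indices into (index, run_length) pairs.
    runs = [(idx, sum(1 for _ in group)) for idx, group in groupby(indices)]
    return dictionary, runs


# A meta column that is always the empty string, one million rows long.
empty_column = [""] * 1_000_000
dictionary, runs = dictionary_rle_encode(empty_column)
print(dictionary, runs)  # [''] [(0, 1000000)]
```

However many rows there are, the column reduces to one dictionary entry and one run, which is why an unused meta column adds very little to file size. Real files also carry per-column metadata and page headers, so measuring on an actual table is still worthwhile.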

Lennart Skogmo

03/04/2023, 6:14 PM
Thanks, nice to know 🙂