
João Pinto

01/31/2023, 2:47 PM
Hi everybody! I'm looking for help with a specific scenario: I need to physically delete Parquet files in a Delta table and replace them with new ones. The problem is the Delta log: as soon as I remove even one Parquet file, I get an inconsistency error between the Delta log and the Parquet files actually present in that partition. Is there any way to rebuild or refresh the Delta metastore/log after files have been added or removed externally, like we do in Hive with the "REFRESH TABLE" command? I already tried that command with Delta, but without success 😕 Thank you in advance!

Matthew Powers

01/31/2023, 3:26 PM
You should delete the rows that need to be removed with the delta_table.delete() command, as explained here: https://delta.io/blog/2022-12-07-delete-rows-from-delta-lake-table/
You can then physically remove the files from your Delta table with the vacuum command: https://delta.io/blog/2023-01-03-delta-lake-vacuum-command/
Delta Lake doesn’t support legacy Hive commands, for reasons explained in this post: https://delta.io/blog/2023-01-18-add-remove-partition-delta-lake/
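For reference, the two steps suggested above could be sketched roughly like this in Python. This is only a minimal sketch, not code from the linked posts: the helper name purge_rows, the table path, and the filter condition are all hypothetical, and it assumes a table handle obtained from the delta-spark package via DeltaTable.forPath. The 168-hour retention matches Delta Lake's default of 7 days.

```python
def purge_rows(delta_table, condition, retention_hours=168.0):
    """Hypothetical helper: delete rows through the Delta log, then vacuum.

    `delta_table` is assumed to be a delta.tables.DeltaTable (or any object
    exposing the same delete() and vacuum() methods).
    """
    # Logical delete: the change is recorded in the transaction log, so the
    # log and the data files stay consistent (unlike removing files by hand).
    delta_table.delete(condition)

    # Physical cleanup: removes data files that the log no longer references
    # and that are older than the retention period (in hours).
    delta_table.vacuum(retention_hours)


# Example usage (assumes an active SparkSession `spark` with Delta configured;
# the path and condition below are placeholders):
#
#   from delta.tables import DeltaTable
#   dt = DeltaTable.forPath(spark, "/tmp/my_delta_table")
#   purge_rows(dt, "event_date = '2023-01-01'")
```

Note that files dropped by the delete only become eligible for vacuum once they are older than the retention period, so they will not disappear from storage immediately.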

João Pinto

01/31/2023, 3:37 PM
Thanks for your explanation @Matthew Powers!
👍 1