Shane Torgerson

01/23/2023, 3:20 PM
I have a Delta table on Azure, and I need to copy its changes over to GCP nightly. I am trying to avoid copying the entire set of Delta/Parquet files every night and instead copy only the changes. Any suggestions for doing this efficiently? Options that I see: 1. Use Change Data Capture? 2. Somehow parse the transaction logs?

Gerhard Brueckl

01/23/2023, 3:23 PM
If you can mount your GCP storage in your Azure Databricks workspace, you could do an incremental DEEP CLONE, which is very efficient because re-running the clone only copies files that changed since the previous run.
👍 1
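
A minimal sketch of the DEEP CLONE suggestion above, assuming a Databricks notebook or job where a `spark` session is available and both storage locations are reachable from the workspace. The paths and the helper names `build_deep_clone_sql` and `run_nightly_clone` are hypothetical, chosen for illustration; the statement uses the Databricks `CREATE OR REPLACE TABLE ... DEEP CLONE` syntax with path-based table references.

```python
def build_deep_clone_sql(source_path: str, target_path: str) -> str:
    """Build a path-based DEEP CLONE statement (Databricks SQL syntax)."""
    return (
        f"CREATE OR REPLACE TABLE delta.`{target_path}` "
        f"DEEP CLONE delta.`{source_path}`"
    )


def run_nightly_clone(spark, source_path: str, target_path: str) -> None:
    """Run the clone. `spark` is the SparkSession supplied by the Databricks runtime."""
    spark.sql(build_deep_clone_sql(source_path, target_path))


# Hypothetical locations: replace with your own container, account, and bucket.
SOURCE = "abfss://container@account.dfs.core.windows.net/tables/events"
TARGET = "gs://my-bucket/tables/events"

# Print the statement so it can be reviewed before scheduling the job.
print(build_deep_clone_sql(SOURCE, TARGET))
```

Scheduling `run_nightly_clone(spark, SOURCE, TARGET)` as a nightly job would re-run the same statement each night; on Databricks, repeating a deep clone against an existing target is meant to sync it incrementally rather than recopy every file.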

Shane Torgerson

01/23/2023, 3:25 PM
Does that require using a Hive table, or can it be done natively on Delta tables?

Gerhard Brueckl

01/23/2023, 3:28 PM
It should be doable natively.
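
For comparison, option 2 from the original question (parsing the transaction log) could be sketched roughly as follows. This is an illustration, not a complete sync tool: it assumes the table's `_delta_log` directory is readable as a local or mounted path, and the function name `files_added_since` is hypothetical. It relies only on the documented log layout, where each commit is a zero-padded `<version>.json` file of newline-delimited JSON actions and `add` actions carry a `path` relative to the table root.

```python
import json
from pathlib import Path
from typing import List


def files_added_since(log_dir: str, last_synced_version: int) -> List[str]:
    """Collect data-file paths from `add` actions in commits newer than last_synced_version."""
    added = []
    # Zero-padded file names sort lexicographically in version order.
    for commit in sorted(Path(log_dir).glob("*.json")):
        # Skip non-commit files and commits that were already synced.
        if not commit.stem.isdigit() or int(commit.stem) <= last_synced_version:
            continue
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                added.append(action["add"]["path"])
    return added
```

This sketch ignores `remove` actions and checkpoint files, so some listed files may already be obsolete after compaction or deletes, and the new log files themselves would also have to be copied for the target to be a valid Delta table. Those gaps are a large part of why a DEEP CLONE is the simpler route when it is available.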