Shane Torgerson

01/23/2023, 3:32 PM
Another separate but related question, does enabling CDC require a metastore or can it be done with just the delta table structure?

JosephK (exDatabricks)

01/23/2023, 3:33 PM
No metastore required. It just saves another directory of changes (_change_data, in Parquet) inside the table directory.
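A minimal sketch of what "no metastore" can look like in practice: the change feed is read straight from the table path. The local session settings and the scratch path are assumptions for illustration; the writer option is the one reported as working later in this thread.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Assumption: a local session with Delta Lake on the classpath.
val spark = SparkSession.builder()
  .appName("cdf-read-sketch")
  .master("local[*]")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

// Hypothetical scratch location; any path Spark can write to would do.
val path = Files.createTempDirectory("cdf-demo").resolve("events").toString

// Create a small table with CDC enabled from its first commit.
spark.range(3).write
  .format("delta")
  .option("delta.enableChangeDataFeed", "true")
  .save(path)

// Read the change feed by path only; no table name is involved.
// startingVersion must not be earlier than the version where CDC was enabled.
val changes = spark.read
  .format("delta")
  .option("readChangeFeed", "true")
  .option("startingVersion", "0")
  .load(path)

// The feed adds _change_type, _commit_version and _commit_timestamp to the table's columns.
changes.show(false)
```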

Shane Torgerson

01/23/2023, 3:34 PM
I am having trouble finding examples of how to do this with the Scala API vs. using SQL.

JosephK (exDatabricks)

01/23/2023, 3:35 PM
just a conf, let me find it
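delta.enableChangeDataFeed is the table property; set as a session conf (spark.databricks.delta.properties.defaults.enableChangeDataFeed) it applies to all new tables. I’ll have to dig to find how to set it for a specific table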
🙌 1
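A rough sketch of the session-wide route, for comparison. To my knowledge the session-level default is spark.databricks.delta.properties.defaults.enableChangeDataFeed, and it only affects tables created after it is set. The local session settings and the scratch path are assumptions.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Assumption: a local session with Delta Lake on the classpath.
val spark = SparkSession.builder()
  .appName("cdf-default-sketch")
  .master("local[*]")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

// Session-level default: Delta tables created by this session from now on get CDC enabled.
// Tables that already exist are not changed by this setting.
spark.conf.set("spark.databricks.delta.properties.defaults.enableChangeDataFeed", "true")

// Hypothetical scratch location for a table created after the default was set.
val path = Files.createTempDirectory("cdf-default").resolve("events").toString
spark.range(3).write.format("delta").save(path)
```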

Shane Torgerson

01/23/2023, 3:41 PM
Thank you! That would be VERY much appreciated. 🥇

JosephK (exDatabricks)

01/23/2023, 4:20 PM
OK, I found some code from 2 years ago that I thought was Python, but it’s just SQL 😞
You might be able to run the ALTER TABLE command with delta.`path` instead of the table name and avoid the metastore
spark.sql("""ALTER TABLE delta.`path` SET TBLPROPERTIES (delta.enableChangeDataFeed = true)""")
I’d try that; otherwise it looks like you’ll need the metastore
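One way to check whether the path-based ALTER TABLE took effect is to look at the table's properties with DESCRIBE DETAIL. This is a sketch under assumptions: a local session with the Delta catalog configured, and a throwaway table at a hypothetical scratch path.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Assumption: a local session with Delta Lake on the classpath.
val spark = SparkSession.builder()
  .appName("cdf-alter-sketch")
  .master("local[*]")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

// Hypothetical scratch table, created without CDC.
val path = Files.createTempDirectory("cdf-alter").resolve("events").toString
spark.range(3).write.format("delta").save(path)

// Enable CDC on the existing table, addressing it by path instead of by name.
spark.sql(s"ALTER TABLE delta.`$path` SET TBLPROPERTIES (delta.enableChangeDataFeed = true)")

// DESCRIBE DETAIL exposes the table properties as a map column named "properties".
val props = spark.sql(s"DESCRIBE DETAIL delta.`$path`")
  .select("properties")
  .head()
  .getMap[String, String](0)

println(props.get("delta.enableChangeDataFeed"))
```

Changes are only recorded from the commit that enables the property onward, so a later change feed read should start at that version or after it.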

Shane Torgerson

01/23/2023, 9:04 PM
This works!
ds.write().format("delta").option("delta.enableChangeDataFeed",true).mode(SaveMode.Overwrite).save(path)
🎉 1
Another way to create a CDC table without SQL!
import io.delta.tables.DeltaTable

DeltaTable.createOrReplace(spark)
  .tableName("foo")
  .addColumn("bar", "STRING")
  .addColumn("baz", "STRING")
  .property("delta.enableChangeDataFeed", "true")
  .location(path)
  .execute()
👌 1
👌🏽 1
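Since tableName("foo") registers the table in the session catalog, a path-only variant may be closer to the original "no metastore" question. As far as I know the builder accepts a location without a table name; treat that as an assumption to verify against your Delta version. The session settings and scratch path are also assumptions.

```scala
import java.nio.file.Files
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

// Assumption: a local session with Delta Lake on the classpath.
val spark = SparkSession.builder()
  .appName("cdf-builder-sketch")
  .master("local[*]")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

// Hypothetical scratch location for the new table.
val path = Files.createTempDirectory("cdf-builder").resolve("foo").toString

// Same builder as above, but identified by location only (no tableName call).
DeltaTable.createOrReplace(spark)
  .location(path)
  .addColumn("bar", "STRING")
  .addColumn("baz", "STRING")
  .property("delta.enableChangeDataFeed", "true")
  .execute()

// Load the table back by path to confirm it exists with the expected columns.
DeltaTable.forPath(spark, path).toDF.printSchema()
```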

Kashyap Bhatt

01/23/2023, 9:30 PM
In hindsight, "it's obvious" (after reading your post @Shane Torgerson) that you just need to set the property (CREATE TABLE ... TBLPROPERTIES (delta.enableChangeDataFeed = true)) and the way to do that in python is DeltaTable....property("delta.enableChangeDataFeed","true"). But I also share the general gripe that many many pages only show SQL examples when you're dying to get the python equivalent.