
Lennart Skogmo

03/08/2023, 7:08 PM
I'm experimenting with Databricks and Delta. Noticed that spark.sql() operations can return a dataframe with affected rows. Doing a saveAsTable instead does not seem to return a similar result dataframe. Does anyone know of some tricks to get the same information from delta api or spark itself with low overhead?

JosephK (exDatabricks)

03/08/2023, 7:13 PM
spark.sql returns a dataframe. saveAsTable doesn't return anything, but writes a Delta table to disk and registers it with the metastore. You can do a DESCRIBE HISTORY on the table and look at the operation metrics to see how many rows changed

Lennart Skogmo

03/08/2023, 7:17 PM
Nice 🙂 Thanks for sharing your knowledge again.
If someone else was wondering how to do it with pyspark this seems to be it:
from delta.tables import DeltaTable
table = DeltaTable.forName(spark, "database.table")
operation = table.history(1)
operation.display()
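To pull the affected-row count out of that history entry rather than just displaying it, the operationMetrics column of the history row can be inspected. A minimal sketch — the helper function and the metric key names it checks are assumptions based on the metrics Delta records for common operations (numOutputRows for writes, and operation-specific keys for MERGE/UPDATE/DELETE); the table name is a placeholder:

```python
def affected_rows(metrics):
    """Pick the affected-row count out of a Delta operationMetrics map.

    operationMetrics is a map of string values, and the key names vary
    by operation type, so check the common ones in turn.
    """
    for key in ("numOutputRows", "numTargetRowsInserted",
                "numUpdatedRows", "numDeletedRows"):
        if key in metrics:
            return int(metrics[key])
    return None  # no row-count metric recorded for this operation

# With a live Delta table, the latest commit's metrics come from history():
#   from delta.tables import DeltaTable
#   last = DeltaTable.forName(spark, "database.table").history(1).collect()[0]
#   print(last["operation"], affected_rows(last["operationMetrics"]))

# Standalone illustration with a metrics map shaped like a WRITE commit's:
print(affected_rows({"numFiles": "2", "numOutputRows": "1500"}))  # → 1500
```

Note that history(1) only reflects the most recent commit, so this should be read right after the write if other jobs may be committing to the same table.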

JosephK (exDatabricks)

03/08/2023, 8:00 PM
display(spark.sql("describe history database.table")) will work too