Hanan Shteingart
01/15/2023, 5:38 PM
An error occurred while fetching table: dsm09collectx
com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: org.apache.spark.sql.AnalysisException: Incompatible format detected.
A transaction log for Databricks Delta was found at <s3://nbu-ml/projects/rca/msft/dsm09collectx/delta/_delta_log>,
but you are trying to read from <s3://nbu-ml/projects/rca/msft/dsm09collectx/delta> using format("parquet"). You must use
'format("delta")' when reading and writing to a delta table.
To disable this check, SET spark.databricks.delta.formatCheck.enabled=false
To learn more about Delta, see https://docs.databricks.com/delta/index.html
What is the issue and how can I solve it? When I read the data in Redash, I also cannot parse it. When I read the table using
spark.table(table_name)
it works fine.
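For reference, this is roughly the path-based read the error message is asking for: a minimal sketch, assuming the S3 location quoted in the error above (the helper name `read_by_path` is illustrative, not from the original code):

```python
# Sketch only: the path comes from the error message; read_by_path is
# an illustrative helper, not part of the original job.
DELTA_PATH = "s3://nbu-ml/projects/rca/msft/dsm09collectx/delta"

def read_by_path(spark, path=DELTA_PATH):
    # A Delta table must be read with format("delta"); format("parquet")
    # trips over the _delta_log directory and raises the AnalysisException above.
    return spark.read.format("delta").load(path)
```

`spark.table(table_name)` works because the metastore already records the table as Delta, so Spark picks the right format automatically.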
The code generating the delta table:
from pyspark.sql.functions import current_timestamp, input_file_name

(spark.readStream
.format("cloudFiles")
.option("header", "true")
.option("cloudFiles.partitionColumns", "date, hour")
.option("cloudFiles.format", "csv")
.option("cloudFiles.schemaHints", SCHEMA_HINT)
.option("cloudFiles.schemaLocation", checkpoint_path)
.option("cloudFiles.schemaEvolutionMode", "addNewColumns")
.load(file_path)
.select("*", input_file_name().alias("source_file"), current_timestamp().alias("processing_time"))
.writeStream
.option("checkpointLocation", checkpoint_path)
.option("path", output_path)
.trigger(availableNow=True)
.toTable(table_name))
delta.DeltaTable.isDeltaTable(spark, TABLE_NAME)
so I have added format("delta"):
.writeStream
.format("delta")
.option("checkpointLocation", checkpoint_path)
.option("path", output_path)
.trigger(availableNow=True)
.toTable(table_name))
But it didn't help (I have VACUUMed the table and dropped it; I have checked that the checkpoint and delta paths are empty).
Yousry Mohamed
01/17/2023, 9:49 AM
You are using the same checkpoint_path for the schema checkpoint and the streaming checkpoint. They are different things and should live in different folders.
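The suggestion above can be sketched as follows: keep the Auto Loader schema location and the streaming checkpoint in separate folders. This is a hedged sketch, not the original job; the `base` prefix, folder names, and the trimmed option list are assumptions:

```python
# Illustrative layout: two separate folders for the two kinds of state.
base = "s3://nbu-ml/projects/rca/msft/dsm09collectx"  # assumed prefix
schema_location = f"{base}/_schema"        # cloudFiles.schemaLocation: schema inference/evolution state
stream_checkpoint = f"{base}/_checkpoint"  # checkpointLocation: streaming progress state

def start_stream(spark, file_path, output_path, table_name):
    # Import inside the function so the sketch is loadable without a Spark session.
    from pyspark.sql.functions import current_timestamp, input_file_name

    return (spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "csv")
        .option("header", "true")
        .option("cloudFiles.schemaLocation", schema_location)   # schema state
        .load(file_path)
        .select("*", input_file_name().alias("source_file"),
                current_timestamp().alias("processing_time"))
        .writeStream
        .format("delta")
        .option("checkpointLocation", stream_checkpoint)        # streaming state
        .option("path", output_path)
        .trigger(availableNow=True)
        .toTable(table_name))
```

Reusing one folder for both lets Auto Loader's schema-tracking files and Structured Streaming's offset/commit files collide, which can leave the output in the inconsistent state described above.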