Hanan Shteingart

01/15/2023, 5:38 PM
I have created a Delta table using Auto Loader, yet when I try to look at the data in the "Data" tab it says:
An error occurred while fetching table: dsm09collectx
com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: org.apache.spark.sql.AnalysisException: Incompatible format detected.
A transaction log for Databricks Delta was found at … but you are trying to read from … using format("parquet"). You must use 'format("delta")' when reading and writing to a delta table. To disable this check, SET … To learn more about Delta, see …
What is the issue and how can I solve it? When I read the data in Redash I also cannot parse it. When I read the table using … it works fine. The code generating the Delta table:
(spark.readStream
  .format("cloudFiles")
  .option("header", "true")
  .option("cloudFiles.partitionColumns", "date, hour")
  .option("cloudFiles.format", "csv")
  .option("cloudFiles.schemaHints", SCHEMA_HINT)
  .option("cloudFiles.schemaLocation", checkpoint_path)
  .option("cloudFiles.schemaEvolutionMode", "addNewColumns")
  .load(…)
  .select("*", input_file_name().alias("source_file"), current_timestamp().alias("processing_time"))
  .writeStream
  .option("checkpointLocation", checkpoint_path)
  .option("path", output_path)
  …)
I see the created table is not Delta by running
delta.DeltaTable.isDeltaTable(spark, TABLE_NAME)
so I have added format("delta") to the write:
  …
  .writeStream
  .format("delta")
  .option("checkpointLocation", checkpoint_path)
  .option("path", output_path)
  …
But it didn't help. (I have run VACUUM on the table and dropped it; I have checked that the checkpoint and Delta paths are empty.)
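For reference, the error message's advice amounts to the following: once a location holds a Delta transaction log, any batch read of it must also use the Delta format. A minimal sketch (assuming an active SparkSession and that output_path is the table location used above; this is not the original poster's code):

```python
# Read the written location as Delta, not Parquet, to avoid
# "Incompatible format detected". Requires a running SparkSession.
df = spark.read.format("delta").load(output_path)
df.show(5)
```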

Yousry Mohamed

01/17/2023, 9:49 AM
It seems you are creating an external table, hence dropping the table will not delete the parquet and log files in the table location. Try to start fresh: drop the table and delete the bucket folders (for the table, the streaming checkpoint, and the schema checkpoint). I also see that you use the same location for the schema checkpoint and the streaming checkpoint. They are different things and should live in different folders.
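A possible folder layout separating the two (a sketch; the bucket and folder names here are hypothetical, not from the original thread):

```python
# Hypothetical base location for the table's artifacts.
base = "s3://my-bucket/dsm09collectx"

# Keep the streaming checkpoint and the Auto Loader schema location
# in separate folders -- they track different state.
checkpoint_path = f"{base}/_checkpoint"  # streaming progress/state
schema_path = f"{base}/_schema"          # inferred/evolved schema history
output_path = f"{base}/table"            # the Delta table data itself

# Then wire them up separately in the stream:
#   .option("cloudFiles.schemaLocation", schema_path)
#   ...
#   .option("checkpointLocation", checkpoint_path)
```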