Tuan Nguyen

05/27/2023, 12:12 AM
Hey folks. I have a Delta table, generated in Databricks with a proper schema. However, Athena & Glue only recognize the schema as a single column of type `array<string>`. Has anyone had this problem before? Running `SHOW COLUMNS IN table_name` in Athena shows a list of column names as expected.

Rahul Sharma

05/27/2023, 9:53 AM
I faced the same problem earlier. There are two solutions. 1. Try creating the Delta table with the command below:
CREATE EXTERNAL TABLE delta_test.entrypoints_test
LOCATION 's3://jgdp-lakehouse-dev/raw-ingestion/jwr/entrypoints_test/'
TBLPROPERTIES (
'table_type'='DELTA'
);
2. Or you can use a Glue native crawler. I'll share the code with you:
native.py
Pass all the required parameters to the function and the crawler will run.
The Glue crawler automatically creates the table under the database name, with the destination location.
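The contents of native.py are not reproduced in this thread. As a rough sketch of what such a script might look like, the following uses boto3's Glue `create_crawler` and `start_crawler` calls with a Delta target. The function names, the crawler name, and the role ARN are hypothetical placeholders, and the `CreateNativeDeltaTable` option assumes a reasonably recent boto3 version.

```python
def build_delta_crawler_request(name, role_arn, database, delta_table_paths):
    """Build keyword arguments for glue.create_crawler with a Delta Lake target.

    All argument values are supplied by the caller; nothing here is taken
    from the original native.py, which is not shown in the thread.
    """
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,  # Glue database the crawler writes tables into
        "Targets": {
            "DeltaTargets": [
                {
                    "DeltaTables": list(delta_table_paths),  # S3 paths of the Delta tables
                    "WriteManifest": False,
                    # Ask the crawler to register a native Delta table
                    # rather than a symlink-manifest table.
                    "CreateNativeDeltaTable": True,
                }
            ]
        },
    }


def create_and_start_crawler(request):
    """Create the crawler described by `request` and start it (needs AWS credentials)."""
    import boto3  # imported lazily so the request builder works without boto3

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


# Example usage (hypothetical crawler name and role ARN):
# request = build_delta_crawler_request(
#     "entrypoints-test-crawler",
#     "arn:aws:iam::123456789012:role/my-glue-crawler-role",
#     "delta_test",
#     ["s3://jgdp-lakehouse-dev/raw-ingestion/jwr/entrypoints_test/"],
# )
# create_and_start_crawler(request)
```

Keeping the request builder separate from the AWS call makes it possible to inspect the request before anything is created.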

Carl Mattsson

05/29/2023, 5:21 AM
We had this issue for a long time at a client that I work with and raised it to Databricks. Recently, we were instructed to add this to the Spark config of our Databricks clusters, and now the schema is stored properly in Glue:
spark.databricks.delta.catalog.update.hiveSchema.enabled true
Some additional Glue permissions might be needed.
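One way to confirm whether the schema landed properly in Glue is to inspect the table's column list. This is a minimal sketch, assuming boto3 and read access to the Glue Data Catalog; the function names are hypothetical, and the check simply looks for the single `array<string>` column described at the top of this thread.

```python
def has_placeholder_schema(glue_table):
    """Return True if the Glue table exposes only one column of type array<string>.

    `glue_table` is the "Table" dict from a glue.get_table response.
    """
    columns = glue_table.get("StorageDescriptor", {}).get("Columns", [])
    if len(columns) != 1:
        return False
    column_type = columns[0].get("Type", "").replace(" ", "").lower()
    return column_type == "array<string>"


def fetch_glue_table(database, table):
    """Fetch the table definition from the Glue Data Catalog (needs AWS credentials)."""
    import boto3  # imported lazily so the check above works without boto3

    glue = boto3.client("glue")
    return glue.get_table(DatabaseName=database, Name=table)["Table"]


# Example usage:
# table = fetch_glue_table("delta_test", "entrypoints_test")
# print(has_placeholder_schema(table))
```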

Tuan Nguyen

05/29/2023, 8:23 AM
Thanks, guys. Carl's suggestion worked 🎉 Btw, my Delta table was created via Databricks DLT. If I specified a target schema for the DLT pipeline, not only were the Delta tables NOT created properly in Athena/Glue (i.e., only a single column), but I also couldn't use the command that Rahul provided above to create a Delta table manually; I got some errors in Athena. If I didn't specify a target schema, I was able to manually create a Delta table with:
CREATE EXTERNAL TABLE delta_test.entrypoints_test
LOCATION 's3://some_locations'
TBLPROPERTIES (
'table_type'='DELTA'
);
With the config provided by Carl, I could specify a target schema in DLT and it worked.
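For anyone who still needs the manual route, the `CREATE EXTERNAL TABLE` statement can also be submitted to Athena from a script. The sketch below assumes boto3 and an S3 location for query results; the function names and the output location are hypothetical, and the identifiers are interpolated without escaping, so it is only suitable for trusted inputs.

```python
def build_delta_ddl(database, table, location):
    """Build the Athena DDL that registers an existing Delta table at `location`."""
    return (
        f"CREATE EXTERNAL TABLE {database}.{table}\n"
        f"LOCATION '{location}'\n"
        "TBLPROPERTIES ('table_type'='DELTA')"
    )


def run_in_athena(ddl, output_location):
    """Submit the DDL to Athena and return the query execution id (needs AWS credentials)."""
    import boto3  # imported lazily so the DDL builder works without boto3

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=ddl,
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


# Example usage (hypothetical results bucket):
# ddl = build_delta_ddl("delta_test", "entrypoints_test", "s3://some_locations")
# query_id = run_in_athena(ddl, "s3://my-athena-results/")
```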