
Rahul Sharma

05/05/2023, 8:06 AM
Hello team, I am getting the error below while reading Delta table data through an Athena external table that points at the manifest location.

Jon Stockham

05/05/2023, 8:10 AM
Don't use the Athena table in Spark.

Rahul Sharma

05/05/2023, 8:10 AM
Then how do I read the external table in Spark?
It was working earlier.

Jon Stockham

05/05/2023, 8:11 AM
Why do you need to? Just use the Delta table itself.

Rahul Sharma

05/05/2023, 8:12 AM
Because my end users want to see the table's schema, but the Delta table doesn't show its columns in Athena.

Jon Stockham

05/05/2023, 8:13 AM
OK, so keep the external table for them. But why do you need to use the external table in your Spark code?

Rahul Sharma

05/05/2023, 8:14 AM
My existing ETL was using the Athena table, so I changed what it points to.

Jon Stockham

05/05/2023, 8:15 AM
Well, that's the reason for the error.

Rahul Sharma

05/05/2023, 8:16 AM
But earlier I was able to read the data.

Jon Stockham

05/05/2023, 8:16 AM
Yes, because the catalog entry was referencing a Delta table. Now it is referencing an Athena-compatible external table.

Rahul Sharma

05/05/2023, 8:17 AM
So is there any way to see Delta column metadata in Athena?

Jon Stockham

05/05/2023, 8:18 AM
Yes, with an external table. They are not mutually exclusive; you can have both.
Use the native Delta table catalog entry for your Spark code, and have another catalog table that serves as your external table for Athena.
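
A minimal sketch of the Spark side of that split, assuming a PySpark session with the Delta Lake package configured. The job loads the Delta table by its root path and never touches the manifest-backed Athena table. The path is the table root implied by the manifest LOCATION posted in this thread; the function name `read_cash` is hypothetical, and outside EMR or Glue the scheme may need to be `s3a://`.

```python
# Table root: the manifest LOCATION from this thread without the
# _symlink_format_manifest/ suffix (an inference, not stated in the thread).
DELTA_PATH = "s3://udp-lakehouse-dev/jwr/cash/"


def read_cash(spark, path=DELTA_PATH):
    """Load the Delta table directly, via its transaction log.

    `spark` is assumed to be a SparkSession with Delta Lake configured.
    """
    return spark.read.format("delta").load(path)
```

The Athena external table stays in the catalog for end users who want to browse the schema; only the Spark code stops referencing it.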

Rahul Sharma

05/05/2023, 8:19 AM
%%sql
CREATE EXTERNAL TABLE udp.cash(
  `id` int, 
  `status` string, 
  `value` int, 
  `dt` date)
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' 
LOCATION
  's3://udp-lakehouse-dev/jwr/cash/_symlink_format_manifest/'

Jon Stockham

05/05/2023, 8:19 AM
And just remember to regenerate the manifest whenever you write to the Delta table.
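
As a rough sketch of that regeneration step, assuming a SparkSession with Delta Lake configured: Delta's `GENERATE symlink_format_manifest` SQL command rewrites the manifest files for the table's current snapshot. The helper name `regenerate_manifest` is hypothetical; it would be called after each batch write.

```python
def regenerate_manifest(spark, table_path):
    """Rewrite the _symlink_format_manifest files for the current snapshot.

    `spark` is assumed to be a SparkSession with Delta Lake configured.
    The delta-spark Python API offers an equivalent call:
    DeltaTable.forPath(spark, table_path).generate("symlink_format_manifest")
    """
    statement = f"GENERATE symlink_format_manifest FOR TABLE delta.`{table_path}`"
    spark.sql(statement)
    return statement
```

Until this runs, Athena keeps reading the file list from the previous manifest, so queries there can lag behind the Delta table.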

Rahul Sharma

05/05/2023, 8:20 AM
If I create the native table with a Glue crawler, do I still have to regenerate the manifest file?
Can you give me the command to create a native Delta table catalog entry?

Jon Stockham

05/05/2023, 8:22 AM
You just said this was how you had it before you changed it to an external table.

Rahul Sharma

05/05/2023, 8:24 AM
I created it with the command below:
%%sql
CREATE EXTERNAL TABLE udp.cash(
  `id` int, 
  `status` string, 
  `value` int, 
  `dt` date)
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' 
LOCATION
  's3://udp-lakehouse-dev/jwr/cash/_symlink_format_manifest/'
Is this OK?
I have one more question: if my streaming job is continuously writing data into the refine zone, how do I regenerate the manifest file?

Jon Stockham

05/05/2023, 8:32 AM
I just seem to confuse you more when I try to help; perhaps someone else can explain it more clearly.

Rahul Sharma

05/05/2023, 8:43 AM
I have the following questions:
1. If my streaming job runs continuously, how do I regenerate the manifest file?
2. What is the best way to create an external Delta table?

Jon Stockham

05/05/2023, 8:50 AM
With code: read https://docs.delta.io/latest/presto-integration.html. Or you can perform both tasks with Glue crawlers.
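
A hedged sketch of the "with code" route for both questions, based on the linked Delta documentation. For a continuously running stream, the table property `delta.compatibility.symlinkFormatManifest.enabled` asks Delta to update the manifest on every write, so no separate regeneration job is needed. For the Athena table, the documented DDL also declares `SymlinkTextInputFormat`, which the `CREATE EXTERNAL TABLE` statements posted in this thread do not include. The helper names are hypothetical, and `spark` is assumed to be a SparkSession with Delta Lake configured.

```python
def enable_auto_manifest(spark, table_path):
    """Ask Delta to refresh the symlink manifest on every write to the table."""
    statement = (
        f"ALTER TABLE delta.`{table_path}` "
        "SET TBLPROPERTIES(delta.compatibility.symlinkFormatManifest.enabled=true)"
    )
    spark.sql(statement)
    return statement


def athena_manifest_ddl(table, columns, table_path):
    """Build the Athena DDL for a manifest-backed external table.

    `columns` is a list of (name, type) pairs. The returned statement is
    meant to be run in Athena, not in Spark.
    """
    cols = ",\n".join(f"  `{name}` {dtype}" for name, dtype in columns)
    root = table_path.rstrip("/")
    return (
        f"CREATE EXTERNAL TABLE {table} (\n{cols})\n"
        "ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'\n"
        "STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'\n"
        "OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'\n"
        f"LOCATION '{root}/_symlink_format_manifest/'"
    )
```

Calling `athena_manifest_ddl("udp.cash", [("id", "int"), ("status", "string"), ("value", "int"), ("dt", "date")], "s3://udp-lakehouse-dev/jwr/cash/")` would produce a statement for the table discussed here. Whether that table is partitioned is not stated in the thread; if it is, the documentation describes additional `PARTITIONED BY` and partition-repair steps.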