
Evandro Lippert

06/19/2023, 11:47 AM
Hey, I recently posted a question on Stack Overflow regarding an issue I'm facing with the delta-rs library. I'm trying to create a Delta table based on a Parquet file. Here's what I'm doing: first, I load the Parquet file using Arrow and extract its schema. I then attempt to create a Delta table, aiming to mirror the Parquet file's schema as closely as possible. However, when I attempt to insert the data from the Parquet file into the Delta table, I encounter the following error:
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Generic("Updating table schema not yet implemented")
I tried to follow these two articles (1, 2), but I haven't been able to solve it, and I haven't found a straightforward way to create the Delta table directly from the Parquet file. I would greatly appreciate any assistance or pointers on how to resolve this issue. Here is the link to the Stack Overflow question, with more details and the code: https://stackoverflow.com/questions/76506007/error-when-trying-to-generate-a-delta-table-from-a-parquet-file-using-delta-rs-l Thank you in advance for your help.

Yousry Mohamed

06/20/2023, 10:45 PM
Using Spark, try printing the schema of the Parquet file and the schema of the created Delta table, and compare them first.
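A minimal sketch of that comparison, written in plain Rust rather than Spark since the rest of the thread uses delta-rs. It assumes each schema has already been flattened into (column name, type name) pairs; the `schema_diff` helper, the column names, and the type strings are all hypothetical and are not part of the delta-rs or Arrow APIs.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: list every column whose type differs between two
/// schemas, or that exists in only one of them.
/// Each tuple is (column name, type on the left, type on the right).
fn schema_diff(
    left: &[(&str, &str)],
    right: &[(&str, &str)],
) -> Vec<(String, Option<String>, Option<String>)> {
    let l: BTreeMap<&str, &str> = left.iter().copied().collect();
    let r: BTreeMap<&str, &str> = right.iter().copied().collect();
    let mut diffs = Vec::new();

    // Columns present on the left: report a type mismatch or a missing right side.
    for (name, l_ty) in &l {
        match r.get(name) {
            Some(r_ty) if r_ty == l_ty => {}
            other => diffs.push((
                name.to_string(),
                Some(l_ty.to_string()),
                other.map(|t| t.to_string()),
            )),
        }
    }

    // Columns present only on the right.
    for (name, r_ty) in &r {
        if !l.contains_key(name) {
            diffs.push((name.to_string(), None, Some(r_ty.to_string())));
        }
    }
    diffs
}

fn main() {
    // Illustrative schemas only; the real ones come from the Parquet file
    // and from the Delta table metadata.
    let parquet = [("id", "int64"), ("event_time", "timestamp[ns]")];
    let delta = [("id", "int64"), ("event_time", "timestamp[us]")];

    for (name, p, d) in schema_diff(&parquet, &delta) {
        println!("{name}: parquet={p:?} delta={d:?}");
    }
}
```

Any column printed by this loop is a candidate for the "Updating table schema not yet implemented" error, because the write would require a schema change.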

Evandro Lippert

06/21/2023, 5:58 PM
Hello! I believe I've found the problem. In the Parquet file, the timestamp is in nanoseconds, but when I created the columns in the Delta table, they were set to microseconds by default. I think this discrepancy was causing the issue. When I removed this column, the lines were written successfully. Now, the simplest solution might be to change the datatype in the Parquet file. P.S. Your texts are really good; they've helped me a lot. Thank you!
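As an alternative to dropping the column, the values could be converted before writing. The sketch below uses only the Rust standard library and assumes the timestamp column has already been read into a slice of optional `i64` nanosecond values; the function names are hypothetical. In a real pipeline the same conversion would more likely be done by casting the Arrow column to a microsecond timestamp type.

```rust
/// Number of nanoseconds in one microsecond.
const NANOS_PER_MICRO: i64 = 1_000;

/// Convert a nanosecond timestamp to microseconds.
/// `div_euclid` floors toward negative infinity, so timestamps before the
/// epoch are truncated in the same direction as timestamps after it.
fn nanos_to_micros(nanos: i64) -> i64 {
    nanos.div_euclid(NANOS_PER_MICRO)
}

/// Convert a whole column, keeping nulls (`None`) as nulls.
fn column_nanos_to_micros(column: &[Option<i64>]) -> Vec<Option<i64>> {
    column.iter().map(|v| v.map(nanos_to_micros)).collect()
}

fn main() {
    // Illustrative values: a 2023 timestamp, a null, and 1 ns before the epoch.
    let nanos = vec![Some(1_687_197_600_123_456_789), None, Some(-1)];
    let micros = column_nanos_to_micros(&nanos);
    println!("{micros:?}");
}
```

The sub-microsecond digits are discarded, so this is only appropriate when that precision is not needed downstream.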

Yousry Mohamed

06/21/2023, 10:27 PM
Thank you, Evandro. Funny thing: I have a similar issue at my current client. Their Parquet files are created using arrow in R, so timestamps are stored in the newer INT64 mode, and Azure Data Factory struggles to handle them, let alone the timezone issues 😂

Evandro Lippert

06/22/2023, 7:51 PM
Here, I'm just trying to replace Spark for the ingestion from the landing layer to the bronze layer. But I'm really new to Rust, so I'm struggling to do that.