rtyler

04/13/2023, 2:51 PM
@Will Jones For some more Delta-related context on this question you got mentioned in: I basically cannot run our delta-rs optimize on Spark-created parquet files with a primitive("timestamp") column in the schema, because the ParquetReader gives Timestamp(nanosecond) from the int96 column, but the Arrow schema delta-rs thinks it should be using says Timestamp(microsecond), which I believe to be the correct interpretation here. My gut feel is that this will be the case for parquet files written by Delta/Spark, but I'm not sure where the best place to introduce the necessary conversion is. Putting it in our writer feels correct, but I think we would still have other readers not doing the right thing on this type either.
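A minimal sketch of the mismatch being described here, assuming the arrow crate's array, compute, and datatypes modules: the Parquet reader surfaces the INT96 column as nanosecond timestamps, while the Delta table's Arrow schema expects microseconds, and arrow's cast kernel can convert between the two.

```rust
use std::sync::Arc;

use arrow::array::{ArrayRef, TimestampNanosecondArray};
use arrow::compute::cast;
use arrow::datatypes::{DataType, TimeUnit};
use arrow::error::ArrowError;

fn main() -> Result<(), ArrowError> {
    // What the Parquet reader hands back for an INT96 column: nanosecond precision.
    let from_parquet: ArrayRef =
        Arc::new(TimestampNanosecondArray::from(vec![1_681_397_460_000_000_000_i64]));

    // What the Delta table's Arrow schema declares: microsecond precision.
    let expected = DataType::Timestamp(TimeUnit::Microsecond, None);

    // arrow's cast kernel performs the unit conversion.
    let converted = cast(&from_parquet, &expected)?;
    assert_eq!(converted.data_type(), &expected);
    Ok(())
}
```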

Will Jones

04/13/2023, 3:04 PM
Why put the conversion in the writer and not in the reader?
The one thing I’m not sure of: do Spark and other engines care if we write back files with microsecond timestamps when there already exist files with int96 nanosecond timestamps?

rtyler

04/13/2023, 3:05 PM
In the case of optimize, the reader is just the reader from the Parquet crate itself. I don't know how we'd appropriately convert there since this seems like a delta problem
I was hoping to get a pointer to where the parquet write happens in the delta/spark implementation since I wasn't able to find it last night

Will Jones

04/13/2023, 3:12 PM
I think on the reader side, you could cast each record batch as it gets read
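One rough shape of that read-side approach, assuming the parquet crate's ParquetRecordBatchReaderBuilder together with arrow's cast kernel; the function name read_casted and its arguments are made up for illustration, and it assumes the batch's columns line up positionally with the target schema.

```rust
use std::fs::File;

use arrow::compute::cast;
use arrow::datatypes::SchemaRef;
use arrow::record_batch::RecordBatch;
use parquet::arrow::arrow_reader::ParquetRecordBatchReaderBuilder;

// Read a Parquet file and cast every batch to the Delta table's Arrow schema
// as it comes off the reader, so downstream code never sees the INT96-derived
// nanosecond timestamps.
fn read_casted(
    file: File,
    delta_schema: SchemaRef,
) -> Result<Vec<RecordBatch>, Box<dyn std::error::Error>> {
    let reader = ParquetRecordBatchReaderBuilder::try_new(file)?.build()?;
    let mut batches = Vec::new();
    for batch in reader {
        let batch = batch?;
        // Cast column-by-column (assumed to match the schema positionally)
        // to the types the Delta schema declares.
        let columns = batch
            .columns()
            .iter()
            .zip(delta_schema.fields().iter())
            .map(|(column, field)| cast(column, field.data_type()))
            .collect::<Result<Vec<_>, _>>()?;
        batches.push(RecordBatch::try_new(delta_schema.clone(), columns)?);
    }
    Ok(batches)
}
```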

rtyler

04/13/2023, 3:16 PM
At least for optimize specifically that would be here, but I think this would be a generalized problem, no?
A colleague pointed me to https://github.com/delta-io/delta-rs/issues/441, which is in the same vein and which you also commented on 😆

Will Jones

04/13/2023, 4:00 PM
Yes, the general problem is why I created this issue: https://github.com/apache/arrow-datafusion/issues/5950
IMO PyArrow gets this right: you can pass the Arrow schema you want and the reader will do its best to read into / cast to that schema.

rtyler

04/13/2023, 7:04 PM
What do you think about mimicking Spark's functionality and just writing INT96 by default for timestamp? 😒

Will Jones

04/13/2023, 7:11 PM
Do newer versions still do that? I thought they had deprecated and gotten rid of it in newer versions
or maybe I am mixing that up
• Since Spark 3.0, parquet logical type TIMESTAMP_MICROS is used by default while saving TIMESTAMP columns. In Spark version 2.4 and earlier, TIMESTAMP columns are saved as INT96 in parquet files. To set INT96 to spark.sql.parquet.outputTimestampType restores the previous behavior.
IMO writing int96 should be unnecessary. I think it’s the read that we need to fix, right?
unless Spark can’t read tables that are a mix of the two timestamp types. I haven’t tested that

rtyler

04/13/2023, 7:35 PM
unfortunately this parquet file was written by Spark 3.2 last week 🤦 I need to figure that out 🙁

Robert

04/13/2023, 7:38 PM
I would agree that passing a schema to the reader and having the reader try to cast all batches to that schema seems like the most desirable way. In particular, within an operation the only thing we really have full control over is the read path, since we cannot control all writers, right?

rtyler

04/13/2023, 8:13 PM
The "reader" in this case is the ParquetRecordBatchReader, so if we wanted to address this we would have to modify that API or provide an alternative. With RecordBatch we can allegedly override the schema via .with_schema, but I'm not sure just yet what that might do with this data set 🤔

Will Jones

04/13/2023, 8:25 PM
I don’t think with_schema will work; it looks like it only handles metadata and will fail if data types change. I think we’ll need to write our own function that casts record batches by traversing the schema and applying functions from https://docs.rs/arrow-cast/37.0.0/arrow_cast/cast/fn.cast.html
IMO we should submit that function upstream later if we can.
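A sketch of what such a function might look like against the arrow-cast 37 cast kernel linked above; the name cast_record_batch is illustrative, it matches columns by name, and nested types (structs, lists, maps) would still need recursive handling.

```rust
use arrow_array::RecordBatch;
use arrow_cast::cast::cast;
use arrow_schema::{ArrowError, SchemaRef};

/// Cast every column of `batch` to the corresponding field type in `target`,
/// matching columns by name and erroring if a target column is missing.
fn cast_record_batch(batch: &RecordBatch, target: SchemaRef) -> Result<RecordBatch, ArrowError> {
    let columns = target
        .fields()
        .iter()
        .map(|field| {
            let column = batch.column_by_name(field.name()).ok_or_else(|| {
                ArrowError::SchemaError(format!("missing column {}", field.name()))
            })?;
            cast(column, field.data_type())
        })
        .collect::<Result<Vec<_>, _>>()?;
    RecordBatch::try_new(target, columns)
}
```

Matching by name sidesteps column-ordering differences between files, which seems safer for optimize-style rewrites across many files.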

rtyler

04/13/2023, 8:27 PM
Heh, I was just looking at that. It was introduced in arrow 37 too, which would mean I'd have to finish our upgrade to use it anyways. What the hell 😆