Sadiq Kavungal

07/04/2023, 7:58 AM
I have a question. We are currently implementing delta-rs and attempting to read a Delta table using Python, where the data is stored in HDFS. However, we haven't come across any documentation on connecting delta-rs to HDFS. Could you please point me to any relevant references or resources?
🔥 1

Jacek

07/04/2023, 8:56 AM
(DISCLAIMER: I’ve never used delta-rs)
If the table is registered in a catalog, it's just a name, so there should be a way to access the table using the name.
Have you tried a URL with hdfs:// or similar?
In other words, if it was not in HDFS, do you know how to access a delta table using delta-rs?
(hoping to learn a bit while helping you 😉 )

Sadiq Kavungal

07/04/2023, 9:17 AM
Thanks for the reply. I have tried delta-rs with Python and we were able to read a Delta table stored locally, but we can't get it to work when connecting to HDFS. In the delta-rs documentation, I could see connection configurations for S3, Azure, and GCP, but not for HDFS.

Jacek

07/04/2023, 9:17 AM
can you share the doc? I’d like to have a look (and perhaps figure out what the URL should be like for hdfs files)

Sadiq Kavungal

07/04/2023, 9:20 AM

Jacek

07/04/2023, 9:20 AM
the basic service provider is derived from the URL being used
deltalake will work with any storage compliant with pyarrow.fs.FileSystem, however the root of the filesystem has to be adjusted to point at the root of the Delta table.
can you check an hdfs:// URL?
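(Again, untested on my side.) Going by the second quoted sentence, a rough sketch of what this might look like is below. The helper names `split_hdfs_url` and `open_delta_table_on_hdfs`, the example host, and the 8020 default port are my own assumptions, not something from the delta-rs docs. It is also an assumption that `DeltaTable` can open an hdfs:// URL to read the transaction log; if it cannot, only the pyarrow part applies.

```python
from urllib.parse import urlparse


def split_hdfs_url(url):
    """Split an hdfs:// URL into (host, port, table_path)."""
    parsed = urlparse(url)
    if parsed.scheme != "hdfs":
        raise ValueError(f"expected an hdfs:// URL, got {url!r}")
    # 8020 is a commonly used NameNode port; this default is an assumption,
    # so check your cluster configuration.
    return parsed.hostname, parsed.port or 8020, parsed.path


def open_delta_table_on_hdfs(url):
    """Untested sketch: read a Delta table through a pyarrow filesystem."""
    # Imported lazily so the URL helper above works without these packages.
    import pyarrow.fs as pa_fs
    from deltalake import DeltaTable

    host, port, table_path = split_hdfs_url(url)
    hdfs = pa_fs.HadoopFileSystem(host, port)
    # Re-root the filesystem at the Delta table, as the quoted docs require.
    table_fs = pa_fs.SubTreeFileSystem(table_path, hdfs)
    # Assumption: DeltaTable accepts this URL for reading the transaction log.
    table = DeltaTable(url)
    return table.to_pyarrow_dataset(filesystem=table_fs)
```

If that works, something like `open_delta_table_on_hdfs("hdfs://namenode:8020/data/my_table").to_table()` should give you a pyarrow table (the host and path here are made up).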

Sadiq Kavungal

07/04/2023, 9:24 AM
Please let me check.
❤️ 1