Sadiq Kavungal

07/03/2023, 8:14 AM
Could you please suggest a solution? We have a 20 GB customer persona dataset in a Delta table, and our in-house application needs to retrieve customer persona data through an API call. We require the results in under 1 second. Is it possible to create an API on top of the Delta table for this purpose?

John Darrington

07/03/2023, 1:22 PM
Wouldn't Delta Sharing be the protocol you want? https://www.databricks.com/product/delta-sharing
If not, you can always stand up Spark and have your API query Spark, which in turn queries Delta. Or write your own API that uses the Python or Rust package to read Delta directly.
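For the "write your own API" route, here is a minimal sketch using the Python `deltalake` package. The column name `customer_id`, the helper `fetch_persona`, and the `PersonaCache` class are hypothetical names chosen for illustration, not something from this thread. Whether a lookup actually comes back in under 1 second depends on how the table is partitioned and how large the files are, so treat this as a starting point to measure against.

```python
from typing import Any, Callable, Dict, Optional

Row = Dict[str, Any]


def fetch_persona(table_uri: str, customer_id: str) -> Optional[Row]:
    """Read one customer's row straight from a Delta table (no Spark)."""
    # Imported lazily so the cache below can be used and tested without it.
    from deltalake import DeltaTable  # pip install deltalake

    # Assumption: the table has a `customer_id` column. The filter lets the
    # reader skip files and row groups that cannot contain the requested id.
    table = DeltaTable(table_uri).to_pyarrow_table(
        filters=[("customer_id", "=", customer_id)]
    )
    rows = table.to_pylist()
    return rows[0] if rows else None


class PersonaCache:
    """Tiny in-process cache so repeated lookups do not rescan the table."""

    def __init__(self, loader: Callable[[str], Optional[Row]]) -> None:
        self._loader = loader
        self._rows: Dict[str, Optional[Row]] = {}

    def get(self, customer_id: str) -> Optional[Row]:
        # Only call the (slow) loader the first time an id is requested.
        if customer_id not in self._rows:
            self._rows[customer_id] = self._loader(customer_id)
        return self._rows[customer_id]
```

You could wire it up as `cache = PersonaCache(lambda cid: fetch_persona("/path/to/table", cid))` and call `cache.get(...)` from whatever web framework serves the API. The cache never expires entries, so a real service would need some invalidation when the table is updated.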

Sadiq Kavungal

07/03/2023, 2:05 PM
Thanks for the reply. Can we use Delta Sharing on-premises?
🙏 1

shingo

07/03/2023, 3:24 PM
Though this is a reference implementation, you can deploy the server on-premise: https://github.com/delta-io/delta-sharing

John Darrington

07/03/2023, 5:51 PM
isn't there someone working on a Rust delta sharing server too?

shingo

07/03/2023, 6:10 PM

John Darrington

07/03/2023, 6:48 PM
😄
👍 1

Sadiq Kavungal

07/04/2023, 5:54 AM
❤️🙂

陳冠穎

08/08/2023, 8:10 AM
@shingo Could we set up a Rust delta sharing server in an on-premises environment and read on-premises HDFS paths? Thanks for your answer! 😍

shingo

08/23/2023, 6:59 PM
@陳冠穎 Thank you for the mention and the suggestion! Unfortunately, the current implementation does not support on-premises storage, because the Delta Sharing protocol relies on secure, short-lived pre-signed URLs for sharing data, and as far as I know HDFS does not provide them. You could implement your own short-lived pre-signed URL generator module for on-premises (HDFS-based) data sharing, but I believe there are some additional considerations, and it would be a separate crate from delta-sharing-rs.
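To make the "pre-signed short-lived URL" idea concrete, here is a rough sketch of one common way to build such URLs: an HMAC signature over the file path plus an expiry timestamp. This is not how delta-sharing-rs or any cloud store does it; `SECRET_KEY`, `sign_url`, `verify_url`, and the example host are all hypothetical, and a real module would also need a file server in front of HDFS that runs the verification.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical shared secret, known only to the signer and the file server.
SECRET_KEY = b"replace-with-a-real-secret"


def _signature(path: str, expires: int) -> str:
    # Sign the path together with the expiry so neither can be altered.
    message = f"{path}\n{expires}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(base_url: str, path: str, ttl_seconds: int = 300,
             now: Optional[float] = None) -> str:
    """Return a URL for `path` that stops verifying after `ttl_seconds`."""
    issued = time.time() if now is None else now
    expires = int(issued) + ttl_seconds
    query = urlencode({"expires": expires,
                       "signature": _signature(path, expires)})
    return f"{base_url}{path}?{query}"


def verify_url(url: str, now: Optional[float] = None) -> bool:
    """Check that the URL is unexpired and its signature matches its path."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, IndexError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    # Constant-time comparison to avoid leaking signature bytes via timing.
    return hmac.compare_digest(signature, _signature(parts.path, expires))
```

The sharing server would call `sign_url(...)` for each data file it hands to a client, and the file server would call `verify_url(...)` before streaming the bytes out of HDFS.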
🙆‍♀️ 1