Afonso de Paula Feliciano

08/31/2023, 6:47 PM
Hey guys, do you have any best practices for inserting data from several different users simultaneously into a Delta table? For example, a Delta table partitioned by country, where each user only appends data for their own country (append operations only).
Concurrency Level
Delta Lake supports concurrent reads and append-only writes. To be considered as append-only, a writer must be only adding new data without reading or modifying existing data in any way. Concurrent reads and appends are allowed and get snapshot isolation even when they operate on the same Delta table partition.
In my scenario, each user in each country first needs to run a SELECT to find the latest (maximum) existing date, and then performs the insert operation in append mode.
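The read-then-append pattern described above could look roughly like the PySpark sketch below. This is a minimal illustration, not a recommended implementation: the column names `country` and `event_date`, the `table_path` argument, and both helper functions are hypothetical, and it assumes a Spark session already configured for Delta Lake.

```python
from datetime import date


def rows_after(rows, last_date):
    """Keep only rows strictly newer than the partition's current max date."""
    if last_date is None:  # the partition is still empty
        return list(rows)
    return [r for r in rows if r["event_date"] > last_date]


def append_new_rows(spark, table_path, country, rows):
    """Read the max date for one country, then append only newer rows.

    `rows` is a list of dicts that include the hypothetical columns
    `country` and `event_date`.
    """
    # Local import so rows_after() stays usable without Spark installed.
    from pyspark.sql import functions as F

    # Filtering on the partition column lets Spark skip other countries' files.
    last_date = (
        spark.read.format("delta").load(table_path)
        .where(F.col("country") == country)
        .agg(F.max("event_date").alias("last_date"))
        .first()["last_date"]
    )

    new_rows = rows_after(rows, last_date)
    if not new_rows:
        return 0

    # Plain append: only adds new files, never rewrites existing data.
    spark.createDataFrame(new_rows).write.format("delta").mode("append").save(table_path)
    return len(new_rows)
```

The max date is read from a snapshot, so it reflects the table as of the moment of the read. As long as only one writer appends to a given country's partition, that value cannot be made stale by writers for other countries.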

Vincent Chee

09/01/2023, 9:51 AM
1. You might run into a situation where the largest existing date you read is not actually the largest, due to snapshot isolation. Imagine one writer is updating the existing date while another writer process is reading it.
2. If your app can tolerate eventual consistency, i.e. your app logic won't fail if the largest existing date it sees is sometimes stale, I don't see much of an issue with this. Concurrency in a data lake still has its limitations.

Afonso de Paula Feliciano

09/01/2023, 7:25 PM
@Vincent Chee thank you for your answer, but isolation would not be a problem in this case, right? My data is unique per partition and each user only runs queries against their own partition, so it's not possible for a user from mx to write a new max date for the br partition.

Dominique Brezinski

09/01/2023, 10:29 PM
Concurrent appends are not a problem, so long as they come from the same driver or multi-cluster support is enabled (default in Databricks). There are no conflicts in pure appends.
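To illustrate why pure appends do not conflict, here is a toy model in plain Python. It is not Delta Lake's implementation: `ToyLog` and `blind_append` are hypothetical names, and the lock merely stands in for whatever provides an atomic "create this version only if it does not exist yet" step (which, as I understand it, is what the same-driver or multi-cluster requirement is about).

```python
import threading


class ToyLog:
    """Toy stand-in for a transaction log: one commit per version number."""

    def __init__(self):
        self._lock = threading.Lock()
        self.commits = {}

    def latest_version(self):
        with self._lock:
            return max(self.commits, default=-1)

    def put_if_absent(self, version, files):
        # Atomic: succeeds only if nobody has claimed this version yet.
        with self._lock:
            if version in self.commits:
                return False
            self.commits[version] = files
            return True


def blind_append(log, files):
    # An append that read nothing has nothing to re-validate, so losing the
    # race for a version number just means retrying at the next version.
    while True:
        version = log.latest_version() + 1
        if log.put_if_absent(version, files):
            return version


log = ToyLog()
threads = [
    threading.Thread(target=blind_append, args=(log, [f"country={c}/part-0.parquet"]))
    for c in ("br", "mx", "ar", "cl")
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Every writer ends up with its own version number and no commit is lost, whatever order the threads run in.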

Vincent Chee

09/02/2023, 2:39 AM
I completely misunderstood; I missed that you are querying for the largest value within a single partition. Yeah, a conflict wouldn't occur in this case.

Afonso de Paula Feliciano

09/04/2023, 12:19 PM
Thanks folks for your answers