David Schenk

05/08/2023, 12:55 PM
Hey guys, do any of you have experience setting up the DynamoDB lock with delta-rs in Rust? I have read the documentation and, as far as I can tell, everything is set correctly. But the RecordBatchWriter panicked at flush_and_commit with an ObjectStore { source: Generic { store: "DeltaS3ObjectStore", source: LockClientRequired } }. Do any of you have any ideas or experience with this? I will post some code snippets in the thread to show what my setup looks like.
let mut storage_options = HashMap::new();
storage_options.insert("AWS_REGION".to_string(), "eu-central-1".to_string());
storage_options.insert("AWS_PROFILE".to_string(), "the_profile".to_string());
storage_options.insert("AWS_S3_LOCKING_PROVIDER".to_string(), "dynamodb".to_string());
storage_options.insert("DYNAMO_LOCK_TABLE_NAME".to_string(), "delta_writer_rust".to_string());
storage_options.insert("DYNAMO_LOCK_PARTITION_KEY_VALUE".to_string(), "lockID".to_string());

let object_store = DeltaObjectStore::try_new(
    table_uri.parse().unwrap(),
    StorageOptions::from(storage_options.clone()),
)
.unwrap();
The DynamoDB table was created beforehand, and its PK is set to lockID (type String).
Ok, it seems to have something to do with the storage_options that are passed to the DeltaObjectStore. If I set the configuration parameters as environment variables instead, I get the following panic:
ObjectStore { source: Generic { store: "DeltaS3ObjectStore", source: Dynamo { source: GetItemError(Validation("The provided key element does not match the schema")) } } }
But it looks to me as if the schema definition used by delta-rs is invalid.
Ok, I’ve crawled through the integration tests and found this create_table function: https://github.com/delta-io/delta-rs/blob/0115fbb9b1ad1a6a2a1521d3b05f129fdace7327/rust/tests/common/s3.rs#L27 I recreated my DynamoDB table with "key" as the partition key, and the code now runs without issues. What am I missing here?

rtyler

05/08/2023, 3:59 PM
@David are you just trying to use locking for your own writes? That is all done for you when the appropriate environment variables are set. You should not need to implement any locking yourself. DYNAMO_LOCK_TABLE_NAME set and AWS_S3_LOCKING_PROVIDER set to dynamodb will cover it. Here's some Terraform that sets up the DynamoDB table properly: https://github.com/buoyant-data/lambda-delta-optimize/blob/main/deployment.tf#L113-L127
(there are some environment variables that can be overridden to change the name of the table and the hash key, but this is the default)
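For anyone who hits the same panic, a preflight check along these lines might catch a missing setting before the first write. This is only a sketch: check_lock_settings and check_lock_env are hypothetical helper names, not part of delta-rs, and the check covers nothing beyond the two variables named above. It cannot confirm that the DynamoDB table itself uses "key" (String) as its partition key, which is the name the linked integration test creates.

```rust
use std::env;

/// Hypothetical preflight check (not part of delta-rs): verifies that the two
/// settings named in this thread are present before any write is attempted.
/// `lookup` abstracts over the variable source so the check can be exercised
/// without touching the process environment.
fn check_lock_settings(lookup: impl Fn(&str) -> Option<String>) -> Result<(), Vec<String>> {
    let mut problems = Vec::new();

    // The locking provider has to be set to exactly "dynamodb".
    match lookup("AWS_S3_LOCKING_PROVIDER").as_deref() {
        Some("dynamodb") => {}
        Some(other) => problems.push(format!(
            "AWS_S3_LOCKING_PROVIDER is \"{other}\", expected \"dynamodb\""
        )),
        None => problems.push("AWS_S3_LOCKING_PROVIDER is not set".to_string()),
    }

    // The lock table name has to be present and non-empty.
    match lookup("DYNAMO_LOCK_TABLE_NAME") {
        Some(name) if !name.is_empty() => {}
        _ => problems.push("DYNAMO_LOCK_TABLE_NAME is not set".to_string()),
    }

    if problems.is_empty() {
        Ok(())
    } else {
        Err(problems)
    }
}

/// Runs the same check against the real process environment.
fn check_lock_env() -> Result<(), Vec<String>> {
    check_lock_settings(|name| env::var(name).ok())
}
```

Calling check_lock_env() at startup and printing the returned list would surface a missing variable as a readable message rather than a LockClientRequired panic deep inside the writer.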

David Schenk

05/09/2023, 6:50 AM
Hey @rtyler, thank you, this is exactly what I was searching for. I just tried to set the PK to something I knew, since the expected name is not mentioned in the docs (all other parameters have a documented default value). I figured it out by reading the integration tests, but adding it to the docs might save some hours of research 😄