Robert Thompson

04/04/2023, 4:01 PM
Curious: is anyone thinking about the ability to mask a column in Delta? I know you can do it at the compute level using Unity Catalog and Databricks, but that doesn't apply if you read the table from a Rust endpoint.

Lennart Skogmo

04/04/2023, 4:04 PM
Encryption could be a solution then. That would allow you to read the data without being any wiser.

Robert Thompson

04/04/2023, 4:07 PM
I think I already know the answer is to just create 2 tables so I can put the security context on the whole table. But I was just curious if anyone was thinking about column-level security as a construct of the table instead of compute.

rtyler

04/04/2023, 4:09 PM
We're going the "two table" route basically. Duplicate storage is cheap and easy to manage compared to juggling encryption keys for different workloads 😛
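
A minimal sketch of the "two table" route, assuming rows are plain dicts and a hypothetical sensitive column named `ssn` (neither comes from the thread). In practice each output would be written as its own Delta table under a separate storage prefix, so access can be granted per table:

```python
# Hypothetical column name, used only for illustration.
SENSITIVE_COLUMNS = {"ssn"}


def split_tables(rows, sensitive=SENSITIVE_COLUMNS):
    """Return (restricted_rows, public_rows).

    restricted_rows keeps every column; public_rows drops the sensitive ones.
    """
    restricted = [dict(row) for row in rows]
    public = [
        {name: value for name, value in row.items() if name not in sensitive}
        for row in rows
    ]
    return restricted, public


rows = [
    {"id": 1, "name": "a", "ssn": "000-00-0001"},
    {"id": 2, "name": "b", "ssn": "000-00-0002"},
]
restricted, public = split_tables(rows)
print(public)
```

The cost is duplicated storage for the non-sensitive columns; the benefit is that the security boundary is the whole table, which any reader (including a Rust one) respects by construction.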

Lennart Skogmo

04/04/2023, 4:09 PM
I'm not an expert but if you start peeling away and circumventing layers of the system that are protecting the data, it would seem that in the end you can only rely on encryption as that is the only thing that can protect raw data.

Robert Thompson

04/04/2023, 4:13 PM
I think I see what you are saying: encrypt the column, and if they have the correct creds they could have access to the decrypt key.

rtyler

04/04/2023, 4:16 PM
@Lennart Skogmo I think it really depends on the "threat model" for trying to mask, encrypt, or otherwise protect data. In the context of S3, yes, you can encrypt everything at rest, but practically speaking, IAM-based access controls on the data, which allow access grants to be scoped to prefixes in S3 buckets, will prevent most forms of undesirable data access. I'm still skeptical of column-level access controls for a myriad of reasons. I have had people working with me state unequivocally that if we go down the rabbit hole of column-level RBAC they would be leaving the organization 😆
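
As a rough illustration of prefix-scoped IAM grants, the sketch below builds a read-only policy document for one table prefix. The bucket name, prefix, `Sid` values, and the helper `read_only_prefix_policy` are all hypothetical; only the policy document structure follows the standard IAM JSON format:

```python
import json


def read_only_prefix_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy document granting read access to one S3 prefix."""
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow reading objects under the table's prefix only.
                "Sid": "ReadTablePrefix",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}/*"],
            },
            {
                # Allow listing the bucket, restricted to the same prefix.
                "Sid": "ListTablePrefix",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
        ],
    }


# Hypothetical bucket and table location.
policy = read_only_prefix_policy("example-lake", "tables/customers_public")
print(json.dumps(policy, indent=2))
```

Combined with the two-table layout, a workload that should not see the sensitive columns would only be granted the public table's prefix.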

Lennart Skogmo

04/04/2023, 4:25 PM
@rtyler I agree. I was mostly just following the idea of minimal system support for masking data to its logical conclusion 😛 I have actually never tried to implement a solution based on value encryption, nor do I have any strong opinions on it. One cool aspect that I do like the idea of, however, is the ability to effectively delete data without updating files later on, by deleting decryption keys that are stored separately. It can turn your system into write-once if writes are costly and inconvenient 😛
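
The "delete the key instead of the data" idea can be sketched as below. Everything here is an assumption for illustration: the `KeyStore` class, the per-subject key granularity, and especially the cipher, which is a toy SHA-256 keystream used only to keep the example dependency-free. A real system would use a vetted authenticated cipher such as AES-GCM from a maintained library.

```python
import hashlib
import secrets
from typing import Optional


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). NOT for production use."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


class KeyStore:
    """Hypothetical per-subject key store, kept separate from the table files."""

    def __init__(self) -> None:
        self._keys = {}

    def key_for(self, subject_id: str) -> bytes:
        # Create the key on first use.
        if subject_id not in self._keys:
            self._keys[subject_id] = secrets.token_bytes(32)
        return self._keys[subject_id]

    def get(self, subject_id: str) -> Optional[bytes]:
        return self._keys.get(subject_id)

    def shred(self, subject_id: str) -> None:
        # Deleting the key is the "delete": stored ciphertext becomes unreadable.
        self._keys.pop(subject_id, None)


def encrypt_value(store: KeyStore, subject_id: str, plaintext: str) -> bytes:
    key = store.key_for(subject_id)
    nonce = secrets.token_bytes(16)
    data = plaintext.encode("utf-8")
    stream = _keystream(key, nonce, len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


def decrypt_value(store: KeyStore, subject_id: str, blob: bytes) -> Optional[str]:
    key = store.get(subject_id)
    if key is None:
        return None  # key was shredded, value is no longer recoverable
    nonce, ciphertext = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream)).decode("utf-8")


store = KeyStore()
blob = encrypt_value(store, "user-1", "alice@example.com")  # value written to the table
before = decrypt_value(store, "user-1", blob)
store.shred("user-1")  # no table files are rewritten
after = decrypt_value(store, "user-1", blob)
```

The table files stay untouched after `shred`, which is what makes the approach attractive when rewrites are expensive; the trade-off is that the key store becomes the thing that has to be protected, backed up, and audited.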

Jim Hibbard

04/04/2023, 5:43 PM
In the past I've either duplicated the whole table or had a second table to join on that had encrypted values. Both worked well 🤷‍♂️

Robert Thompson

04/10/2023, 7:19 PM
So just wondering, @Lennart Skogmo, do you have any experience encrypting using the Rust Delta writer?