Hi everyone! At my company we are trying to use Delta Lake on AWS and with Databricks. I am not a fan of Databricks, so I do everything with AWS Glue, but its compatibility with Delta seems strange. I have a hard time using spark-delta with Glue 3 (so no interactive notebooks for me) because I can't seem to set it up. It works fine with Glue 4. How do other people do it? How do you work with Delta and Glue? If you use a crawler, what are the advantages of native tables versus creating a symlink? How do you interact with your Delta tables? I can use Athena, of course, but it is read-only and does not support time travel. That has been fine so far, but it makes one of the cool things about Delta (time travel) a bit unusable. Maybe I am just doing it all wrong. Can anyone share their experiences and tricks for working with #deltalake-on-aws? Thank you 😊
06/17/2023, 6:45 PM
You can run Trino yourself, which Athena is kind of based on, and get all the proper features like insert, merge, alter, analyze, vacuum, optimize, etc. You can even use Glue as your metastore.
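To make the Trino suggestion a bit more concrete, here is a rough sketch of the maintenance statements mentioned above, sent through the `trino` Python client. Everything specific here is an assumption for illustration: a Trino catalog named `delta` (configured with the Delta Lake connector and Glue as its metastore), a schema `sales`, a table `orders`, and a coordinator on `localhost:8080`. Adjust all of them to your own setup.

```python
# Sketch only: catalog "delta", schema "sales", and table "orders" are
# hypothetical names, not anything from the original thread.

def optimize_sql(table: str) -> str:
    """Build the statement that compacts small files in a Delta table."""
    return f"ALTER TABLE {table} EXECUTE optimize"


def vacuum_sql(catalog: str, schema: str, table: str, retention: str = "7d") -> str:
    """Build the call that removes files older than the retention period."""
    return f"CALL {catalog}.system.vacuum('{schema}', '{table}', '{retention}')"


def analyze_sql(table: str) -> str:
    """Build the statement that refreshes table statistics."""
    return f"ANALYZE {table}"


def run(statements, host="localhost", port=8080, user="analyst"):
    """Execute each statement against a Trino coordinator (assumed reachable)."""
    # Imported lazily so the string helpers work without the client installed.
    from trino.dbapi import connect  # pip install trino

    conn = connect(host=host, port=port, user=user, catalog="delta", schema="sales")
    cur = conn.cursor()
    for statement in statements:
        cur.execute(statement)
        cur.fetchall()  # drain the result so the statement runs to completion


if __name__ == "__main__":
    run([
        optimize_sql("sales.orders"),
        vacuum_sql("delta", "sales", "orders"),
        analyze_sql("sales.orders"),
    ])
```

The SQL builders are kept separate from the connection code so the statements can be inspected or logged before anything is sent to the cluster.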