I am currently learning Delta Lake, and I would like to use Delta Lake and Spark SQL on-premises. Specifically, I have CSV files stored in Azure Blob Storage that I would like to load into on-premises Delta tables.
Could you point me to a good starting point, so I don't waste time and effort searching aimlessly?
04/19/2023, 6:18 AM
If this is for learning purposes, you can write Delta tables to your local filesystem. That wouldn't scale for a production deployment, but the code you write will follow the same logic.
I'd also suggest starting from Delta Lake's Dockerfile. Just build the image and you have a containerized environment for testing and learning!
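The container route would look roughly like the following; this is a sketch assuming the Dockerfile from the delta-io/delta-docker repository, and the image name `delta_quickstart` is arbitrary (check that repo's README for the current build instructions).

```shell
# Clone the repo that ships Delta Lake's Dockerfile.
git clone https://github.com/delta-io/delta-docker.git
cd delta-docker

# Build the image (the Dockerfile name/tag may differ; see the README).
docker build -t delta_quickstart .

# Start a shell inside the container, then launch pyspark from there.
docker run --name delta_quickstart --rm -it --entrypoint bash delta_quickstart
```

That gives you a disposable environment with Spark and Delta Lake preinstalled, so nothing needs to be set up on your host machine.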