
Naveen Kumar Vadlamudi

06/17/2023, 6:03 PM
Hello all, I am a graduate student currently researching Delta Lake, and I have a few questions regarding the lakehouse architecture. 1) Can we consider the lakehouse a third-generation cloud data warehouse? If so, why didn't the authors mention that point explicitly in the Delta Lake paper (https://www.vldb.org/pvldb/vol13/p3411-armbrust.pdf) or in the following one (https://cs.stanford.edu/~matei/papers/2021/cidr_lakehouse.pdf)? Can anyone from the team confirm whether this inference is valid, or whether it differs from my perception? Sorry for posting this question here; please ignore it if this academic question is irrelevant.

Denny Lee

06/18/2023, 12:36 AM
Hi @Naveen Kumar Vadlamudi - different people define the waves of business data systems differently. Coming from a background of helping to build relational database / DW [v1] systems (in my case, Microsoft SQL Server), then building data lake [v2] systems (personally, Hadoop, HDInsight, and Spark), and then building lakehouse systems [v3] (e.g. fast query engines like Spark 3.0 and then Photon + Delta Lake) - yes, I almost agree with your assessment.
Note that the examples I'm using are from my own personal perspective - others will disagree (e.g. horizontal + vertical scale systems à la HPC).
The reason I'm saying "almost" is that the lakehouse is meant to be the best of both worlds: the "data warehouse" [v1] + the "data lake" [v2]. Succinctly, take the simplicity, reliability, and structure of a data warehouse and combine them with the scalability and extensibility of a data lake. While commonly associated with the cloud, the lakehouse does not actually require the cloud. As well, cloud data warehouses themselves often lack one or more of the components of a lakehouse. That said, I'm sure there are valid debates one way or another related to all of this, and perhaps we can do another AMA discussion on it, as this answer is probably already too long, eh?!

Naveen Kumar Vadlamudi

06/18/2023, 1:01 AM
I just need to keep to the authors' perspective for now, because I am summarizing their work. But in the future I might reuse this terminology in my own research, when citing it in further work. So that's the reason I would like to confirm with the team of experts who have contributed to the project. By the way, thanks for the detailed response @Denny Lee, it helps me a lot.

Denny Lee

06/18/2023, 1:06 AM
Ah, while I cannot speak for the authors themselves, this perspective is in line with their work.
👍 1