I am currently facing a design challenge with Delta tables: how to structure the write flow. I am working on an integration between an open-source ML monitoring project developed by my company and Delta lakehouses.
The solution currently supports InfluxDB for data storage; InfluxDB's loose schema requirements, a consequence of how it stores data, make it a good fit for this kind of workload.
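To illustrate what I mean by "loose schema": two monitoring chunks produced by different model versions can carry different fields, and InfluxDB simply stores whatever fields each point has. A minimal sketch with pandas (all column names are hypothetical examples, not our actual metrics):

```python
import pandas as pd

# Chunk produced by model v1 (hypothetical metrics)
chunk_v1 = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-01-01", "2024-01-02"]),
    "model_version": ["v1", "v1"],
    "accuracy": [0.91, 0.89],
})

# Chunk produced by model v2: same metrics plus a new one
chunk_v2 = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-02-01"]),
    "model_version": ["v2"],
    "accuracy": [0.93],
    "drift_score": [0.12],
})

# InfluxDB-style behavior: the stored "schema" is effectively the union
# of all fields ever written; rows missing a field just have no value.
combined = pd.concat([chunk_v1, chunk_v2], ignore_index=True)
print(sorted(combined.columns))
```

This union-of-fields behavior is exactly what Delta's schema enforcement rejects by default, which is the crux of my question below.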
We want to also leverage Delta tables for data storage.
I am having a hard time wrapping my head around the best way to use Delta tables for time-series data where the schema/metadata of every ingested DataFrame/chunk depends on the ML model version and may change from one model version to another. Delta's schema enforcement is a disadvantage in this case. How should I work with that feature? Is it a good idea to create a new table whenever the schema/metadata changes? Is there another course of action? What are your suggestions for dealing with Delta's specificities?
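To make the question concrete, this is the kind of write path I have in mind. A minimal sketch assuming PySpark with the Delta Lake connector, where the `mergeSchema` write option relaxes enforcement on a per-write basis (the table path, session config, and column names are hypothetical):

```python
from pyspark.sql import SparkSession

# Hypothetical Spark session with the Delta Lake connector configured
spark = (
    SparkSession.builder
    .appName("ml-monitoring-ingest")
    .config("spark.sql.extensions",
            "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# One ingested time-series chunk; its columns depend on the model version
# (hypothetical metric names)
chunk_df = spark.createDataFrame(
    [("2024-02-01", "v2", 0.93, 0.12)],
    ["timestamp", "model_version", "accuracy", "drift_score"],
)

# mergeSchema adds any new columns to the table schema on append,
# instead of failing Delta's schema enforcement check
(
    chunk_df.write
    .format("delta")
    .mode("append")
    .option("mergeSchema", "true")
    .save("/lakehouse/monitoring/metrics")  # hypothetical table path
)
```

As I understand it, `mergeSchema` only handles additive changes (new columns); incompatible changes such as a type change on an existing column would still be rejected, which is part of why I am unsure whether one evolving table or a table per schema version is the better design.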
I am pretty new to the Delta ecosystem, and the learning curve is fairly steep. I am happy to answer any questions and clarify what I am trying to achieve.
Thank you for your help!