Simon Thelin

07/14/2023, 10:50 AM
Can one use withWatermark() with delta cdf?

Dominique Brezinski

07/14/2023, 6:36 PM
I have never tried, but watermarks are a Spark Structured Streaming feature, so they should be independent of CDF and the two should work together. However, what is your use case? You should probably be doing a merge from the CDF rather than an aggregation etc. with a watermark.
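For reference, a minimal sketch of what combining the two might look like in PySpark. It assumes `spark` is an existing SparkSession with Delta Lake configured and that the table has change data feed enabled. The helper name `read_cdf_with_watermark`, the 10 minute delay, and the choice of the CDF metadata column `_commit_timestamp` as the event-time column are illustrative assumptions, not something confirmed in this thread.

```python
def read_cdf_with_watermark(spark, table_path, delay="10 minutes"):
    """Stream a Delta table's change feed and attach a watermark.

    `spark` is assumed to be a SparkSession with Delta Lake available.
    `_commit_timestamp` is a metadata column added by the change feed;
    using it as event time is an assumption made for this sketch.
    """
    return (
        spark.readStream.format("delta")
        .option("readChangeFeed", "true")  # read changes, not the table snapshot
        .load(table_path)
        .withWatermark("_commit_timestamp", delay)
    )
```

The watermark only matters if a stateful operation (windowed aggregation, deduplication, stream-stream join) follows. For simply pushing changes forward it has no effect.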

S Thelin

07/16/2023, 9:56 AM
Yeah, exactly what I thought. In my case, I have a CDF app which acts as a stream, but it will run for a certain period and then gracefully shut down. I then want it to continue from the time it shut down, so what I can do is create a TS, which I store and provide to the stream job, since CDF reads support startingTimestamp.
I then had some thoughts about what happens if the app dies in the middle of the compute, or similar, but maybe that is not a big issue.
This job is not doing massive aggregation but rather just pushing forward changes and doing validations. Minimal window function on the input.
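A rough sketch of the store-and-resume idea described above, in Python. The state file, the helper names (`save_timestamp`, `load_timestamp`, `cdf_options`, `read_changes`) and the JSON layout are all hypothetical. Only the `readChangeFeed` and `startingTimestamp` reader options come from Delta itself. The timestamp string is assumed to be interpreted in the Spark session time zone.

```python
import json
import os
import tempfile

TS_FORMAT = "%Y-%m-%d %H:%M:%S"  # timestamp string format passed to Delta


def save_timestamp(state_path, ts):
    """Atomically persist the resume point (call only after a clean run)."""
    directory = os.path.dirname(os.path.abspath(state_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"startingTimestamp": ts.strftime(TS_FORMAT)}, f)
    os.replace(tmp_path, state_path)  # a crash never leaves a half-written file


def load_timestamp(state_path):
    """Return the stored timestamp string, or None on the very first run."""
    if not os.path.exists(state_path):
        return None
    with open(state_path) as f:
        return json.load(f)["startingTimestamp"]


def cdf_options(starting_timestamp):
    """Build the reader options for a change data feed read."""
    opts = {"readChangeFeed": "true"}
    if starting_timestamp is not None:
        opts["startingTimestamp"] = starting_timestamp
    return opts


def read_changes(spark, table_path, state_path):
    """Open the change feed from the stored timestamp (spark: a SparkSession)."""
    opts = cdf_options(load_timestamp(state_path))
    return spark.readStream.format("delta").options(**opts).load(table_path)
```

Because the timestamp is written only after a successful run, an app that dies mid-compute restarts from the older timestamp and replays some changes, so the downstream write should be idempotent (a merge, for example). A Structured Streaming `checkpointLocation` may make the hand-rolled timestamp unnecessary, since a restarted query resumes from its checkpointed offsets.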