Eyal Perry
08/20/2023, 4:02 PM<http://DataFrame.as|DataFrame.as>[T]
to cast the DataFrame to a case class of the table schema.
Is there some configuration like csv/json reader's mode:DROPMALFORMED
or columnNameOfCorruptRecord
which can natively ignore the specific record instead of just throwing an error which terminates my application?
The ideal would be to emit metrics whenever a corrupt record is encountered, and be able to filter out these records by returning null, which results in a more robust application.
I could always code this with Try + flatMap + transform, but would rather see a more native, performant solution.
P.S. not running on databricks and this is not going to change, so I would appreciate a native spark solution