Stefano Lori

02/14/2023, 7:31 AM
Hi all, I'm running into issues when trying to consume from Kafka using Structured Streaming and write the data to Delta Lake. The Delta libraries are loaded in a Docker image I created to run Spark on K8s using the Spark operator. The problem appears when we add the following dependency to the pom:
groupId = org.apache.spark
artifactId = spark-sql-kafka-0-10_2.12
version = 3.3.1
The application runs locally without problems, but when deployed on a K8s cluster it fails with the exception:
Caused by: java.lang.ClassNotFoundException: delta.DefaultSource
Has anybody experienced something like this? Thanks. Logs here:
2023-02-13 16:40:37.990Z  INFO  org.apache.spark.storage.BlockManagerMaster:61 - BlockManagerMaster stopped
2023-02-13 16:40:37.996Z  INFO  org.apache.spark.scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:61 - OutputCommitCoordinator stopped!
Exception in thread "main" java.lang.ClassNotFoundException:
Failed to find data source: delta. Please find packages at
https://spark.apache.org/third-party-projects.html

	at org.apache.spark.sql.errors.QueryExecutionErrors$.failedToFindDataSourceError(QueryExecutionErrors.scala:587)
	at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:675)
	at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:725)
	at org.apache.spark.sql.DataFrameWriter.lookupV2Provider(DataFrameWriter.scala:864)
	at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:256)
	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:239)
	at ....
	at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
	at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
	at com.izicap.delta.loader.DeltaLoaderTableLauncher$.run(DeltaLoaderTableLauncher.scala:155)
	at com.izicap.delta.loader.DeltaLoaderTableLauncher$.main(DeltaLoaderTableLauncher.scala:178)
	at com.izicap.delta.loader.DeltaLoaderTableLauncher.main(DeltaLoaderTableLauncher.scala)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.base/java.lang.reflect.Method.invoke(Unknown Source)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: delta.DefaultSource
	at java.base/java.net.URLClassLoader.findClass(Unknown Source)
	at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
	at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
	at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$5(DataSource.scala:661)
	at scala.util.Try$.apply(Try.scala:213)
	at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$4(DataSource.scala:661)
	at scala.util.Failure.orElse(Try.scala:224)
	at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:661)
	... 32 more
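
A note on reading this trace: Spark resolves a short format name such as `delta` through `java.util.ServiceLoader`, using the `META-INF/services/org.apache.spark.sql.sources.DataSourceRegister` files on the classpath, and only falls back to loading a class named `delta.DefaultSource` when no registered provider matches. One possible cause, which is an assumption here since the packaging is not described above, is that the application is built as a single fat jar and the service file contributed by the Kafka connector replaced the one from Delta during the merge. The sketch below is a hypothetical diagnostic helper (the class name `DataSourceRegisterCheck` is made up for illustration) that prints the providers a given jar registers; it uses only the JDK.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.zip.ZipEntry;

// Hypothetical helper, not part of Spark or Delta Lake.
final class DataSourceRegisterCheck {
    // Service file that Spark reads through ServiceLoader to map short names to provider classes.
    static final String SERVICE_ENTRY =
        "META-INF/services/org.apache.spark.sql.sources.DataSourceRegister";

    /** Returns the provider class names listed in the jar's service file, or an empty list if the file is absent. */
    static List<String> listProviders(String jarPath) throws IOException {
        List<String> providers = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            ZipEntry entry = jar.getEntry(SERVICE_ENTRY);
            if (entry == null) {
                return providers; // the jar registers no data sources
            }
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(jar.getInputStream(entry), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    // Service files allow '#' comments and blank lines; skip both.
                    int hash = line.indexOf('#');
                    if (hash >= 0) {
                        line = line.substring(0, hash);
                    }
                    line = line.trim();
                    if (!line.isEmpty()) {
                        providers.add(line);
                    }
                }
            }
        }
        return providers;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DataSourceRegisterCheck <path-to-application-jar>");
            return;
        }
        for (String provider : listProviders(args[0])) {
            System.out.println(provider);
        }
    }
}
```

If the output for the application jar lists `org.apache.spark.sql.kafka010.KafkaSourceProvider` but not `org.apache.spark.sql.delta.sources.DeltaDataSource`, the service files were probably not merged at build time. If Delta is instead shipped as a separate jar in the image, the same check can be run against that jar to confirm it is the one actually present on the cluster.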