
Jared Grove

06/18/2023, 9:03 PM
I have an issue with Delta that is only reproducible inside a Docker container. When I run spark-submit on my local host I have no errors. However, I wanted to test this Spark application in Docker, which I guess is still technically my local host, but I want to submit the program from the Docker container. I have five containers: spark-master, two spark-workers, spark-history-server, and a spark-driver. All containers are on the same Docker network. Inside the spark-driver container is where I launch
spark-submit --properties-file ./src/spark/spark-defaults.conf ./src/start_pipeline.py
and I receive the following error:
An error occurred while calling o515.load.
: java.util.concurrent.ExecutionException: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 11.0 failed 4 times, most recent failure: Lost task 0.3 in stage 11.0 (TID 156) (172.23.0.3 executor 3): org.apache.spark.SparkFileNotFoundException: File file:/opt/ufo-lakehouse/lakehouse/ufo/bronze/_delta_log/00000000000000000000.json does not exist
It is possible the underlying files have been updated. You can explicitly invalidate the cache in Spark by running 'REFRESH TABLE tableName' command in SQL or by recreating the Dataset/DataFrame involved.
This file does exist; the Spark program created it! I first thought it might be a permissions issue, so I set all my folders/files to permission 777, but I still have the same error. Any help or guidance would be much appreciated. Thank you!
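The read that fails (the o515.load call) isn't shown in the thread, but from the path in the stack trace it is presumably a Delta load along these lines; a minimal sketch, with the variable name and exact call assumed rather than taken from start_pipeline.py:

# Assumed shape of the failing read; the path comes from the error message above.
# With a file: URI, each executor resolves this path on its OWN local filesystem,
# so /opt/ufo-lakehouse/... has to exist inside the spark-worker containers too,
# not only in the container where the file was created and checked.
bronze_df = (
    spark.read.format("delta")
    .load("file:///opt/ufo-lakehouse/lakehouse/ufo/bronze")
)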

Rahul Sharma

06/19/2023, 4:10 AM
Can you confirm this file exists in the log directory?

Jared Grove

06/19/2023, 2:09 PM
Hello Rahul. Yes, I can confirm the log directory exists inside the Docker container. Here is a snapshot.

Rahul Sharma

06/20/2023, 4:15 AM
@Matthew Powers can you please check?

Jared Grove

06/20/2023, 2:38 PM
I'm not sure if this is a bug and I should post something under issues in the GitHub repo, or if it's something wrong with my Dockerfile / docker-compose. I thought maybe it was a permissions issue, but I tried all sorts of things with permissions in the container and still cannot get it to find the _delta_log. It seems Spark creates the first _delta_log entry successfully when the tables are created, but then can't find it when I go to perform data transformations on the table.
I even set the UID in my Dockerfile to 185, which I read was the official UID for Spark. I'm using the bitnami/spark:3.4.0 image with pyspark 3.4.0 and delta-spark 2.4.0.
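For reference, one way to make that path visible to the workers is to mount the same lakehouse directory into every container, since with file: paths each executor reads from its own container filesystem. A hedged docker-compose sketch; the service names, host path, and mount layout are assumptions, not taken from this thread:

# Sketch only: mount the same host directory at the same path in the driver and
# in both workers, so executors can see the _delta_log the driver wrote.
services:
  spark-worker-1:
    image: bitnami/spark:3.4.0
    volumes:
      - ./lakehouse:/opt/ufo-lakehouse/lakehouse
  spark-worker-2:
    image: bitnami/spark:3.4.0
    volumes:
      - ./lakehouse:/opt/ufo-lakehouse/lakehouse
  spark-driver:
    image: bitnami/spark:3.4.0
    volumes:
      - ./lakehouse:/opt/ufo-lakehouse/lakehouse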