#random

Sai Santhosh Tallapally

04/21/2023, 8:22 PM
Hello, I am running an EMR cluster and copying the Delta jar file from S3 to
/usr/lib/spark/jars/
through bootstrap actions. I am able to read and write Delta files. But when I do
import delta
my job fails with
ModuleNotFoundError: No module named 'delta'
Any help is much appreciated 🙂

Ryan Zhu

04/22/2023, 5:10 PM
You are missing the Delta Python package. The jar only provides the JVM classes. You can run
pip install delta-spark
to install it. Otherwise, you cannot use Delta’s Python API.
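A quick way to confirm whether the Python side is in place is to check, from the same interpreter the Spark job uses, whether the module can be found. This is only a minimal sketch using the standard library; the helper name module_available is made up for illustration, and it assumes the job runs under the interpreter that executes this script (on EMR, pip may install into a different Python than the one PySpark picks up).

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be located.

    Hypothetical helper for illustration; it only looks the module up,
    it does not import it.
    """
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Show which interpreter is running, since the package must be
    # installed for this one and not just for some other Python.
    print("Interpreter:", sys.executable)

    if module_available("delta"):
        print("The 'delta' module is importable.")
    else:
        print("The 'delta' module is missing; 'import delta' would fail.")
```

If this reports the module as missing even after the bootstrap step, one possible cause is that pip installed the package for a different interpreter than the one shown.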

Sai Santhosh Tallapally

04/24/2023, 7:35 PM
Thank you @Ryan Zhu. I have updated my bootstrap script to install the
delta-spark=1.0.0
package. I have a few questions: 1. I am running into the following issue once I install the delta-spark package:
Exception in thread "main" java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.amazon.ws.emr.hadoop.fs.EmrFileSystem not found
2. delta-spark=1.0.0 requires pyspark>3.1.3, but I am using spark=3.0.1, so this might cause incompatibility issues.
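The version concern in question 2 can be checked mechanically before the bootstrap script installs anything. The sketch below is not from the thread: the mapping is an assumption based on the compatibility table in the Delta Lake release documentation (0.7.x and 0.8.x for Spark 3.0.x, 1.0.x for Spark 3.1.x) and should be verified there, and the names DELTA_TO_SPARK, minor, and is_compatible are hypothetical.

```python
# Assumed Delta Lake -> Spark minor-version mapping. Verify against the
# Delta Lake release documentation before relying on it.
DELTA_TO_SPARK = {
    "0.7": "3.0",
    "0.8": "3.0",
    "1.0": "3.1",
}


def minor(version: str) -> str:
    """Reduce a version such as '3.0.1' to its 'major.minor' part ('3.0')."""
    return ".".join(version.split(".")[:2])


def is_compatible(delta_version: str, spark_version: str) -> bool:
    """Return True if the assumed table pairs these two versions."""
    expected = DELTA_TO_SPARK.get(minor(delta_version))
    if expected is None:
        raise ValueError(f"Delta version {delta_version} is not in the table")
    return minor(spark_version) == expected


if __name__ == "__main__":
    # The combination from the message above: Delta 1.0.0 on Spark 3.0.1.
    print(is_compatible("1.0.0", "3.0.1"))
    # An older Delta line on the same Spark version.
    print(is_compatible("0.8.0", "3.0.1"))
```

Under that assumed table, Delta 1.0.0 does not pair with Spark 3.0.1, which matches the incompatibility suspected above, while the 0.8.x line does.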