S Thelin

04/15/2023, 5:04 PM
Hello. I was trying to test delta-rs 0.8.1 with a local k8s MinIO. I get:
thread '<unnamed>' panicked at 'not stream', /Users/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/object_store-0.5.5/src/aws/credential.rs:173:27
File "/Users/simon/Library/Caches/pypoetry/virtualenvs/heureka-MBcEyvac-py3.9/lib/python3.9/site-packages/deltalake/table.py", line 122, in __init__
    self._table = RawDeltaTable(
pyo3_runtime.PanicException: not stream
storage_options = {
    "AWS_ACCESS_KEY_ID": os.environ["AWS_ACCESS_KEY_ID"],
    "AWS_SECRET_ACCESS_KEY": os.environ["AWS_SECRET_ACCESS_KEY"],
    "AWS_REGION": aws_region,
    # I also tried trino-minio-svc.trino:9000 and host.docker.internal:30000
    "AWS_ENDPOINT_URL": "localhost:30000",
}
I know my Delta table in MinIO works fine because I also use it with Trino. I was keen to test against MinIO here instead of AWS S3.
DeltaTable(
    table_uri=table_uri,
    storage_options=self.storage_options,
    version=version,
)
Has anyone encountered this? I found https://github.com/delta-io/delta-rs/issues/809, which seems to still be open.

Cole MacKenzie

04/15/2023, 5:18 PM
Is your MinIO server running over HTTPS? You might have to specify AWS_STORAGE_ALLOW_HTTP=1 if you are just using HTTP over localhost.
👀 1

S Thelin

04/15/2023, 5:21 PM
I run over HTTP in this case. I get the same issue. Currently tried this:
os.environ["AWS_S3_ALLOW_UNSAFE_RENAME"] = "true"
os.environ["AWS_STORAGE_ALLOW_HTTP"] = "1"

return {
    "AWS_ACCESS_KEY_ID": os.environ["AWS_ACCESS_KEY_ID"],
    "AWS_SECRET_ACCESS_KEY": os.environ["AWS_SECRET_ACCESS_KEY"],
    "AWS_REGION": aws_region,
    "AWS_ENDPOINT_URL": "host.docker.internal:30000",
}
I also tried putting those settings inside storage_options, with the same result.
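The endpoint values tried up to this point are bare host:port strings with no scheme, and an http:// prefix is part of the configuration that the thread later reports as working. A minimal sketch of a guard that makes the scheme explicit before the value goes into storage_options is shown here. The helper name ensure_scheme is hypothetical and not part of deltalake, and defaulting to http is an assumption that only suits a local, non-TLS MinIO.

```python
def ensure_scheme(endpoint: str, default_scheme: str = "http") -> str:
    """Return endpoint unchanged if it already has a scheme, else prefix one.

    Hypothetical helper for illustration; it is not part of deltalake.
    """
    endpoint = endpoint.strip()
    # A plain substring check is used on purpose: URL parsers can mistake
    # the host in "localhost:30000" for a scheme.
    if "://" in endpoint:
        return endpoint
    return f"{default_scheme}://{endpoint}"


if __name__ == "__main__":
    for raw in ("localhost:30000", "http://host.docker.internal:30000"):
        print(raw, "->", ensure_scheme(raw))
```

The result can then be used as the AWS_ENDPOINT_URL value. This only rules out a missing scheme; it does not check that the host and port are reachable.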

Cole MacKenzie

04/15/2023, 5:31 PM
Have you tried prefixing your AWS_ENDPOINT_URL with http://?
"AWS_ENDPOINT_URL": "http://host.docker.internal:30000"

S Thelin

04/17/2023, 8:13 PM
Yes, I tried that as well. However, I got it working now @Cole MacKenzie. I defined my service like this:
apiVersion: v1
kind: Service
metadata:
  name: trino-minio-svc
  namespace: trino
spec:
  type: NodePort
  ports:
    - name: "9000"
      port: 9000
      targetPort: 9000
      nodePort: 30000
    - name: "9001"
      port: 9001
      targetPort: 9001
      nodePort: 30001
  selector:
    app: minio
I then configured it like this:
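import os
from typing import Dict

@staticmethod  # lives on the class that opens the table
def _setup_storage_options(aws_region: str) -> Dict[str, str]:
    # Permit S3 writes without a locking provider and allow plain HTTP.
    os.environ["AWS_S3_ALLOW_UNSAFE_RENAME"] = "true"
    os.environ["AWS_STORAGE_ALLOW_HTTP"] = "1"

    return {
        "AWS_ACCESS_KEY_ID": os.environ["AWS_ACCESS_KEY_ID"],
        "AWS_SECRET_ACCESS_KEY": os.environ["AWS_SECRET_ACCESS_KEY"],
        "AWS_REGION": aws_region,
        # NodePort 30000 forwards to port 9000 on the MinIO service.
        "AWS_ENDPOINT_URL": "http://localhost:30000",
    }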
This worked. In my local browser I use port 30001 to access the UI, but to make the calls I have to use port 9000, which is exposed as NodePort 30000.
I then also tried "trino-minio-svc.trino:9001", but to my surprise that did not work. I would have expected it to be picked up, but there is something going on here with http versus https. And if you add http:// to the service.namespace address, it fails. If I configure an ingress, I am sure that would work as well.
So TL;DR: with AWS_STORAGE_ALLOW_HTTP=1, and then using localhost with http:// and the configured NodePort, it worked.
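The working setup summarised above could be collected into one function, roughly as in the sketch below. The function name minio_storage_options and its host and node_port parameters are hypothetical; the environment variable and option names are copied from this thread and have not been checked against other delta-rs versions.

```python
import os
from typing import Dict


def minio_storage_options(
    region: str, host: str = "localhost", node_port: int = 30000
) -> Dict[str, str]:
    """Build storage options for a MinIO S3 API exposed over plain HTTP.

    Hypothetical helper; key names are taken from the thread.
    """
    # The thread sets these two flags as environment variables.
    os.environ["AWS_S3_ALLOW_UNSAFE_RENAME"] = "true"
    os.environ["AWS_STORAGE_ALLOW_HTTP"] = "1"

    return {
        # Credentials must already be present in the environment.
        "AWS_ACCESS_KEY_ID": os.environ["AWS_ACCESS_KEY_ID"],
        "AWS_SECRET_ACCESS_KEY": os.environ["AWS_SECRET_ACCESS_KEY"],
        "AWS_REGION": region,
        # Explicit http:// scheme plus the NodePort for the S3 API.
        "AWS_ENDPOINT_URL": f"http://{host}:{node_port}",
    }
```

The returned dictionary would be passed as storage_options to DeltaTable, as in the first message. The function raises KeyError if either credential variable is missing.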
🙌 1