Hello everyone, can Delta Sharing be used inside Azure ML notebooks, either via pandas or PySpark?
Kris Geusebroek
04/05/2023, 6:46 AM
Well, I suppose so. It is just HTTP requests based on the Delta Sharing config, so if the networking allows you to reach the endpoint, you're good to go.
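For reference, a minimal sketch of what this could look like in a notebook cell, assuming the `delta-sharing` Python package is installed in the kernel. The helper names (`read_profile`, `table_url`, `load_table`) and the share/schema/table names in the usage note are hypothetical; `delta_sharing.load_as_pandas` and `delta_sharing.load_as_spark` are the client's loader functions, and the Spark path also assumes a Spark session with the Delta Sharing connector available.

```python
import json
from pathlib import Path


def read_profile(profile_path):
    """Parse a Delta Sharing profile file (JSON holding the endpoint and bearer token)."""
    profile = json.loads(Path(profile_path).read_text())
    missing = {"endpoint", "bearerToken"} - profile.keys()
    if missing:
        raise ValueError(f"profile is missing fields: {sorted(missing)}")
    return profile


def table_url(profile_path, share, schema, table):
    """Build the '<profile-file>#<share>.<schema>.<table>' string the client expects."""
    return f"{profile_path}#{share}.{schema}.{table}"


def load_table(profile_path, share, schema, table, as_spark=False):
    """Load a shared table as a pandas DataFrame, or a Spark DataFrame if as_spark is set."""
    # Imported lazily so the helpers above work even where the package is absent.
    import delta_sharing

    url = table_url(profile_path, share, schema, table)
    if as_spark:
        # Assumes a Spark session with the Delta Sharing connector on its classpath.
        return delta_sharing.load_as_spark(url)
    return delta_sharing.load_as_pandas(url)
```

Something like `load_table("config.share", "my_share", "my_schema", "my_table")` would then return a pandas DataFrame, provided the notebook can reach both the sharing endpoint and the underlying storage.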
Jim Hibbard
04/05/2023, 7:18 AM
Yes, Kris is right! Assuming there are no network issues, you should be good to go.
HQ
04/05/2023, 10:36 PM
Thanks for that. I've been trying to make Delta Sharing work via Azure ML but seem to be running into a 403 "this request is not authorized" error. I've given the Azure ML compute instance's public IP network access on the ADLS storage and pointed it to the "config.share" file that has the auth token. The same token works when I run Delta Sharing in a Python Poetry project, where I can load the tables as both pandas and Spark DataFrames. Do you folks have any suggestions on what could be happening on Azure ML?
Jim Hibbard
04/05/2023, 11:25 PM
I'm assuming this is at work, so correct me if that's wrong. I'd probably start by sending an email/ticket to your networking/firewall team to see if there's something set up to block this traffic before further troubleshooting. It's kinda tricky to give better advice remotely, unfortunately, since this is more of a runtime question than a delta-sharing question.
HQ
04/06/2023, 12:04 AM
Hi Jim, thanks for such a detailed response. I understand it's hard to troubleshoot remotely. In our use case, Azure ML can get to the storage, and the Delta Sharing POST request returns a 200 (please see attached screen). But the 403 error pops up when it tries to retrieve the storage contents (please refer to attached screen). So I don't think this is a networking issue?
Hello Jim, I can confirm that it was indeed a network/firewall issue, thank you for your help
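A 200 from the sharing server followed by a 403 on the data fetch fits a network/firewall cause. In the Delta Sharing protocol the table query returns pre-signed URLs, and the client then downloads the files directly from cloud storage, which is a separate host with its own network rules. Below is a rough diagnostic sketch using only the standard library to check the two hops separately. The function names are hypothetical, and it assumes the default response format (newline-delimited JSON with `file` entries) and that an empty JSON body is accepted by the query endpoint.

```python
import json
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def query_url(endpoint, share, schema, table):
    """URL of the table query endpoint on the sharing server."""
    return f"{endpoint.rstrip('/')}/shares/{share}/schemas/{schema}/tables/{table}/query"


def file_urls(ndjson_text):
    """Extract pre-signed file URLs from the newline-delimited JSON query response."""
    urls = []
    for line in ndjson_text.splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        if "file" in entry:
            urls.append(entry["file"]["url"])
    return urls


def storage_hosts(urls):
    """Distinct storage hostnames the client will have to reach."""
    return sorted({urlsplit(u).hostname for u in urls})


def http_status(request):
    """Return (status, body) and treat HTTP error codes as results, not exceptions."""
    try:
        with urllib.request.urlopen(request, timeout=30) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, err.read()


def diagnose(profile_path, share, schema, table):
    """Print the status of the sharing-server hop and of one storage fetch."""
    with open(profile_path) as fh:
        profile = json.load(fh)
    request = urllib.request.Request(
        query_url(profile["endpoint"], share, schema, table),
        data=b"{}",  # assumption: no predicate or limit hints
        headers={
            "Authorization": f"Bearer {profile['bearerToken']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    status, body = http_status(request)
    print("sharing server:", status)
    if status != 200:
        return
    urls = file_urls(body.decode("utf-8"))
    print("storage hosts:", storage_hosts(urls))
    if urls:
        # Ask for a single byte so the check stays cheap.
        probe = urllib.request.Request(urls[0], headers={"Range": "bytes=0-0"})
        status, _ = http_status(probe)
        print("storage fetch:", status)
```

If the first status is 200 and the storage fetch is 403, the hostnames printed in between are the ones to raise with the networking/firewall team.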
Jim Hibbard
04/06/2023, 4:38 AM
Awesome! I'm glad you were able to track it down so quickly. Hopefully this saved you some time.