
How do you connect Azure Databricks to an Azure storage account?

1 Answer

  1. Create a Storage Account and a private container in which you will upload a blob file.
  2. Once you upload the blob file, select Generate SAS from the context menu.
  3. Copy the blob SAS Token and save it for future use.
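The SAS token you copy in step 3 is just a query string; appended to a blob URL after a `?`, it authorizes access to that blob. A minimal sketch of that shape, using made-up placeholder values (the account name, container name, and token below are illustrations, not real credentials):

```scala
// Hypothetical values for illustration only; a real token comes from
// the "Generate SAS" option in the Azure portal (step 2 above).
val blobUrl  = "https://mystorageacct.blob.core.windows.net/mycontainer/employe_data.csv"
val sasToken = "sv=2022-11-02&ss=b&srt=co&sp=rl&sig=abc123"

// A SAS-authenticated URL is the blob URL with the token as its query string.
val signedUrl = blobUrl + "?" + sasToken

println(signedUrl)
```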
  4. Create an Azure Databricks workspace. Click Create, then pick your subscription, the resource group, the region where the workspace should be deployed, and the pricing tier.
  5. Click Review + Create, then wait for the validation to complete.
  6. Once your validation is done, click Create.
  7. Once your deployment is complete, click the Go to resource option.
  8. Click on Launch Workspace, and it will redirect you to the Azure Databricks page.
  9. In the left pane, select Clusters, then Create Cluster. Give the cluster a name and select Standard as the cluster mode.
  10. Now you must start your cluster and ensure that it is operational.
  11. In the left pane, right-click Workspace -> Create -> Notebook.
  12. Name the notebook, choose Scala as the default language, select the cluster you created earlier, and click Create.
  13. To connect to your storage account, paste the following code into the notebook. Note that dbutils.fs.mount mounts the container (not an individual file) at a mount point, and the uploaded blob is then read through that mount point:

    val containerName = ""
    val storageAccountName = ""
    val sas = ""

    // Spark config key for SAS access: fs.azure.sas.<container>.<account>.blob.core.windows.net
    val config = "fs.azure.sas." + containerName + "." + storageAccountName + ".blob.core.windows.net"

    // Mount the container root (not the file itself) at /mnt/myfile
    dbutils.fs.mount(
      source = "wasbs://" + containerName + "@" + storageAccountName + ".blob.core.windows.net/",
      mountPoint = "/mnt/myfile",
      extraConfigs = Map(config -> sas))

    // Read the uploaded blob through the mount point
    val mydf = spark.read.option("header", "true").option("inferSchema", "true").csv("/mnt/myfile/employe_data.csv")
    display(mydf)
  14. If the data is displayed, you have successfully connected Azure Databricks to your storage account.
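The two strings the notebook code builds (the Spark config key and the wasbs source URI) follow a fixed pattern, and it is worth sanity-checking them before mounting. A small sketch with hypothetical names (mycontainer and mystorageacct are assumptions, substitute your own):

```scala
// Hypothetical container and storage account names for illustration.
val containerName      = "mycontainer"
val storageAccountName = "mystorageacct"

// Spark config key Databricks expects for SAS-based access:
//   fs.azure.sas.<container>.<account>.blob.core.windows.net
val config = "fs.azure.sas." + containerName + "." + storageAccountName + ".blob.core.windows.net"

// wasbs:// URI passed to dbutils.fs.mount as the source (container root):
//   wasbs://<container>@<account>.blob.core.windows.net/
val source = "wasbs://" + containerName + "@" + storageAccountName + ".blob.core.windows.net/"

println(config)
println(source)
```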
...