Provision users and groups from AAD to Azure Databricks (part 3)

Jan 19, 2023

In the previous blog, you learned how to sync and assign users and groups to the Enterprise Application. In this blog, you will learn how to create a metastore and assign it to your Azure Databricks workspaces. This is a prerequisite for assigning users and groups to the Azure Databricks workspaces.

In the situation below, we create a metastore that is accessed using a managed identity, which is the recommended setup.

Before you can create a Metastore, you need to create an Azure Databricks access connector, which is a first-party Azure resource that lets you connect a system-assigned managed identity to an Azure Databricks account.

Requirements:

  • You need to have an Azure Databricks account with a Premium Plan.
  • You must be an Azure Databricks account admin.
  • You must have an Azure Data Lake Storage Gen2 storage account (must be in the same region as your Azure Databricks workspace).

Azure Databricks access connector

Log in to the Azure Portal as a Contributor or Owner of a resource group.

[Screenshot: adb-connector]

Search for Access Connector for Azure Databricks in the Marketplace and click Create.

Configure the Connector

[Screenshot: adb-connector-create]

  • Subscription: Select the subscription in which you want to create the access connector.
  • Resource group: This should be a resource group in the same region as the storage account that you will connect to.
  • Name: The name of the connector.
  • Region: Same region as the storage account that you will connect to.

Click Review + create.
When you see the Validation Passed message, click Create.
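
If you prefer to script this step, the access connector can also be created through the ARM REST API. The sketch below is a minimal Python example using azure-identity and requests; the subscription ID, resource group, connector name and api-version are placeholders and assumptions that you should adapt to your own environment.

```python
# Minimal sketch: create an Access Connector for Azure Databricks via the ARM REST API.
# Assumptions: DefaultAzureCredential resolves to a login (e.g. your Azure CLI session)
# with Contributor/Owner rights on the resource group, and the api-version is still current.
import requests
from azure.identity import DefaultAzureCredential

subscription_id = "<subscription-id>"
resource_group = "rg-databricks"        # example name
connector_name = "adb-connector"        # example name
location = "westeurope"                 # same region as your storage account and workspace

token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token

url = (
    f"https://management.azure.com/subscriptions/{subscription_id}"
    f"/resourceGroups/{resource_group}"
    f"/providers/Microsoft.Databricks/accessConnectors/{connector_name}"
)
body = {
    "location": location,
    "identity": {"type": "SystemAssigned"},  # system-assigned managed identity
}

resp = requests.put(
    url,
    params={"api-version": "2023-05-01"},    # assumption: verify the current api-version
    headers={"Authorization": f"Bearer {token}"},
    json=body,
)
resp.raise_for_status()
connector = resp.json()
# If the resource is still provisioning, GET the resource again to read these fields.
print(connector["id"])                       # resource ID, needed when creating the metastore
print(connector["identity"]["principalId"])  # managed identity to grant on the storage account
```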

Grant the managed identity access to the storage account

  • Log in to your Azure Data Lake Storage Gen2 account as an Owner or a user with the User Access Administrator Azure RBAC role on the storage account.
  • Go to Access Control (IAM), click + Add, and select Add role assignment.
  • Select the Storage Blob Data Contributor role and click Next.
  • Under Assign access to, select Managed identity.
  • Click +Select Members, and select Access connector for Azure Databricks.
  • Search for your connector name, select it, and click Review and Assign.

[Screenshot: mi-assign]
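
This role assignment can also be done from code. The following sketch uses the azure-mgmt-authorization package (assuming a recent version whose models support principal_type) to grant the connector's managed identity the built-in Storage Blob Data Contributor role; the storage account scope and the principal ID are placeholders.

```python
# Minimal sketch: grant the access connector's managed identity the
# Storage Blob Data Contributor role on the storage account.
# Assumption: a recent azure-mgmt-authorization version (principal_type supported).
import uuid

from azure.identity import DefaultAzureCredential
from azure.mgmt.authorization import AuthorizationManagementClient
from azure.mgmt.authorization.models import RoleAssignmentCreateParameters

subscription_id = "<subscription-id>"
storage_scope = (
    f"/subscriptions/{subscription_id}/resourceGroups/rg-databricks"
    "/providers/Microsoft.Storage/storageAccounts/<storage-account-name>"
)
principal_id = "<principal-id>"  # identity.principalId of the access connector

# Built-in role "Storage Blob Data Contributor" (well-known GUID, verify in your tenant)
role_definition_id = (
    f"/subscriptions/{subscription_id}/providers/Microsoft.Authorization"
    "/roleDefinitions/ba92f5b4-2d11-453d-a403-e96b0029c9fe"
)

client = AuthorizationManagementClient(DefaultAzureCredential(), subscription_id)
assignment = client.role_assignments.create(
    scope=storage_scope,
    role_assignment_name=str(uuid.uuid4()),  # role assignment names are GUIDs
    parameters=RoleAssignmentCreateParameters(
        role_definition_id=role_definition_id,
        principal_id=principal_id,
        principal_type="ServicePrincipal",   # managed identities are service principals
    ),
)
print(assignment.id)
```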

Create the Metastore

Log in to the Azure Databricks account console.

In the left sidebar, click the Data icon.

Click the “Create a metastore” button.


[Screenshot: create-metastore]

  • Name: The name of the metastore.
  • Region: The region where the metastore will be deployed. This must be the same region as your workspaces, storage account, and access connector.
  • ADLS Gen 2 path: Enter the path to the storage container that you will use as root storage for the metastore.
  • Access Connector ID: Enter the Azure Databricks access connector’s resource ID, which can be found on the overview page of the access connector.

Click Create to create the metastore.

If you see the following error, you forgot to grant the managed identity access to the storage account. You can Force Create the metastore and grant the access afterwards.

[Screenshot: access-violation]
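
After the metastore has been created, you can verify it outside the account console with the Databricks account-level API. The sketch below is a minimal example that authenticates with an Azure AD token for the Azure Databricks resource and lists the metastores in your account; the account ID is a placeholder and the endpoint should be checked against the current account API reference.

```python
# Minimal sketch: list the metastores in your Databricks account to verify the creation.
# Assumptions: you are a Databricks account admin, and DefaultAzureCredential resolves
# to a usable login (for example an Azure CLI session).
import requests
from azure.identity import DefaultAzureCredential

ACCOUNT_ID = "<databricks-account-id>"   # shown in the account console
DATABRICKS_RESOURCE = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"  # Azure Databricks app ID

token = DefaultAzureCredential().get_token(f"{DATABRICKS_RESOURCE}/.default").token

resp = requests.get(
    f"https://accounts.azuredatabricks.net/api/2.0/accounts/{ACCOUNT_ID}/metastores",
    headers={"Authorization": f"Bearer {token}"},
)
resp.raise_for_status()
for metastore in resp.json().get("metastores", []):
    print(metastore["metastore_id"], metastore["name"], metastore["region"])
```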


Assign Workspace to Metastore

The last step is to assign your workspaces to the metastore. On the right side, click Assign to workspace.

You will only see workspaces that have not been assigned to a metastore yet. Select the correct workspace and click Assign.

Enable Unity Catalog, and your workspaces are connected.
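
You can confirm the assignment with the same account-level API. The sketch below reads the metastore assignment of a single workspace; the workspace ID is the numeric ID of your workspace, and the endpoint path is an assumption to verify against the account API documentation.

```python
# Minimal sketch: read the metastore assignment of a workspace to confirm it is attached.
# Assumptions: same authentication as before, and the endpoint path below matches the
# current Databricks account API.
import requests
from azure.identity import DefaultAzureCredential

ACCOUNT_ID = "<databricks-account-id>"
WORKSPACE_ID = "<workspace-id>"          # numeric workspace ID
DATABRICKS_RESOURCE = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"

token = DefaultAzureCredential().get_token(f"{DATABRICKS_RESOURCE}/.default").token

resp = requests.get(
    f"https://accounts.azuredatabricks.net/api/2.0/accounts/{ACCOUNT_ID}"
    f"/workspaces/{WORKSPACE_ID}/metastore",
    headers={"Authorization": f"Bearer {token}"},
)
resp.raise_for_status()
print(resp.json())                       # should contain the metastore_id you just assigned
```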

In my next blog, I will explain how to assign users and groups to an Azure Databricks workspace and define the correct entitlements.

Feel free to leave a comment

