ERWIN & BUSINESS ANALYTICS

How to setup Code Repository in Azure Data Factory

by Nov 5, 2020

Why activate a Git Configuration?

The main reasons are:

  1. Source Control: Ensures that all your changes are saved and traceable, but also that you can easily go back to a previous version in case of a bug.
  2. Continuous Integration and Continuous Delivery (CI/CD): Allows you to Create build and release pipelines for easy release to other Data Factory instance, manually or triggered(DTAP).
  3. Collaboration: You have the ability to easily collaborate in the same Data Factory with different colleagues.
  4. Performance: Your Data Factory from Git is 10 times faster then loading directly from the Data Factory Service.

So enough reasons to start enabling your Git Configuration.

How to setup your Code Repository in Azure Data Factory!

During the configuration/set up of your Data Factory you have the possibility to select either Azure DevOps or GitHub as your Git Configuration. If you haven’t done that, you can still configure this integration in Azure Data Factory. The procedure for both options are the same.
Create ADF Version Control
In my previous article, Creating an Azure Data Factory Instance, I skipped the Git Configuration. In this article I will explain how to do this in an already created Data Factory.

Azure Data Factory Source Control

On the right of your splash screen when opening your Data Factory select the Setup Code Repository. Other options to start configuring your Code Repository are through the Management Hub or in the UX on the top left in the authoring canvas. If you don’t see the option, Code Repository is already configured. You can check this in the Management Hub or UX.

We have the option to configure Azure DevOps or GitHub.

Azure DevOps integration

First I will take you through the configuration of Azure DevOps and then also create a similar configuration in GitHub. If you want to start directly in GitHub, click here.

Select Azure DevOps Git:

Azure Dev Ops Config

  1. Azure Active Directory: Select the AAD where your Azure DevOps environment is located. If you use another AAD, make sure that this account has rights to that environment.
  2. Azure DevOps Account: Select your Account.
  3. Project Name: Select the Project Name where you want to store your repository in.
  4. Git Repository: Create a new Project.
  5. Collaboration Branch: Change this to Main.
  6. Publish Branch: Leave this on adf_publish.
  7. Root folder: If you want to create a complete project with SQL,Azure Analysis Service, Azure DataBricks etc etc, you define a root folder and create your repository into that folder.
  8. Import: When this is a blank Data Factory, you can disable this option. When you have create already resources in your Data Factory, you should enable this so already created resources are committed to the repository.

Click on apply and you will see that you repository is connected.

Repo Connected

When you log in to your Azure Dev Ops Environment, you will see that a new Repository is created Main Branch. Azure Dev Ops Main

Go back to your Data Factory and click on Publish.

Data Factory Publish

In Azure DevOps the adf_publish Branch is now also created.Azure Dev Ops Publish

GitHub Integration

In the repository screen, select GitHub:

Github login

The first time you connect with your Data Factory you need to login in GitHub.

Github authorize

Once connect you to need to Authorize your Data Factory.

Github Configuration

All the settings are almost the same as in Azure DevOps:

  1. Use GitHub Server Enterprise: If enabled fill the The GitHub Enterprise root URL.
  2. GitHub Account: Select your Account.
  3. Project Name: Select the Project Name where you want to store your repository in.
  4. Git Repository: Create a new Project.
  5. Collaboration Branch: Leave this on Main.
  6. Publish Branch: Leave this on adf_publish.
  7. Root folder: If you want to create a complete project with SQL, Azure Analysis Service, Azure DataBricks etc etc, you define a root folder and create your repository into that folder.
  8. Import: When this is a blank Data Factory, you can disable this option. When you have create already resources in your Data Factory, you should enable this so already created resources are committed to the repository.

Click on apply and you will see that you repository is connected.

Repo Connected

Log in to your GitHub, a new Repository is created Main Branch. If you go back to your Data Factory and click on Publish.

Data Factory Publish

In GitHub the adf_publish Branch is now also created.

GitHub Publish

As you can see the Setup for Azure Dev Ops and GitHub are mostly the same. You have now learned how to connect your Data Factory to a Code Repository. You’re now ready to start building your Release and build pipeline’s.

Thanks for reading and in case you have some questions, please leave them in the comments below.

Feel free to leave a comment

0 Comments

Submit a Comment

Your email address will not be published. Required fields are marked *

ten − 6 =

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Azure Synapse Pause and Resume SQL Pool

Pause or Resume your Dedicated SQL Pool in Azure Synapse Analytics Azure Synapse Analytics went GA in beginning of December 2020, with Azure Synapse we can now also create a Dedicated SQL Pool(formerly Azure SQL DW). Please read this document to learn what a Dedicated...

Azure Purview Costs in Public Preview explained

Why are you potentially charged for Azure Purview during the public preview? Since I published my post that Azure Purview started billing as of January 21th, I got a lot of questions how billing was working. UPDATE 27th of February: We are extending the Azure Purview...

Azure SQL Data Warehouse: Reserved Capacity versus Pay as You go

How do I use my Reserved Capacity correctly? Update 11-11-2020: This also applies to Azure Synapse SQL Pools. In my previous article you were introduced, how to create a Reserved Capacity for an Azure SQL Datawarehouse (SQLDW). Now it's time to take a look at how this...

Create an Azure Synapse Analytics Apache Spark Pool

Adding a new Apache Spark Pool There are 2 options to create an Apache Spark Pool.Go to your Azure Synapse Analytics Workspace in de Azure Portal and add a new Apache Spark Pool. Or go to the Management Tab in your Azure Synapse Analytics Workspace and add a new...

Azure Data Factory Naming Conventions

Naming Conventions More and more projects are using Azure Data Factory and Azure Synapse Analytics, the more important it is to apply a correct and standard naming convention. When using standard naming conventions you create recognizable results across different...

Azure Data Factory Let’s get started

Creating an Azure Data Factory Instance, let's get started Many blogs nowadays are about which functionalities we can use within Azure Data Factory. But how do we create an Azure Data Factory instance in Azure for the first time and what should you take into account? ...

Exploring Azure Synapse Analytics Studio

Azure Synapse Workspace Settings In my previous article, I walked you through "how to create your Azure Synapse Analytics Workspace". It's now time to explore the brand new Synapse Studio. Most configuration and settings can be done through the Synapse Studio. In your...

Using Azure Automation to generate data in your WideWorldImporters database

CASE: For my test environment I want to load every day new increments into the WideWorldImporters Azure SQL Database with Azure Automation. The following Stored Procedure is available to achieve this. EXECUTE DataLoadSimulation.PopulateDataToCurrentDate...

How to create a Azure Synapse Analytics Workspace

Creating your Azure Synapse Analytics Workspace In the article below I would like to take you through,  how you can configure an Azure Synapse Workspace and not the already existing Azure Synapse Analytics SQL Pool(formerly Azure SQL DW): In de Azure Portal search for...

Connect Azure Synapse Analytics with Azure Purview

How do you integrate Azure Purview in Azure Synapse Analytics? This article explains how to integrate Azure Purview into your Azure Synapse workspace for data discovery and exploration. Follow the steps below to connect your Azure Purview account in your Azure Synapse...