Azure Synapse Analytics overwrite live mode

by Sep 23, 2021

Stale publish branch

Azure Synapse Analytics and Azure Data Factory have a new option, “Overwrite Live Mode”, which can be found in the Management Hub under Git configuration.

With this new option you can directly overwrite your Azure Synapse Analytics or Azure Data Factory Live Mode code with the current branch from Azure DevOps.

It uses the Publish option to overwrite everything in your Azure Synapse Analytics workspace or Azure Data Factory, so be careful when doing this. If you have a lot of code, the deployment can take a while, depending on the size of the branch and the number of resources.


Once you click on Preview Changes you will see that all your code will be published. You need to confirm by clicking the Overwrite button.


After you click Overwrite, publishing starts.

Why?

Sometimes your Live Mode contains different code than your current Git branch, especially when it comes to Linked Services, Managed VNets, and working with multiple feature branches. Incidentally, this is also the case when you link your code (Solution Templates) from Azure DevOps to your Azure Synapse workspace for the first time. It is then possible that you cannot get this code published because there are still dependencies; in my experience this is mostly caused by the use of Azure Key Vault or a different Integration Runtime setup. The Microsoft documentation, which you can find here, gives the following examples:

  • A user has multiple branches. In one feature branch, they deleted a linked service that isn’t AKV associated (non-AKV linked services are published immediately regardless if they are in Git or not) and never merged the feature branch into the collaboration branch.
  • A user modified the Synapse or data factory using the SDK or PowerShell
  • A user moved all resources to a new branch and tried to publish for the first time. Linked services should be created manually when importing resources.
  • A user uploads a non-AKV linked service or an Integration Runtime JSON file manually. They reference that resource from another resource such as a dataset, linked service, or pipeline. A non-AKV linked service created through the UX is published immediately because the credentials need to be encrypted. If you upload a dataset referencing that linked service and try to publish, the UX will allow it because it exists in the git environment. It will be rejected at publish time since it does not exist in the Synapse or data factory service.

If the publish branch is out of sync with your collaboration branch and contains out-of-date resources despite a recent publish, you can use the solution above.
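
Before you overwrite, it can help to see how far the two sides have drifted apart. The comparison below is only a sketch: the resource names are made up, and how you collect the two lists (for example from the JSON files in your branch and from the workspace itself) is left open.

```python
def compare_resources(git_resources, live_resources):
    """Return the resource names that exist on only one side."""
    git_set = set(git_resources)
    live_set = set(live_resources)
    return {
        "only_in_git": sorted(git_set - live_set),
        "only_in_live": sorted(live_set - git_set),
    }


# Hypothetical resource names, for illustration only.
git_branch = ["pipeline/LoadSales", "dataset/SalesCsv", "linkedService/LS_KeyVault"]
live_mode = ["pipeline/LoadSales", "linkedService/LS_KeyVault", "linkedService/LS_Blob"]

drift = compare_resources(git_branch, live_mode)
print(drift)
```

Anything that shows up on either side is a difference that Overwrite Live Mode is meant to reconcile in favour of the Git branch.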

Conclusion

I used to disconnect my Git configuration, make the changes in Live Mode, reconnect Azure DevOps, and import the resources into my current branch. This option makes it much easier and will definitely save you a lot of time.

If you haven’t yet linked your Azure Synapse workspace to Azure DevOps, read how to do this in a previous blog post.

Hopefully this article has helped you a step further. As always, if you have any questions, leave them in the comments.


11 Comments

  1. corbin

    Hi Erwin,

    Can you confirm what permission is required to “Overwrite Live Mode”, cannot find any docs on this and we have synapse admin RBAC but the button is greyed out

    • Erwin

      Hi Corbin,

      Great question. You need to have the Contributor or Owner role on the Synapse workspace.

      • corbin

        Thanks for the quick reply erwin,

        Im attempting to change the publish branch from workspace_publish to master using the publish_config.json file as instructed by MSFT, surprise… it doesnt work 🙁

        Do you have any experience changing the publish branch

        • Erwin

          Corbin,

          My advice is not to change the workspace_publish branch to main or master. Leave it as is or change it to xxxxx_publish. To do that, add a file named publish_config.json to your collaboration branch (develop) with the desired name.
          Update your collaboration branch and click Publish. Your new publish branch will be created automatically. Alternatively, create a new publish branch in DevOps and assign it in your Synapse workspace through the Git configuration settings. Let me know if this worked well for you.

          • corbin

            Erwin,

            What is the reasoning behind this please?

            I followed bradley balls guide here

            https://techcommunity.microsoft.com/t5/data-architecture-blog/ci-cd-in-azure-synapse-analytics-part-3/ba-p/1993201

            More specifically this comment:


            Hi Chris, You are spot on. I meant to change workspace_publish to main and should have done so on the blog. I need to check with Buck, I do not have the ability to edit the post. My preference is for the JSON generated with the workspace to go straight to main. This can trigger a new build process in a more automated fashion consistent with CICD principals.

            I have managed to change the branch now, it appears the publish_config needs to be in the synapse workspace root folder as defined in git config and not in the repository root as i originally thought

            I am finding issues probably like yourself where the dev/deploy process for synapse is still very much in its infancy and documentation is poor or non-existent

          • Erwin

            Corbin,

            There are more ways to do this. As long as there’s no automated publish functionality for Azure Synapse, we use this approach: we publish Synapse to the workspace_publish branch and create the release pipeline on that branch, with approval gates to test/acceptance and production. This is a different approach from the one we follow for other Azure Data Services, where we work within our develop or main branch and use PRs.
            It all depends on what you’re used to. Bradley’s way is one way to go, but there are many others. I do like the way Bradley is doing it and will definitely have a look into it.
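
The publish_config.json approach discussed in this thread could look roughly like the sketch below. It assumes the file holds a single `publishBranch` key, which is how I understand the Microsoft documentation, and that it belongs in the Synapse root folder of the collaboration branch, as Corbin found. The branch name `synapse_publish` and the temporary folder standing in for a repository checkout are made up for the example.

```python
import json
import tempfile
from pathlib import Path


def write_publish_config(workspace_root, publish_branch):
    """Write publish_config.json into the Synapse root folder of a checkout."""
    config_path = Path(workspace_root) / "publish_config.json"
    content = json.dumps({"publishBranch": publish_branch}, indent=2) + "\n"
    config_path.write_text(content, encoding="utf-8")
    return config_path


# A temporary folder stands in for the Synapse root folder of a local clone.
with tempfile.TemporaryDirectory() as checkout:
    path = write_publish_config(checkout, "synapse_publish")
    written = json.loads(path.read_text(encoding="utf-8"))

print(written)
```

After committing the file to the collaboration branch, the next Publish from the workspace should pick up the new branch name, as described in the thread above.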

  2. corbin

    Erwin,

    Thanks for taking the time to reply, its really helping me and i hope our conversation may help others too..

    I ended up switching the publish branch back to `workspace_publish` due to hitting a permissions error when setting the branch to master, basically it was requiring us to allow push master perms for the user, this would pretty much void all PR policies so we reverted back to the default workspace_publish, i have posted a comment on bradleys blog asking how he handled this.

    I now have the templates successfully deploying to the Dev environment “WINNER!”, i am now getting ready to deploy to the release environment but i have a concern/query.

    I have overridden the ARM template parameters in TemplateParametersForWorkspace.json as required so this all seems fine, however i have noticed in the TemplateForWorkspace.json file we have 346 references to our dev environment, these look to be sqlPool references and the like e.g.

    within the pipelines section under ifTrueActivities there is a reference to the sqlPool for our dev environment see pastebin https://pastebin.com/0QQLbJr2

    question is, does the extension “magically” sort this out or do i need to override each of these params using https://docs.microsoft.com/en-us/azure/synapse-analytics/cicd/continuous-integration-delivery#custom-parameter-syntax

    Do you have any experience with this?

  3. corbin

    Erwin,

    Thank you again for taking the time to discuss this with me, its really appreciated

    I have resolved this by naming the sparkPool and sqlPools the same across all environments, it seems this is the recommended approach,

    https://craigporteous.com/adventures-in-ci-cd-with-azure-synapse-data-toboggan-session/

    The above article/video really helped me with this..

    It seems if you do not have these pools named the same you need to use a custom template file which is very code heavy and not exactly simple..

    I am going to create a blog post documenting my journey and the challenges faced in the hope it will help others, i will reference yours and craigs article as they have really helped me and MSFT documentation misses key points

    Thank you again for you help

    • Erwin

      Great to hear Corbin,

      Yes, you need to have the same names for Spark and SQL pools. If you use a custom Azure IR or a Self-Hosted IR, it must also have the same name across all your environments. If you are going to start using the Data Explorer pool, the same applies.

      Looking forward to your blogpost

  4. corbin

    Hi Erwin,

    Me again

    I am still running into concerns, (though ive not actually tried a release deployment yet)

    In the generated ARM templates there are still references to the dev environment, these are things like notebooks etc..

    The spark pool name is the same across all environments but the resource group and subsequent workspace are still pointed to dev

    example

    "a365ComputeOptions": {
    "id": "/subscriptions/subId/resourceGroups/devcoresynapseuksrg/providers/Microsoft.Synapse/workspaces/devsynws01uks/bigDataPools/zpspk01",
    "name": "zpspk01",
    "type": "Spark",
    "endpoint": "https://devsynws01uks.dev.azuresynapse.net/livyApi/versions/2019-11-01-preview/sparkPools/zpspk01",

    Also keyvault references are dev related also..

    I cant see how it just works this out, but it must do? coz you cant use the same resource group and workspace name across environments

    Am i missing something?
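
To get a list of where such environment-specific values sit in a generated template, a small scan like the one below may help. It is a sketch only: the inline JSON is a trimmed-down stand-in built from the excerpt in the comment above (the surrounding `resources`, `name` and `properties` structure is assumed), and in practice you would load the real TemplateForWorkspace.json with `json.load`. It only finds the references; it does not say whether the deployment replaces them.

```python
import json


def find_references(node, needle, path="$"):
    """Yield (json_path, value) for every string value that contains needle."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_references(value, needle, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_references(value, needle, f"{path}[{index}]")
    elif isinstance(node, str) and needle in node:
        yield path, node


# Stand-in for the generated template; the nesting around a365ComputeOptions is assumed.
template = json.loads("""
{
  "resources": [
    {
      "name": "notebook1",
      "properties": {
        "a365ComputeOptions": {
          "id": "/subscriptions/subId/resourceGroups/devcoresynapseuksrg/providers/Microsoft.Synapse/workspaces/devsynws01uks/bigDataPools/zpspk01",
          "name": "zpspk01",
          "type": "Spark",
          "endpoint": "https://devsynws01uks.dev.azuresynapse.net/livyApi/versions/2019-11-01-preview/sparkPools/zpspk01"
        }
      }
    }
  ]
}
""")

hits = list(find_references(template, "devsynws01uks"))
for json_path, value in hits:
    print(json_path)
```

Searching for the dev workspace name, the resource group name, or the Key Vault name in turn gives a quick inventory of values to check after a release to the next environment.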

