Azure Purview announcements and new functionalities

by Aug 19, 2021

This week the Azure Purview product team added some new functionality: Azure Synapse data lineage, a better Power BI integration and the introduction of the Elastic Data Map. New connectors were also added (these arrived during my holiday). Slowly we are on our way to GA status, and on September 28, 2021 there will be a digital event. Please find some of the announcements in detail below.

New connectors in Azure Purview

Over the past period the Azure Purview team has worked hard and has already added quite a few new connectors, such as ERWIN, Looker, Cassandra and Google BigQuery.

This week it was time for some new functionalities.

Azure Synapse Analytics Data Lineage:

This functionality currently only works for the copy activity, but the first step has been made. For lineage from Azure Data Factory you had to create the link in Azure Purview; for lineage from Azure Synapse it is the other way around: you create the link to Azure Purview in Azure Synapse. I described how to create this link a couple of months ago in one of my posts, which can be found here.

Some known limitations on copy activity lineage based on the docs.

Currently, if you use the following copy activity features, the lineage is not yet supported:

  • Copy data into Azure Data Lake Storage Gen1 using Binary format.
  • Copy data into Azure Synapse Analytics using PolyBase or COPY statement.
  • Compression setting for Binary, delimited text, Excel, JSON, and XML files.
  • Source partition options for Azure SQL Database, Azure SQL Managed Instance, Azure Synapse Analytics, SQL Server, and SAP Table.
  • Source partition discovery option for file-based stores.
  • Copy data to file-based sink with setting of max rows per file.
  • Add additional columns during copy.
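To check an existing pipeline against this list, a small script can help. The sketch below covers only part of the list (PolyBase/COPY, source partition options, additional columns and max rows per file). It assumes the copy activity is available as a Python dict that mirrors the pipeline JSON, and the property names (`allowPolyBase`, `allowCopyCommand`, `partitionOption`, `additionalColumns`, `maxRowsPerFile`) are my assumption of how these options appear in that JSON, so verify them against your own pipeline definition. The function name is hypothetical.

```python
def unsupported_lineage_features(copy_activity):
    """Return the reasons why lineage may not be reported for a copy activity.

    Assumption: `copy_activity` is a dict shaped like the activity JSON,
    with 'source' and 'sink' under 'typeProperties'.
    """
    props = copy_activity.get("typeProperties", {})
    source = props.get("source", {})
    sink = props.get("sink", {})
    reasons = []

    # Copy into Azure Synapse Analytics using PolyBase or the COPY statement
    if sink.get("allowPolyBase") or sink.get("allowCopyCommand"):
        reasons.append("sink uses PolyBase or COPY statement")
    # Source partition options (SQL sources, SAP Table)
    if source.get("partitionOption") not in (None, "None"):
        reasons.append("source partition option is set")
    # Additional columns added during copy
    if source.get("additionalColumns"):
        reasons.append("additional columns are added during copy")
    # File-based sink with max rows per file
    if sink.get("formatSettings", {}).get("maxRowsPerFile"):
        reasons.append("max rows per file is set on the sink")
    return reasons


# Made-up example activity, for illustration only
example = {
    "name": "CopyToSynapse",
    "type": "Copy",
    "typeProperties": {
        "source": {"additionalColumns": [{"name": "LoadDate", "value": "2021-08-19"}]},
        "sink": {"allowCopyCommand": True},
    },
}

for reason in unsupported_lineage_features(example):
    print(reason)
```

An empty result does not guarantee lineage, because the compression and Binary format limitations are set on the dataset and are not checked here.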

In addition to lineage, the data asset schema (shown in the Asset -> Schema tab) is reported for the following connectors:

  • CSV and Parquet files on Azure Blob, Azure File Storage, ADLS Gen1, ADLS Gen2, and Amazon S3
  • Azure Data Explorer, Azure SQL Database, Azure SQL Managed Instance, Azure Synapse Analytics, SQL Server, Teradata

Power BI

The Power BI integration now supports automated discovery of columns, measures and data types in Power BI.

To enable this functionality you must enable the following settings on the Power BI tenant settings page (be aware that you need to be a Power BI admin):

Allow service principals to use read-only Power BI admin APIs.

To use this setting, create a security group or use an existing one and add your Purview account to this security group.

Enhance admin APIs responses with detailed metadata
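Purview performs the scan for you, so you never have to call these APIs yourself. Still, it can help to see what kind of read-only admin request the two settings unlock. The sketch below only builds the request and does not send it. It assumes the Power BI admin "scanner" endpoint `workspaces/getInfo` with its `datasetSchema` and `datasetExpressions` query options, which is my understanding of the Power BI REST API and not something this announcement confirms. The function name and workspace ID are hypothetical, and authentication is left out.

```python
from urllib.parse import urlencode

# Assumption: base URL of the Power BI admin REST API
ADMIN_API = "https://api.powerbi.com/v1.0/myorg/admin"


def build_scan_request(workspace_ids, detailed_metadata=True):
    """Build (url, body) for a workspace metadata scan request.

    With detailed_metadata=True the request asks for dataset schema
    (columns, measures, data types) and expressions, which is the part
    that the 'detailed metadata' tenant setting is assumed to govern.
    """
    flag = "True" if detailed_metadata else "False"
    query = urlencode({
        "lineage": "True",
        "datasourceDetails": "True",
        "datasetSchema": flag,
        "datasetExpressions": flag,
    })
    url = f"{ADMIN_API}/workspaces/getInfo?{query}"
    body = {"workspaces": list(workspace_ids)}
    return url, body


# Placeholder workspace ID, for illustration only
url, body = build_scan_request(["00000000-0000-0000-0000-000000000000"])
print(url)
print(body)
```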

Elastic data map in Azure Purview

All Purview accounts created after August 18, 2021 are now created with the new Elastic data map concept. With this new concept your Purview account comes by default with one capacity unit and grows elastically based on usage. Each Data Map capacity unit includes a throughput of 25 operations/sec and a metadata storage limit of 2 GB. So when you are not using Purview, you are no longer paying for the previous default of 4 capacity units.


The Data Map is billed on an hourly basis. You are billed for the maximum number of Data Map capacity units needed within the hour. At times you may need more operations/second within the hour, and this will increase the number of capacity units needed within that hour. At other times your operations/second usage may be low, but you may still need a large volume of metadata storage. In that case the metadata storage is what determines how many capacity units you need within the hour. Please read the documentation for a more detailed explanation and some examples.
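The hourly calculation can be sketched in a few lines. This is a minimal sketch based only on the numbers above (25 operations/sec and 2 GB per capacity unit). Rounding up to whole units and a minimum of one unit are my assumptions, and the function name is hypothetical, so please treat the official documentation as leading.

```python
import math

OPS_PER_CU = 25        # operations/sec included in one capacity unit
STORAGE_GB_PER_CU = 2  # GB of metadata storage included in one capacity unit


def capacity_units_needed(peak_ops_per_sec, metadata_storage_gb):
    """Capacity units for one hour: the larger of the throughput and storage needs.

    Assumptions: partial units are rounded up and an account always has
    at least one capacity unit.
    """
    by_throughput = math.ceil(peak_ops_per_sec / OPS_PER_CU)
    by_storage = math.ceil(metadata_storage_gb / STORAGE_GB_PER_CU)
    return max(1, by_throughput, by_storage)


print(capacity_units_needed(10, 1))  # quiet hour, little metadata -> 1
print(capacity_units_needed(60, 1))  # busy hour, throughput decides -> 3
print(capacity_units_needed(5, 7))   # quiet hour, storage decides -> 4
```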

All existing Azure Purview accounts will be migrated to the Elastic data map concept in September/October.

The big question that remains open is what exactly a capacity unit costs. For the time being, during the preview, it is still free, as can be read on the updated pricing page of Azure Purview.

More clarity about pricing and about when Azure Purview goes to GA is likely to come during the event on September 28. You can register for this event via the link below.

EVENT=>Achieve unified data governance with Azure Purview

 


 

As always, in case you have any questions, please feel free to contact me.
