Data Sharing Lineage in Microsoft Purview

by Mar 13, 2023

In my previous blog post, I wrote how you can share data within your organization or across organizations. Now it’s time to have a look how the lineage will look like.

In this article I will explain the Microsoft Purview Data Sharing Lineage and not the Lineage for Azure Data Share. This can be found here.

Data lineage is the ability to track the flow of data from its source to its destination, including any transformations or processing that occur along the way. Lineage is important for several reasons. First, it can help businesses ensure the accuracy and quality of their data. By tracing the lineage of a particular piece of data, businesses can identify any errors or inconsistencies that may have been introduced during processing.

Lineage is also important for compliance and regulatory purposes. Businesses may be required to track the lineage of certain types of data in order to comply with regulations or to demonstrate the integrity of their data.

Data share assets discovery

Data share assets can now be discovered in the Microsoft Purview Catalog. The Data share asset label is as of today available as a new filter option.

The Data share assets include sent share and received share assets and users can see the properties such as share metadata, owners, contact information, etc. 

Azure Active Directory (AAD) tenant assets can be discovered in the catalog for all the tenants the current user tenant has sent or received data shares, to see the tenant-level Data Sharing Lineage. 

As you can see when browsing all the assets, you will discover 2 new types over here, Azure Active Directory and  Share.

Asset overview

Data Share Lineage

Data Sharing lineage aims to provide detailed information for root cause analysis and impact analysis.

Some common scenarios include:

  • Full view of datasets shared in and out of your organization
  • Root cause analysis for upstream dataset dependencies
  • Impact analysis for shared datasets

Lineage overview from the Data Receive Share view. As you can see, there is a new asset “Azure Active Directory Tenant”, in this case you will see from which tenant the data is coming from.

Below you see an overview of the lineage where we created the Share and to which tenant we shared the data to. As you can see, the lines of the AAD tenant are opposite of each other, so you can clearly see what is being shared and what is the receiving location.

Conclusion

Microsoft Purview’s lineage capabilities are a powerful tool for businesses that need to track the flow of their data. By providing a complete view of data lineage, Purview can help businesses ensure the accuracy and integrity of their data, comply with regulatory requirements, and improve the efficiency of their data processing workflows.

Thanks for reading and like always, if you have any questions leave them in the comments.

Documentation Links as reference:

How to share data in Microsoft Purview

How to receive shared data in Microsoft Purview

Microsoft Purview Data Sharing FAQ

Microsoft Purview Data Sharing Lineage

Feel free to leave a comment

0 Comments

Submit a Comment

Your email address will not be published. Required fields are marked *

thirteen − 11 =

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Connect Azure Databricks to Microsoft Purview

Connect and Manage Azure Databricks in Microsoft Purview This week the Purview team released a new feature, you’re now able to Connect and manage Azure Databricks in Microsoft Purview. This new functionality is almost the same as the Hive Metastore connector which you...

Migrate Azure Storage to Azure Data Lake Gen2

Migrate Azure Storage to Storage Account with Azure Data Lake Gen2 capabilities Does it sometimes happen that you come across a Storage Account where the Hierarchical namespace is not enabled or that you still have a Storage Account V1? In the tutorial below I...

Exploring Azure Synapse Analytics Studio

Azure Synapse Workspace Settings In my previous article, I walked you through "how to create your Azure Synapse Analytics Workspace". It's now time to explore the brand new Synapse Studio. Most configuration and settings can be done through the Synapse Studio. In your...

Change your Action Group in Azure Monitoring

Change a Action GroupPrevious Article In my previous artcile I wrote about how to create Service Helath Alerts. In this article you will learn how to change the Action Group to add, change or Remove members(Action Group Type Email/SMS/Push/Voice) Azure Portal In the...

Azure DevOps and Azure Feature Pack for Integration Services

Azure Feature Pack for Integration ServicesAzure Blob Storage A great addition for SSIS is using extra connectors like  Azure Blob Storage or Azure Data Lake Store which are added by the Azure Feature Pack. This Pack needs to be installed on your local machine. Are...

Provision users and groups from AAD to Azure Databricks (part 2)

Assign and Provision users and groups in the Enterprise Application In the previous blog you learned how to configure the Enterprise Application. In this blog, you will learn how to assign and Provision Users and Groups. Once the Users and groups are assigned to the...

Scale SQL Database dynamically with Metadata

Scale SQL Database Dynamically with Metadata Use this template to scale up and down an Azure SQL Database in Azure Synapse Analytics or in Azure Data Factory. This article describes a solution template how you can Scale up or down a SQL Database within Azure Synapse...

Azure Data Factory Let’s get started

Creating an Azure Data Factory Instance, let's get started Many blogs nowadays are about which functionalities we can use within Azure Data Factory. But how do we create an Azure Data Factory instance in Azure for the first time and what should you take into account? ...

Azure Synapse Analyics costs analyis for Integration Runtime

AutoResolveIntegrationRuntime! The last few days I've been following some discussions on Twitter on using a separate Integration Runtime in Azure Synapse Analytics running in the selected region instead of auto-resolve. The AutoResolveIntegrationRuntime is...

Azure Data Factory: Generate Pipeline from the new Template Gallery

Last week I mentioned that we could save a Pipeline to GIT. But today I found out that you can also create a Pipeline from a predefined Solution Template.Template Gallery These template will make it easier to start with Azure Data Factory and it will reduce...