Azure Purview March Updates

Azure Purview updates

Announcements

Last week at SQLBits, quite a few new updates were announced. I would like to share these announcements with you.

March updates

Support for SAP Business Warehouse (Preview)

Blogpost:

https://techcommunity.microsoft.com/t5/azure-purview-blog/azure-purview-adds-support-for-sap-business-warehouse/ba-p/3253404

Documentation:

https://docs.microsoft.com/en-us/azure/purview/register-scan-sap-bw

Dynamic lineage extraction from Azure SQL Databases (Preview)

Documentation:

https://docs.microsoft.com/en-us/azure/purview/register-scan-azure-sql-database?tabs=sql-authentication#lineagepreview

Certify assets in the Azure Purview data catalog

Blogpost:

https://techcommunity.microsoft.com/t5/azure-purview-blog/certify-assets-in-the-azure-purview-data-catalog/ba-p/3249460

Documentation:

https://docs.microsoft.com/en-us/azure/purview/how-to-certify-assets

Ability to delete child terms when parent term is deleted

Documentation:

https://docs.microsoft.com/en-us/azure/purview/how-to-create-import-export-glossary

Connect to and manage an on-premises SQL server instance in Azure Purview

Documentation:

https://docs.microsoft.com/en-us/azure/purview/register-scan-on-premises-sql-server

Approval workflow for business terms (Preview)

Before you can start authoring workflows, make sure you have added the correct user to the Workflow administrators role assignment. If you have not done that, the option will be greyed out.

Blogpost:

Approval workflow for business glossary

Documentation:

https://docs.microsoft.com/en-us/azure/purview/how-to-workflow-business-terms-approval

Self-service data access workflows for hybrid data estates (Preview)

Documentation:

https://docs.microsoft.com/en-us/azure/purview/how-to-workflow-self-service-data-access-hybrid

Azure integration runtime supports scanning more source types

Azure Purview now supports scanning Snowflake, Salesforce, PostgreSQL, MySQL, Cassandra, and Looker using the managed Azure integration runtime.

Blogpost:

https://techcommunity.microsoft.com/t5/azure-purview-blog/azure-integration-runtime-supports-scanning-more-source-types/ba-p/3254148

Documentation:

https://docs.microsoft.com/en-us/azure/purview/manage-integration-runtimes

Localization

Azure Purview is localized in 18 languages. To change the language, go to Settings in the top bar and select the desired language from the dropdown.

Blogpost:

https://techcommunity.microsoft.com/t5/azure-purview-blog/localization-generally-available-in-azure-purview-studio/ba-p/3249453

Documentation:

https://docs.microsoft.com/en-us/azure/purview/use-azure-purview-studio#localization

My first Virtual session in 2022 for Dataminds

DataMinds

This Tuesday I joined the DataMinds user group to talk about Azure Purview.

 

Migrate Azure Storage to Azure Data Lake Gen2

Migrate Azure Storage to Storage Account with Azure Data Lake Gen2 capabilities

Do you sometimes come across a Storage Account where the Hierarchical namespace is not enabled, or one that is still a general-purpose v1 account? In the tutorial below I describe the steps, which have only recently become available, to perform this migration.

Azure Storage V1

The first step is to check which Account kind is currently deployed. If it is Storage (general purpose v1), we first need to upgrade the Storage Account to V2. If it is already V2, go to the next step.
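The decision above can be sketched as a small helper. This is only an illustration of the tutorial's flow, not an Azure API call: the function name and step labels are hypothetical, and it assumes the account kind is reported as the string "Storage" for general purpose v1 and "StorageV2" for v2.

```python
def plan_migration(account_kind: str, hns_enabled: bool = False) -> list[str]:
    """Return the tutorial steps that still need to run, in order.

    account_kind is assumed to be "Storage" (general purpose v1) or
    "StorageV2"; other kinds are outside the scope of this tutorial.
    """
    if account_kind not in ("Storage", "StorageV2"):
        raise ValueError(f"account kind {account_kind!r} is not covered here")

    steps = []
    if account_kind == "Storage":
        # A v1 account must be upgraded to v2 before anything else.
        steps.append("upgrade account to StorageV2")
    if not hns_enabled:
        # The hierarchical namespace is enabled by the Data Lake Gen2 upgrade.
        steps.append("run Data Lake Gen2 upgrade")
    return steps


print(plan_migration("Storage"))
print(plan_migration("StorageV2"))
```

An account that is already StorageV2 with the Hierarchical namespace enabled yields an empty list, meaning there is nothing left to migrate.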

Click Change and a new window will pop up.

Note: Choosing a storage access tier during account upgrade is free. Changing the storage access tier after the upgrade operation may result in changes to your bill.

Select the access tier you want to migrate to, and then start the upgrade.

When the upgrade is successful, you will see that the Account kind is now StorageV2. We can now continue to the next step.

Azure Storage V2

To start the migration, click Data Lake Gen2 upgrade in the taskbar, or click ‘Disabled’ next to the Hierarchical namespace property in the Blob service properties.

The Migration window will open and we can start with step 1.

Take note of the unsupported features and functionalities.

Agree to the implications of upgrading to Azure Data Lake Storage. Once this step is done, we can continue with step 2, the validation.

If the validation succeeds, you can start the upgrade in step 3. If it fails, check the errors: download the error.json file to see which blobs are failing. In most cases the cause is an unsupported functionality or an incompatible feature.

{
  "startTime": "2021-08-04T18:40:31.8465320Z",
  "id": "45c84a6d-6746-4142-8130-5ae9cfe013a0",
  "incompatibleFeatures": [
    "Blob Delete Retention Enabled"
  ],
  "blobValidationErrors": [],
  "scannedBlobCount": 0,
  "invalidBlobCount": 0,
  "endTime": "2021-08-04T18:40:34.9371480Z"
}
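To read such a report without opening it by hand, a few lines of Python are enough. The sketch below assumes the file is valid JSON (straight double quotes) with the field names shown in the sample above. The helper name and the rule "no incompatible features and no blob errors means ready to upgrade" are my own assumptions, not documented behaviour.

```python
import json


def summarize_validation(report_text: str) -> dict:
    """Summarize a Data Lake Gen2 validation report (error.json)."""
    report = json.loads(report_text)
    features = report.get("incompatibleFeatures", [])
    blob_errors = report.get("blobValidationErrors", [])
    return {
        # Assumption: the upgrade can proceed only when both lists are empty.
        "can_upgrade": not features and not blob_errors,
        "incompatible_features": features,
        "invalid_blob_count": report.get("invalidBlobCount", 0),
    }


# Same content as the sample report above.
sample = """
{
  "startTime": "2021-08-04T18:40:31.8465320Z",
  "id": "45c84a6d-6746-4142-8130-5ae9cfe013a0",
  "incompatibleFeatures": ["Blob Delete Retention Enabled"],
  "blobValidationErrors": [],
  "scannedBlobCount": 0,
  "invalidBlobCount": 0,
  "endTime": "2021-08-04T18:40:34.9371480Z"
}
"""

summary = summarize_validation(sample)
for feature in summary["incompatible_features"]:
    print("Disable before upgrading:", feature)
```

For the sample report, the only blocker listed is the blob delete retention setting on the account.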

 

The upgrade will take a while; the duration mostly depends on how much data needs to be migrated.

At the end of the process you will notice that the Hierarchical namespace is now enabled and cannot be changed anymore.

Post Migration

Create new linked services in Azure Data Factory and Azure Synapse Analytics to make sure that you use the DFS file system.

Point any other application to the correct endpoint.

Test, test, and test all your workloads to make sure everything works as expected.

Start by migrating your development Storage Account and test all the workloads before you migrate your production Storage Account.
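For the endpoint change mentioned above, the usual difference is the host name: the Blob endpoint uses blob.core.windows.net, while the Data Lake Gen2 (DFS) endpoint uses dfs.core.windows.net. A minimal sketch of rewriting a URL follows; the helper name and the account name mystorage are hypothetical, and connection strings or SDK clients may need other changes as well.

```python
from urllib.parse import urlsplit, urlunsplit


def to_dfs_endpoint(url: str) -> str:
    """Rewrite a Blob endpoint URL to the matching DFS endpoint URL."""
    parts = urlsplit(url)
    if ".blob." not in parts.netloc:
        # Already a DFS URL, or not a Blob endpoint at all: leave unchanged.
        return url
    dfs_host = parts.netloc.replace(".blob.", ".dfs.", 1)
    return urlunsplit(parts._replace(netloc=dfs_host))


print(to_dfs_endpoint("https://mystorage.blob.core.windows.net/raw/sales/2021.csv"))
```

Only the host is touched, so the container and path stay the same.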

 

As always, if you have questions, leave them in the comments or send me a message.

Useful links

Upgrade to a general-purpose v2 storage account

Upgrade Azure Blob Storage with Azure Data Lake Storage Gen2 capabilities

My Virtual session DataWeekender 4.2

DataWeekender 4.2

This Saturday I joined the Van and spoke at DataWeekender.

Azure Purview

I presented a session on Azure Purview, Microsoft's answer to Data Governance and Data Lineage.

You can find my slides below on Slideshare:

 

As always, if you have any questions, please feel free to contact me via the comments or social media.

Enable Pattern Rules in Azure Purview

How can I enable Pattern Rules?

Pattern Rules

Last night I was preparing for a demo with Azure Purview. As always, I walk through all the activity hubs to see if there are any new options. This time I noticed that the Pattern Rules option was greyed out.

Resource Set

To enable Pattern Rules, you need to turn on the Advanced Resource Sets option in the Management activity tab.

Resource sets were already present in my Purview account, which was created before August 19th, so it was a surprise to me that the pattern rules were greyed out.
My demo Purview account was created after August 19th, and there are differences between the two versions in the available options and features. You can read what changed in Azure Purview after August 19th in my previous blog post.

Once you have enabled this feature, the Azure Purview team recommends waiting an hour before scanning new Data Lake data. After scanning your Data Lake data, manually or on a schedule, you will see the Resource Sets.

When the Advanced Resource Sets feature is on, asset and classification insights only update twice a day (every 12 hours).

More details on how to create Resource Set Pattern Rules can be found here.

Costs

When you have enabled the Advanced Resource Sets feature, you will be charged €0.18 per vCore hour (free in preview). Billing for processing the resource set data assets is serverless and based on the duration of the processing, which can vary with the change in partitioned files and the resource set profile configured.
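To get a feel for the numbers, the rate above can be turned into a rough estimate. This is only back-of-the-envelope arithmetic: the vCore count and processing duration in the example are made-up inputs, and the function name is hypothetical. Actual billing depends on what the service measures.

```python
RATE_EUR_PER_VCORE_HOUR = 0.18  # rate quoted above; effectively 0 while in preview


def estimate_cost(vcores: float, hours: float,
                  rate: float = RATE_EUR_PER_VCORE_HOUR) -> float:
    """Estimate the Advanced Resource Sets processing cost in euros."""
    return round(vcores * hours * rate, 2)


# Hypothetical run: 4 vCores busy for 2.5 hours.
print(estimate_cost(4, 2.5))
```

Passing rate=0 models the preview period, during which the feature is free.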

If you have any questions regarding the above, please let me know.