Microsoft Purview pricing is changing!

Month: October 2024

Microsoft Purview’s New Pay-As-You-Go Pricing Model

UPDATE November 1, 2024

The pricing change has been postponed to January 6, 2025.

Starting November 1, 2024, Microsoft Purview is set to introduce a new pay-as-you-go pricing model for its Data Governance and Data Security capabilities. This update is designed to extend the benefits of Microsoft Purview beyond Microsoft 365, allowing organizations to manage costs more effectively by paying only for the resources they use.

What’s New?

Switching to this new model brings several enhanced features and capabilities:

  • Enhanced Data Security Features: Now available for non-Microsoft 365 environments, these features include classification, labeling, and protection, ensuring robust security across various platforms.
  • Redesigned Data Governance Solution: This includes new capabilities such as:
    • Easy-to-Use, Business-Friendly Data Catalog: Simplifies data discovery and management for business users.
    • Top-Notch Data Quality and Health Management: Ensures high data quality and maintains the health of your data assets.
    • Built-In Governance Controls: Provides integrated controls to help manage and enforce data governance policies effectively.

Next Steps

Data Governance Customers

To take advantage of the new capabilities when they become available in your region, you need to consent to switch to the pay-as-you-go model by October 31, 2024. If you do not provide consent by this date, you will remain on the classic pricing model and lose access to the new Data Governance solution after November 2, 2024.

Data Security Customers

Starting November 1, 2024, the pay-as-you-go features for non-Microsoft 365 data in Insider Risk Management and Information Protection will transition from free to a paid preview. To continue using these features, you must consent to switch to the new model before February 28, 2025. If you do not consent by this date, you will lose access to these features, and any protection applied to non-Microsoft 365 data sources will be removed.

Pay-As-You-Go Billing Model

For organizations that operate in multi-cloud environments, the pay-as-you-go billing model offers greater flexibility. This model extends Microsoft Purview’s capabilities beyond Microsoft 365 to include environments such as Azure, AWS, GCP, Box, and Dropbox. The pay-as-you-go model charges based on actual usage, allowing organizations to scale their usage up or down as needed, providing cost efficiency and flexibility.

This model utilizes two types of meters:

  • Asset-Based Meter: This meter counts non-Microsoft 365 items, such as servers, tables, or files.
  • Processing Unit-Based Meter: This meter measures the compute units used for data security and governance tasks.

The Microsoft Purview Data Catalog uses the new pricing model with two meters, which are billed based on:

  • Number of unique governed assets per day
  • Data Management processing units per run

More details on what a Governed Asset is, and on processing units, can be found in the pricing concepts documentation listed under Links.
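To make the two meters more concrete, here is a minimal sketch of how an estimate could be put together. The rates below are placeholders invented purely for illustration and are not Microsoft's prices; the names `MeterRates` and `estimate_monthly_cost` are hypothetical as well. Replace the rates with the values from the official pricing page before drawing any conclusions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeterRates:
    """Placeholder rates, NOT real Microsoft Purview prices."""
    per_governed_asset_per_day: float
    per_processing_unit: float


def estimate_monthly_cost(daily_governed_assets, processing_units_per_run, rates):
    """Add up both meters for one billing period.

    daily_governed_assets: number of unique governed assets, one entry per day.
    processing_units_per_run: Data Management processing units, one entry per run.
    """
    asset_cost = sum(daily_governed_assets) * rates.per_governed_asset_per_day
    processing_cost = sum(processing_units_per_run) * rates.per_processing_unit
    return round(asset_cost + processing_cost, 2)


# Assumed example values: 100 governed assets on each of 30 days, three runs.
PLACEHOLDER_RATES = MeterRates(per_governed_asset_per_day=0.01, per_processing_unit=1.0)
daily_assets = [100] * 30
runs = [2.0, 3.5, 1.5]

estimate = estimate_monthly_cost(daily_assets, runs, PLACEHOLDER_RATES)
print(f"Estimated cost for the period: {estimate}")
```

The point of the sketch is only the shape of the calculation: one term that grows with the number of governed assets per day, and one term that grows with the processing units consumed per run.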

Consent and Subscription

Existing Azure Purview customers need to provide consent to switch to the pay-as-you-go model. New customers can link their Azure subscription to start using these features immediately. This ensures a seamless transition and integration with existing Azure services.

Conclusion

Microsoft Purview’s billing models are designed to provide flexibility and scalability, catering to the unique needs of different organizations. Whether you are heavily invested in Microsoft 365 or operate across multiple cloud environments, Microsoft Purview offers a billing model that can help you manage your data governance and security efficiently.

By understanding these billing models, organizations can make informed decisions that align with their operational and financial goals, ensuring robust data governance and security in an ever-evolving digital landscape.

You now have some guidelines for estimating the pricing. As soon as the new pricing model starts, I will try to put together a calculation example that you can use as a reference for your own organization.

Links

Microsoft Purview Data Catalog

Microsoft Purview Data Catalog billing consent

Microsoft Purview data governance pricing concepts

Microsoft Purview data governance pricing announcement

High Concurrency for Notebooks in Pipelines with Microsoft Fabric

Month: October 2024

How to Use and Enable High Concurrency for Notebooks in Pipelines with Microsoft Fabric

High Concurrency Mode for Notebooks in Pipelines is a game-changer for data engineers and data scientists using Microsoft Fabric. This feature allows multiple notebooks to share a single Spark session, significantly improving performance and reducing costs. Another advantage is that Microsoft Fabric no longer runs into its capacity limits simply because every notebook starts a new session. In one of my other blog posts I explained how you could solve this with notebookutils.notebook.runMultiple.

Here’s how you can enable and use this feature effectively.

Why Use High Concurrency Mode?

High Concurrency Mode offers several benefits:

  • Faster Session Start: Notebooks can attach to pre-warmed Spark sessions, reducing startup time to around 5 seconds.
  • Cost Savings: By sharing a single Spark session across multiple notebooks, you only pay for one session, which can lead to significant cost reductions.
  • Improved Efficiency: This mode optimizes pipeline execution, making it faster and more efficient.

Enabling High Concurrency Mode

To enable High Concurrency Mode in your Fabric workspace, follow these steps:

  1. Access Workspace Settings:
    • Go to your Fabric workspace and select the Workspace Settings option.
  2. Navigate to High Concurrency Settings:
    • In the settings menu, go to the Data Engineering and Science section.
    • Select Spark Compute and then High Concurrency.
  3. Enable High Concurrency:
    • In the High Concurrency section, enable the option For pipeline running multiple notebooks.
    • Save your changes.

Enable High Concurrency in Workspace

Once enabled, all notebook sessions triggered by pipelines will be packed into high concurrency sessions automatically.

Using High Concurrency Mode

After enabling High Concurrency Mode, you can start using it in your pipelines:

  1. Create a Pipeline:
    • Open your Fabric workspace and create a new pipeline item from the Create menu.
  2. Add Notebook Activities:
    • Navigate to the Activities tab and add a Notebook activity to your pipeline.

      Create Pipeline with Notebook Activity

  3. Configure Session Tags:
    • In the advanced settings of the notebook activity, specify a session tag. This tag helps group notebooks into shared sessions based on matching criteria.

      Enable session tag on Notebook

Session Tags

When you define a session tag, the notebook uses a shared session. Session tags can be used across pipelines but not across workspaces: in another workspace a new session is created even if you use the same session tag. Think of a session tag as a kind of grouping. You can define the session tag yourself or add dynamic content. Be aware that a session tag can only contain letters, numbers, and underscores.
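Because of that character restriction, it can help to check or clean a tag before it ends up in the notebook activity, especially when it is built from dynamic content. The sketch below is only an illustration: the helper names are hypothetical, and it assumes that "letters" means ASCII letters, which the source does not state.

```python
import re

# Assumption: only ASCII letters, digits, and underscores are accepted.
_TAG_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_session_tag(tag: str) -> bool:
    """Return True when the tag contains only letters, numbers, and underscores."""
    return _TAG_PATTERN.fullmatch(tag) is not None


def to_session_tag(raw: str) -> str:
    """Replace every run of disallowed characters with a single underscore."""
    cleaned = re.sub(r"[^A-Za-z0-9_]+", "_", raw).strip("_")
    if not cleaned:
        raise ValueError(f"cannot derive a session tag from {raw!r}")
    return cleaned


print(is_valid_session_tag("nightly_load"))   # a tag that follows the rule
print(is_valid_session_tag("nightly load"))   # the space is not allowed
print(to_session_tag("sales-load 2024.10"))   # cleaned-up version of a free-form name
```

Notebooks that should share a session need exactly the same tag, so cleaning the value in one place avoids near-duplicates such as `sales-load` and `sales_load` ending up in different sessions.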

Monitoring

In the Monitor you now see every executed notebook individually. This was not the case with notebookutils.notebook.runMultiple(DAG), where you only saw the main notebook. This is a great step forward when building monitoring solutions.

Below is an overview in the Monitor before the session started:

Notebook Execution before session started

Below is an overview in the Monitor when the session started:

Notebook Execution when session started

Overview of all the executed notebooks:

Notebook Execution when session was finished

The Notebook name is extended with the Livy id.

Remark: It currently looks like the snapshots of the notebooks are incorrect, because every notebook execution shows the snapshot of the first notebook. Debugging from the Monitor is therefore not yet possible. I have already reported this to the PM team.

RunMultiple

With notebookutils.notebook.runMultiple(DAG) you have some more options:

  • Define any dependency or order among the notebooks.
  • Define timeouts per cell.
  • Run multiple notebooks in a DAG, where each notebook can depend on the output of one or more previous notebooks.
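As a rough illustration of these options, the sketch below builds a small DAG as a Python dictionary and checks its dependencies locally. The notebook names are made up, and the field names (`activities`, `path`, `timeoutPerCellInSeconds`, `dependencies`, `timeoutInSeconds`, `concurrency`) are written down as I understand them from the NotebookUtils documentation, so verify them there before relying on this. The `execution_order` helper is hypothetical and not part of Fabric; it is only there to catch typos and cycles before the DAG is submitted.

```python
from typing import Any


def build_dag() -> dict[str, Any]:
    """Three hypothetical notebooks that run one after another."""
    return {
        "activities": [
            {
                "name": "load_raw",              # unique activity name
                "path": "nb_load_raw",           # hypothetical notebook name
                "timeoutPerCellInSeconds": 120,  # timeout per cell
                "args": {"run_date": "2024-10-01"},
            },
            {
                "name": "transform",
                "path": "nb_transform",
                "timeoutPerCellInSeconds": 300,
                "dependencies": ["load_raw"],    # waits for load_raw
            },
            {
                "name": "publish",
                "path": "nb_publish",
                "dependencies": ["transform"],   # waits for transform
            },
        ],
        "timeoutInSeconds": 3600,  # timeout for the whole DAG
        "concurrency": 2,          # maximum notebooks running at the same time
    }


def execution_order(dag: dict[str, Any]) -> list[str]:
    """Return one valid run order; raise ValueError on bad or cyclic dependencies."""
    activities = dag["activities"]
    deps = {a["name"]: list(a.get("dependencies", [])) for a in activities}
    if len(deps) != len(activities):
        raise ValueError("activity names must be unique")
    for name, required in deps.items():
        for dep in required:
            if dep not in deps:
                raise ValueError(f"{name} depends on unknown activity {dep}")
    order: list[str] = []
    remaining = dict(deps)
    while remaining:
        # Activities whose dependencies have all been scheduled already.
        ready = [n for n, req in remaining.items() if all(d in order for d in req)]
        if not ready:
            raise ValueError("dependencies contain a cycle")
        for name in ready:
            order.append(name)
            del remaining[name]
    return order


dag = build_dag()
order = execution_order(dag)
print(order)

# Inside a Fabric notebook, where notebookutils is available, the DAG would
# then be submitted with:
# notebookutils.notebook.runMultiple(dag)
```

The local check is optional, but it fails fast on a misspelled dependency instead of failing halfway through a run.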

Conclusion

High Concurrency Mode for Notebooks in Pipelines with Microsoft Fabric is a powerful feature that enhances performance, reduces costs, and improves efficiency. By following the steps outlined above, you can easily enable and start using this feature to optimize your data engineering and data science workflows. Personally, I’m very happy with this new functionality: it makes it easier to define outputs for every notebook for logging purposes.

More details can be found in the official Fabric blog post.