Scale SQL Database dynamically with Metadata

Scale SQL Database dynamically with Metadata

SQL Pool

by Erwin | May 5, 2021

Scale SQL Database Dynamically with Metadata

Use this template to scale up and down an Azure SQL Database in Azure Synapse Analytics or in Azure Data Factory.

This article describes a solution template how you can Scale up or down a SQL Database within Azure Synapse Analytics or Azure Data Factory dynamically based on metadata. This is actually a necessary functionality during your Data Movement Solutions. In this way you can optimize costs and gain more performance during batch loading. The Pipeline can be added before and after your Nightly Run.

The template contains 8 activities:

  • Lookup Activity Get the necessary metadata from a table in your configuration database.
  • Until Activity to check a set of activities in a loop until the condition associated with the activity evaluates to true.
    • Web Activity activity which will check the current Status of the SQL Pool
    • Wait Activity activity which will wait before retry to check the Status of the SQL Database
  • If Condition Activity Activity to check if the SQL Database is Online
    • Web Activity Activity to Resume the SQL Database(Serverless only) if not Online
    • Wait Activity Activity to wait before to go to the next activity
  • Web Activity Activity to Scale the SQL Database up or down to the desired DatabaseLevel

Pipeline Parameters:

Parameter Value Description
WaitTime 10 Wait time in seconds before the Pipeline will finish
WaitTimeUntil 30 Wait time in seconds for the retry process
DatabaseLevel S1

The Database Service Objective Name

https://docs.microsoft.com/en-us/azure/azure-sql/database/resource-limits-vcore-single-databases

https://docs.microsoft.com/en-us/azure/azure-sql/database/resource-limits-dtu-single-databases

DatabaseName Datastore The Database Name

How to use this solution template

Create a control table in Azure SQL Database to store the Metadata.

[NOTE] > The table and stored procedure can be stored in any database, but preferred in a database where you store all your configuration in.

[sql]
CREATE TABLE [configuration].[Environment_Parameter1](
	[ParameterId] [int] IDENTITY(1,1) NOT NULL,
	[ParameterName] [varchar](128) NOT NULL,
	[ParameterValue] [nvarchar](max) NOT NULL,
	[Description] [nvarchar](max) NULL,

CONSTRAINT [PK_Environment_Parameter1] PRIMARY KEY CLUSTERED
    (
    	[ParameterId] ASC
    )WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
    ) ON [PRIMARY]

INSERT [configuration].[Environment_Parameter] ( [ParameterName], [ParameterValue], [Description]) VALUES (N'yourResourceGroupName', N'', N'ResourceGroupName of your Azure Synapse or ADF Instance')
GO
INSERT [configuration].[Environment_Parameter] ( [ParameterName], [ParameterValue], [Description]) VALUES (N'SubscriptionId', N'XXXXXXXX', N'SubscriptionId of your Azure Synapse or ADF Instance')
GO
INSERT [configuration].[Environment_Parameter] ( [ParameterName], [ParameterValue], [Description]) VALUES (N'SQLServer', N'yoursqlserver', N'Name of your SQL Server( Needed for scaling databases)')
GO
[/sql]
[sql]
CREATE PROCEDURE [configuration].[Environment]
    @ColumnToPivot  NVARCHAR(255),
    @ListToPivot    NVARCHAR(max)
    AS
      /**********************************************************************************************************
    * SP Name:		[configuration].[[Environment]]
    *
    * Purpose:		Procedure display record parameters for environment Settings
    *
    *
    * Revision Date/Time:
    *  2020-12-01		Erwin de Kreuk (InSpark) - Initial creation of SP
    *
    **********************************************************************************************************/
    BEGIN

      DECLARE @SqlStatement NVARCHAR(MAX)
      SET @SqlStatement = N'
        SELECT * FROM (
          SELECT

           [ParameterName] ,
           [ParameterValue]
          FROM [configuration].[Environment_Parameter]    ) EnvironmentTable
        PIVOT
        (max([ParameterValue])
          FOR ['+@ColumnToPivot+']
          IN ('+@ListToPivot+' )    ) AS PivotTable
      ';

      EXEC(@SqlStatement)

    END
[/sql]

After you have imported the Template you will see the following:

[NOTE] > Azure Synapse has no import functionality, create a new pipeline PL_ACT_SCALE_SQLDATABASE and copy the code into the pipeline. Once the pipeline is created manualy link the correct linked service for your Metadata table

Template-Scale-SQL-Database

Create a connection to the database where your metadata tables is stored. Followed by use this template.

Lookup Activity Name = Get SQL Server Name

SQL-Database-Lookup-actvity

Source Dataset = Linked Services to your Metadata Table

Stored Procedures = configuration.environment

Parameters:

ColumnToPivot= ParameterName

ListToPivot= [ResourceGroupName],[SubscriptionId],[SQLServer]

SQL-Database-lookup-preview

Until Check DatabaseStatus

Until Activity We can only change the DatabaseLevel when the SQL Database is Paused or Online. That’s why we need to add an Until activity to check for these statusses.

Web Activity Within the Until Activity we need to create a new Web Activity.

Web-Activity

Name = Check for Database Status

URL= https://management.azure.com/subscriptions/XXX/resourceGroups/XXX/providers/Microsoft.Sql/servers/XXX/databases/XXX/?api-version=2019-06-01-preview

Replace the XXX with Pipeline Parameters.

https://management.azure.com/subscriptions/@{activity('Get SQL Server Name').output.firstRow.SubscriptionId}/resourceGroups/@{activity('Get SQL Server Name').output.firstRow.ResourceGroupName}/providers/Microsoft.Sql/servers/@{activity('Get SQL Server Name').output.firstRow.SQLServer}/databases/@{pipeline().parameters.DatabaseName}/?api-version=2019-06-01-preview

Method = GET

Resource =https://management.azure.com/

After we have created the Web Activity, we can define the expression for the Until Activity.

Until-expression-SQL-Database

The Pipeline can only continue when the current status is not scaling. We can check this by comparing the currentServiceObjectiveName and the requestedServiceObjectiveName.

Expression: @equals(activity('Check for Database Status').Output.Properties.currentServiceObjectiveName,activity('Check for Database Status').Output.Properties.requestedServiceObjectiveName)

Time out: 0.00:20:00

The Until Activity will only continue, when the status from the above Web Activity output is equal, this can take a while and we don’t want to execute the Web Activity every time. That’s why we add a Wait Activity.

Wait Activity

A Wait Activity waits for the specified period of time before continuing with execution of subsequent activities.

Azure Synapse Wait Activity

Check for the SQL Database Status (Serverless Only)

If Condition Activity (Name: Check if Database is Paused). When is SQL Database is Paused, we need to Resume

Expression: @bool(startswith(activity('Check for Database Status').Output.Properties.status,'Paused'))

Web Activity In case the SQL Database is Paused we need to Resume.

URL: https://management.azure.com/subscriptions/XXX/resourceGroups/XXX/providers/Microsoft.Sql/servers/XXX/databases/XXX/{Action}?api-version=2019-06-01-preview

The XXX are replaced with the output from Lookup activity.

https://management.azure.com/subscriptions/@{activity('Get SQL Server Name').output.firstRow.SubscriptionId}/resourceGroups/@{activity('Get SQL Server Name').output.firstRow.ResourceGroupName}/providers/Microsoft.Sql/servers/@{activity('Get SQL Server Name').output.firstRow.SQLServer}/databases/@{activity('Get SQL Server Name').output.firstRow.DatabaseName}/Resume?api-version=2019-06-01-preview

It is almost the same URL we used in the First Web Actvity but have to add the action option Resume.

Method = Post

Header = {“Nothing”:”Nothing”}

Resource =https://management.azure.com/

Wait Activity the purpose of this activity is to wait a period before we start ingestion data(just to be sure the SQL Database is online)

Expression: @pipeline().parameters.WaitTime

SCALE SQL Database

Web Activity "SCALE SQL Database"

SQL-Database-Scale-Header

Name = SCALE SQL Database

URL= https://management.azure.com/subscriptions/XXX/resourceGroups/XXX/providers/Microsoft.sql/servers/XXX/databases/XXX/?api-version=2019-06-01-preview

The XXX are replaced with the output from Lookup activity.

https://management.azure.com/subscriptions/@{activity('Get SQL Server Name').output.firstRow.SubscriptionId}/resourceGroups/@{activity('Get SQL Server Name').output.firstRow.ResourceGroupName}/providers/Microsoft.Sql/servers/@{activity('Get SQL Server Name').output.firstRow.SQLServer}/databases/@{pipeline().parameters.DatabaseName}/?api-version=2019-06-01-preview

Method = PATCH

Headers = Name = Content-Type Value= application/json

Body = { “sku”: { “name”: ‘@{pipeline().parameters.DatabaseLevel}’ } }

Resource =https://management.azure.com/

Important

To allow Azure Synapse Analytics or Azure Data Factory to call the REST API we need to give the Synapse/ADF access to the SQL Database/Server. In the Access control (IAM) of the SQL Server assign the SQL Contributor role to Synapse/ADF.

Role-SQL-Contributor

Debug

Select Debug, enter the Parameters,  define the correct DatabaseLevel and DatabaseName to Scale and then select Finish.

SQL-Database-pipeline-debug

When the pipeline run completes successfully, you will see the result similar to the following example:

SQL-Database-Run

You can now call this pipeline from every other pipeline, you only need to change the DatabaseLevel and DatabaseName.

You have now learned how to Scale your SQL Database Dynamically with the use of Metadata.

Please feel free to download the Pipeline code here for Azure Synapse Analytics and for here for Azure Data Factory

Hopefully this article has helped you a step further. As always, if you have any questions, leave them in the comments.

Feel free to leave a comment

Proud to be an ADF Hackathon winner!

Proud to be an ADF Hackathon winner!

SQL Pool

by Erwin | Mar 15, 2021

Azure Data Factory Hackathon

Around the clock

During the event 'Around the clock Azure SQL and Azure Data Factory' event on Feb 3, 2021, they kicked-off the Azure Data Factory (ADF) Hackathon. Recording of this event can be found here.

Winner

I submitted a ADF Pipeline Template "Scale Dedicated SQL Pool Dynamically using Azure Data Factory control flow" and my submission was marked as WINNER. I am very proud that a simple template where you can easily save costs has won. See full post of the announcement Announcing the Azure Data Factory Hackathon winners! | LinkedIn

This template will help you can to scale up and down a Dedicated SQL Pool in Azure Synapse Analytics.

The pipelines is designed to Scale a SQL Pool within Azure Synapse Analytics. A SQL Pool(Former Azure SQL DW) linked to a SQL (Logical) Server has a slightly different approach(documentation can be found on Github).

Scaling a SQL Pool is actually a necessary functionality during your Data Movement Solutions, it will help to save and optimize your costs.

ADF Synapse SQL Pool Scale

Documentation of this pipeline can be found on GitHub.

You can also use this template in Azure Synapse and the details can also be found on Github more details can also be found in this Article.

 

 

In case you have unanswered questions please do not hesitate to contact me.

Feel free to leave a comment

Scale your SQL Pool dynamically in Azure Synapse

Scale your SQL Pool dynamically in Azure Synapse

Scale your Dedicated SQL Pool in Azure Synapse Analytics

In my previous article, I explained how you can Pause and Resume your Dedicated SQL Pool with a Pipeline in Azure Synapse Analytics. In this article I will explain how to scale up and down a SQL Pool via a Pipeline in Azure Synapse Analytics. This is actually a necessary functionality during your Data Movement Solutions. In this way you can optimize costs.

The Pipeline can be added before and after your Nightly Run.

SQL Pool Performance Level

As a quick resume from the previous article, a SQL Pool can have different statuses:

  • Pausing: SQL Pool is Pausing and we cannot change the status.
  • Resuming: SQL Pool is Resuming, the SQL Pool starting and during this  process and we cannot change the status.
  • Scaling: SQL Pool is Scaling, the SQL Pool is scaling to a different compute level and during this  process we cannot change the status.
  • Paused: SQLPool is Paused, we can now change the status.
  • Online: SQLPool is Online, we can now change the status.

To allow the Synapse workspace to call the REST API we need to give the Synapse workspace access to the SQL Pool. In the Access control (IAM) of the SQL Pool assign the contributor role to your Synapse Workspace.

Build Pipeline

Clone the Pipeline PL_ACT_RESUME_SQLPOOL and rename it to PL_ACT_SCALE_SQLPOOL.

Change the description of the Pipeline, ‘Pipeline to SCALE a Synapse Dedicated SQL Pool

Synapse SQL Pool Description

Add the PerformanceLevel parameter to the Parameters of the Pipeline:

Synapse SQL Pool Scale Parameters

Action:   RESUME(Leave this on RESUME, if we want to SCALE the SQL Pool must be Online)

WaitTime: Wait time in seconds before the Pipeline will finish

WaitTimeUntil: Wait time in seconds for the retry process

Synapse_ResourceGroupName: Name of the ResourceGroup of the used Synapse Workspace

SynapseWorkspace: SynapseWorkspace

SynapseDedicatedSQLPool: Name of the dedicated SQL Pool

SubsriptionId: SubscriptionId of Synapse Workspace

PerformanceLevel:  The Database Performance level (DW100c, DW200c, DW300c, DW400c DW500c, DW1000c, DW1000c, DW1500c, DW2000c, DW2500c, DW3000c, DW5000c, DW6000c, DW7500c, DW10000c, DW15000c, DW30000c)

Web Activity

Synapse SQL Pool Scale pipeline

We leave the first two activities as is. The Pipeline can only continue when the status is Paused or Online and not one of the other statuses. When the SQL Pool is Paused, the second activity will Resume the SQL Pool.

To Scale the SQL Pool we need add a new Web Activity.

Synapse SQL Pool Scale Header

Name = SCALE SQLPOOL

URL= https://management.azure.com/subscriptions/XXX/resourceGroups/XXX/providers/Microsoft.Synapse/workspaces/XXX/sqlPools/XXX/?api-version=2019-06-01-preview

The  <xxx> we need to replace with the Pipeline Parameters. The final Result will be:

https://management.azure.com/subscriptions/@{pipeline().parameters.SubscriptionID}/resourceGroups/@{pipeline().parameters.Synapse_ResourceGroupName}/providers/Microsoft.Synapse/workspaces/@{pipeline().parameters.SynapseWorkspace}/sqlPools/@{pipeline().parameters.SynapseDedicatedSQLPool}/?api-version=2019-06-01-preview

Method = PATCH

Headers =  Name = Content-Type     Value= application/json

Body = { “sku”: { “name”: ‘@{pipeline().parameters.PerformanceLevel}’ } }

Resource =https://management.azure.com/

Please feel free to download the Pipeline code here.

DAILY RUN

Add the above Pipeline as a Start Pipeline before your Daily run and Scale up to the desired Performance Level. When the Daily run is finished you Scale Down to a lower level or can you add the Pipeline to Pause the SQL Pool.

Synapse SQL Pool Scale RUN1

Metadata

If you’re already using a database where you store your Meta Data, you can create a table where you store the desired Performance Level The only thing you need to do is adding a Lookup Activity to get the parameters from your database and replace the parameters with the output from the lookup activity.

[sql]
CREATE TABLE [configuration].[Database_Level](
[Id] [int] IDENTITY(1,1) NOT NULL,
[DatabaseName] [varchar](30) NULL,
[DatabaseLevel] [varchar](10) NOT NULL,
[PerformanceLevel] [varchar](10) NOT NULL,
CONSTRAINT [PK_Pipeline_ExecutionLog] PRIMARY KEY CLUSTERED
(
[Id] DESC
)WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]

[/sql]

Performance Database table

Synapse SQL Pool MetaData

A SQL Pool(Former SQL DW)

A SQL Pool(Former SQL DW) linked to a SQL (Logical) Server has a slightly different approach.

Use the settings below to create a Pipeline to Scale the SQL Pool.

Action:   RESUME

WaitTime: Wait time in seconds before the Pipeline will finish

WaitTimeUntil: Wait time in seconds for the retry process

SQLServer_ResourceGroupName: Name of the ResourceGroup of the used SQL(Logical) Server

SQLServer: SQL(Logical) Server name

SQLServerDedicatedSQLPool: Name of the dedicated SQL Pool

SubsriptionId: SubscriptionId of Synapse Workspace

DatabaseTier:  The Database Performance level (DW100c, DW200c, DW300c, DW400c DW500c, DW1000c, DW1000c, DW1500c, DW2000c, DW2500c, DW3000c, DW5000c, DW6000c, DW7500c, DW10000c, DW15000c, DW30000c)

Body: { “requestedServiceObjectiveName”: { “name”: ‘@{pipeline().parameters.PerformanceLevel}’ } }

 

Hopefully this article has helped you a step further. As always, if you have any questions, leave them in the comments.

Create an Azure Synapse Analytics SQL Pool

Create an Azure Synapse Analytics SQL Pool

Adding a new SQL Pool

There are 2 options to create a SQL Pool. Go to your Synapse Workspace in de Azure Portal and add a new SQL Pool.

Create SQL Pool

Or go to the Management Tab in your Azure Synapse Workspace and add a new Pool.

SQL Pool Azure Synapse Studio

Creating a new SQL Pool

Create Azure Synapse SQL Pool

SQL Pool Name (SQL pool name should not be greater than 15 characters) and select the correct Performance Level. Be careful in selecting the Performance Level. DW 1000c will cost you  € 12,73 per hour excluding storage. My advice is to start small and scale up when necessary. More about DWc units can be found here.

Additional settingsAzure Synapse SQL Pool

If you want to use Data from an existing pool, select Backup or Restore Point. Otherwise select none

Use existing data-Backup

SQL Pool Backup

Use existing data-Restore Point

SQL Pool Restore Point 

Use existing data-None

Review Azure Synapse SQL Pool

Review your settings and your SQL Pool will be created.

Remarks

Be aware that if you already have another Azure SQL DW running, you cannot select or add it to your Azure Synapse Workspace.
Adding an SQL pool is only possible via the Azure Synapse Workspace(Azure Portal or Studio).

If you install it as a standalone component through the Azure portal, the SQL Pool cannot be added to the Azure Synapse Workspace.

Pause SQL Pool

And if you don’t use your SQL pool please PAUSE it. It will save you a lot of money.