During the 'Around the clock Azure SQL and Azure Data Factory' event on Feb 3, 2021, the Azure Data Factory (ADF) Hackathon was kicked off. A recording of this event can be found here.
Winner
I submitted an ADF Pipeline Template, "Scale Dedicated SQL Pool Dynamically using Azure Data Factory control flow", and my submission was marked as WINNER. I am very proud that a simple template that makes it easy to save costs has won. See the full announcement: Announcing the Azure Data Factory Hackathon winners! | LinkedIn
This template helps you scale a Dedicated SQL Pool in Azure Synapse Analytics up and down.
The pipeline is designed to scale a SQL Pool within Azure Synapse Analytics. A SQL Pool (formerly Azure SQL DW) linked to a SQL (Logical) Server takes a slightly different approach (documentation can be found on GitHub).
Scaling a SQL Pool is a necessary part of your Data Movement Solutions: scaling up only while the load runs and scaling down afterwards helps you save and optimize costs.
Documentation of this pipeline can be found on GitHub.
You can also use this template in Azure Synapse. The details can be found on GitHub, and more details can be found in this article.
In case you have unanswered questions please do not hesitate to contact me.
This Saturday I spoke at the Scottish Summit 2021. It was my first Summit, and it was a great event, with more than 400 sessions covering the full Microsoft stack in 7 different languages: English, Spanish, German, French, Italian, Portuguese and Polish. I am proud that I was able to join and present.
Azure Data Factory
I presented a session on whether we can build our Azure Data Factory entirely with parameters based on metadata.
At the beginning of my session the audio wasn't that good. I double-checked my uploaded recording, and there the audio is fine.
This Saturday I spoke at DataSaturday #1 Pordenone, the first ever DataSaturday since PASS was retired. If you want to visit more DataSaturday events, please visit the Data Saturdays event page.
Azure Purview
I presented a session about Azure Purview, Microsoft's answer to Data Governance and Data Lineage.
Scale your Dedicated SQL Pool in Azure Synapse Analytics
In my previous article, I explained how you can Pause and Resume your Dedicated SQL Pool with a Pipeline in Azure Synapse Analytics. In this article I will explain how to scale a SQL Pool up and down via a Pipeline in Azure Synapse Analytics. Scaling is a necessary part of your Data Movement Solutions, because it lets you optimize costs.
The Pipeline can be added before and after your Nightly Run.
As a quick recap from the previous article, a SQL Pool can have different statuses:
Pausing: SQL Pool is Pausing and we cannot change the status.
Resuming: SQL Pool is Resuming; the SQL Pool is starting and during this process we cannot change the status.
Scaling: SQL Pool is Scaling; the SQL Pool is scaling to a different compute level and during this process we cannot change the status.
Paused: SQL Pool is Paused, we can now change the status.
Online: SQL Pool is Online, we can now change the status.
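The status rules above can be summarised in a small sketch. The function name `can_change_status` and the two set names are hypothetical and only serve as illustration; the five status values are the ones from the list above.

```python
# Statuses during which the SQL Pool is busy and cannot be changed (from the list above).
TRANSITIONAL_STATUSES = {"Pausing", "Resuming", "Scaling"}
# Statuses in which we are allowed to pause, resume or scale.
STABLE_STATUSES = {"Paused", "Online"}


def can_change_status(status: str) -> bool:
    """Return True when the SQL Pool is in a state that accepts a new action."""
    if status in STABLE_STATUSES:
        return True
    if status in TRANSITIONAL_STATUSES:
        return False
    # Any other value is unexpected in this sketch; fail loudly rather than guess.
    raise ValueError(f"Unknown SQL Pool status: {status!r}")
```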
To allow the Synapse Workspace to call the REST API, we need to give the Synapse Workspace access to the SQL Pool. In the Access control (IAM) of the SQL Pool, assign the Contributor role to your Synapse Workspace.
Build Pipeline
Clone the Pipeline PL_ACT_RESUME_SQLPOOL and rename it to PL_ACT_SCALE_SQLPOOL.
Change the description of the Pipeline to ‘Pipeline to SCALE a Synapse Dedicated SQL Pool’.
Add the PerformanceLevel parameter (the performance level to scale to) to the Parameters of the Pipeline, next to the existing parameters:
Action: RESUME (leave this on RESUME; if we want to SCALE, the SQL Pool must be Online)
WaitTime: Wait time in seconds before the Pipeline will finish
WaitTimeUntil: Wait time in seconds for the retry process
Synapse_ResourceGroupName: Name of the ResourceGroup of the used Synapse Workspace
SynapseWorkspace: Name of the Synapse Workspace
SynapseDedicatedSQLPool: Name of the dedicated SQL Pool
SubsriptionId: SubscriptionId of Synapse Workspace
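As an illustration of how these parameters fit together, here is a minimal Python sketch that builds the Azure management URL of a dedicated SQL Pool in a Synapse Workspace. The function name `build_sql_pool_url` is hypothetical, and the `api-version` value is an assumption that should be checked against the Azure REST API reference before use.

```python
from urllib.parse import quote

# Assumption: an api-version that supports Microsoft.Synapse sqlPools; verify before use.
API_VERSION = "2021-06-01"


def build_sql_pool_url(subscription_id, resource_group, workspace, sql_pool,
                       api_version=API_VERSION):
    """Build the management URL of a dedicated SQL Pool from the Pipeline parameters."""
    return (
        "https://management.azure.com"
        f"/subscriptions/{quote(subscription_id, safe='')}"
        f"/resourceGroups/{quote(resource_group, safe='')}"
        "/providers/Microsoft.Synapse"
        f"/workspaces/{quote(workspace, safe='')}"
        f"/sqlPools/{quote(sql_pool, safe='')}"
        f"?api-version={api_version}"
    )
```

In the Pipeline itself the same URL would be composed from the parameter values with string interpolation; the sketch only shows which parameter ends up in which URL segment.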
We leave the first two activities as is. The Pipeline can only continue when the status is Paused or Online and not one of the other statuses. When the SQL Pool is Paused, the second activity will Resume the SQL Pool.
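The waiting logic of the first activity can be sketched as a polling loop. This is not the Pipeline definition itself, only an illustration in Python: `get_status` stands for whatever call returns the current status, and `max_attempts` is a hypothetical safety limit that does not exist as a Pipeline parameter.

```python
import time


def wait_until_stable(get_status, wait_time_until=30, max_attempts=20, sleep=time.sleep):
    """Poll get_status() until the SQL Pool is Paused or Online and return that status."""
    for _ in range(max_attempts):
        status = get_status()
        if status in ("Paused", "Online"):
            return status
        # The pool is Pausing, Resuming or Scaling: wait WaitTimeUntil seconds and retry.
        sleep(wait_time_until)
    raise TimeoutError("SQL Pool did not reach Paused or Online in time")
```

When the returned status is Paused, the next step would be the Resume activity; when it is Online, the Pipeline can continue directly.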
To Scale the SQL Pool we need to add a new Web Activity with the following settings:
Headers: Name = Content-Type, Value = application/json
Body = { "sku": { "name": '@{pipeline().parameters.PerformanceLevel}' } }
Resource = https://management.azure.com/
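To make the shape of this request concrete, the following Python sketch builds the same call outside of a Pipeline without sending it. The PATCH method is an assumption based on the Azure REST API for updating a SQL Pool, since the method is not listed above. The function name `build_scale_request` and the `bearer_token` argument are hypothetical; in the Web Activity, authentication against the Resource is handled for you.

```python
import json
import urllib.request


def build_scale_request(url, performance_level, bearer_token):
    """Build (but do not send) the request that scales the SQL Pool."""
    # Same body as the Web Activity: the sku name carries the new performance level.
    body = json.dumps({"sku": {"name": performance_level}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",  # assumption: update call of the management REST API
        headers={
            "Content-Type": "application/json",
            # Token for the resource https://management.azure.com/ (obtained elsewhere).
            "Authorization": f"Bearer {bearer_token}",
        },
    )
```

For example, `build_scale_request(url, "DW200c", token)` prepares a request for the example level DW200c; passing the result to `urllib.request.urlopen` would actually send it.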
Please feel free to download the Pipeline code here.
DAILY RUN
Add the above Pipeline as a Start Pipeline before your Daily run and Scale up to the desired Performance Level. When the Daily run is finished, you can Scale Down to a lower level, or you can add the Pipeline to Pause the SQL Pool.
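The order of these steps can be sketched as follows. The names `run_daily`, `scale` and `run_load` are hypothetical stand-ins for the scale Pipeline and the Daily run, and scaling down even when the load fails is a design choice of this sketch, not something the Pipeline above does by itself.

```python
def run_daily(scale, run_load, high_level, low_level):
    """Scale up, run the load, and scale back down afterwards."""
    scale(high_level)
    try:
        run_load()
    finally:
        # Scale down even if the load fails, so the pool does not stay on the expensive level.
        scale(low_level)
```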
Metadata
If you’re already using a database where you store your metadata, you can create a table where you store the desired Performance Level. The only thing you need to do is add a Lookup Activity to get the parameters from your database and replace the parameters with the output from the Lookup Activity.
[sql]
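-- Stores the desired performance level per database.
-- Assumes the [configuration] schema already exists.
CREATE TABLE [configuration].[Database_Level](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [DatabaseName] [varchar](30) NULL,
    [DatabaseLevel] [varchar](10) NOT NULL,
    [PerformanceLevel] [varchar](10) NOT NULL, -- value for the PerformanceLevel parameter
    -- Primary key named after this table to avoid clashing with other constraints.
    CONSTRAINT [PK_Database_Level] PRIMARY KEY CLUSTERED
    (
        [Id] DESC
    ) WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]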
[/sql]
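How the Lookup output replaces the parameter can be sketched like this. The sketch assumes the Lookup Activity returns the table rows as a list of objects keyed by column name; `resolve_performance_level` and `default_level` are hypothetical names used only for illustration.

```python
def resolve_performance_level(lookup_rows, database_name, default_level):
    """Pick the PerformanceLevel for a database from the Lookup output."""
    for row in lookup_rows:
        if row.get("DatabaseName") == database_name:
            return row["PerformanceLevel"]
    # No metadata row for this database: fall back to the Pipeline parameter value.
    return default_level
```

With this approach, changing the desired level is a matter of updating a row in the table rather than editing the Pipeline.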
A SQL Pool (formerly SQL DW)
A SQL Pool (formerly SQL DW) linked to a SQL (Logical) Server takes a slightly different approach.
Use the settings below to create a Pipeline to Scale the SQL Pool.
Action: RESUME
WaitTime: Wait time in seconds before the Pipeline will finish
WaitTimeUntil: Wait time in seconds for the retry process
SQLServer_ResourceGroupName: Name of the ResourceGroup of the used SQL (Logical) Server
SQLServer: SQL (Logical) Server name
SQLServerDedicatedSQLPool: Name of the dedicated SQL Pool
SubsriptionId: SubscriptionId of the SQL (Logical) Server
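The main difference for this variant is the management URL, which points at a database on a SQL (Logical) Server instead of a SQL Pool in a Synapse Workspace. Here is a minimal sketch of that URL, assuming the Microsoft.Sql resource provider path. The function name `build_logical_server_pool_url` is hypothetical, and the `api-version` value is an assumption to verify against the Azure REST API reference.

```python
from urllib.parse import quote

# Assumption: an api-version that supports Microsoft.Sql databases; verify before use.
SQL_API_VERSION = "2021-11-01"


def build_logical_server_pool_url(subscription_id, resource_group, sql_server, sql_pool,
                                  api_version=SQL_API_VERSION):
    """Build the management URL of a SQL Pool that lives on a SQL (Logical) Server."""
    return (
        "https://management.azure.com"
        f"/subscriptions/{quote(subscription_id, safe='')}"
        f"/resourceGroups/{quote(resource_group, safe='')}"
        "/providers/Microsoft.Sql"
        f"/servers/{quote(sql_server, safe='')}"
        f"/databases/{quote(sql_pool, safe='')}"
        f"?api-version={api_version}"
    )
```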