Azure Synapse Analytics

by Erwin | Jun 15, 2020

Insights for all

Azure Synapse provides a breathtaking view of your data across data warehouses and big data analytics systems. Bringing these two worlds together into a single service is challenging as it requires unifying similar concepts that operate differently in each world such as security, privacy, and performance. With Azure Synapse, this seamless unification of data warehousing and big data not only simplifies a business’s analytics platform, but also breaks down silos that exist today because of teams, data, and skills. (source Azure blog)

Azure Synapse Analytics Workspace

Azure Synapse Analytics was first announced at Ignite 2019. The first public preview followed at Build 2020.

Immediately after Build 2020, I started exploring the Azure Synapse Analytics workspace.
Fortunately, I had a few days off and could use that free time to dive into Azure Synapse.

A few days later, during the Analytics in a Day workshops that I gave for my employer InSpark in collaboration with Microsoft, I took the opportunity to give a live demo. I found the inspiration for this demo in a YouTube session presented by Simon Whiteley.

For many participants, a live walkthrough of the product is more engaging than a story told through PowerPoint slides.

Upcoming Articles

In the coming days I will try to write a number of articles so that you become more familiar with the various possibilities of Azure Synapse Analytics.

I have the following articles in mind:

✅ Creating your Azure Synapse Analytics Workspace

✅ Exploring the new Azure Synapse Analytics Studio

Creating an Apache Spark Pool

Creating a SQL Pool

Integration with Power BI

If there are other subjects you would like to see explained, feel free to leave them in the comments.

Happy reading!

Azure Data Factory: New functionalities and features

by Erwin | May 22, 2020

New functionalities and features

Last week, a number of great new functionalities and features were added to Azure Data Factory. I would like to walk you through some of them in this post:

Customer key

With this new functionality you can add extra security to your Azure Data Factory environment. Previously, the data was encrypted with a randomly generated key from Microsoft; now you can use the customer-managed key feature, also known as Bring Your Own Key (BYOK). If you use the customer-managed key functionality, the data is encrypted with your key in combination with the ADF system key. You can create your own key or have it generated by the Azure Key Vault API.

You can read more in the article I wrote about this.

Pipeline Consumption Report

Last week, the Pipeline Consumption Report was added to Azure Data Factory.

The report is available for your triggered runs: go to your triggered runs and click on the new icon.

ADFMonitor

The consumption of the selected pipeline is displayed. The data shown covers only this pipeline, not any other pipelines started by it. It would be a nice addition if the report also showed the aggregated consumption of the complete triggered run.

For a debug run, click on the right side of your Output pane:

ADF DEBUG button

ADF DEBUG Report

The ADF consumption report only surfaces Azure Data Factory related units. Other services that you use and access, such as Azure SQL Database, Synapse Analytics, Cosmos DB and ADLS, may bill additional units that are not accounted for here. More details can be found here.

Parameters from Execute Pipeline Activity

Previously, when calling a pipeline you had to add the parameters yourself. Now they are taken over automatically from the pipeline you select. This is very handy and saves time if you use a lot of parameters.

Define a Parameter in one of your Pipelines:

ADF Parameter

Create another pipeline and add the Execute Pipeline activity. On the Settings tab, where you select the pipeline you want to execute, you will discover that the option to add parameters manually is no longer there. Instead, all the parameters you defined in the selected pipeline are shown directly. This is very handy and reduces errors.

Old Situation:

ADF Parameter 3

New Situation:

ADF Parameter Pipeline
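
To make the new behaviour concrete, here is a minimal Python sketch of what the resulting Execute Pipeline activity definition could look like once the parameters are taken over from the selected pipeline. The function name, the pipeline name `PL_Child` and the parameter names are hypothetical, and pre-filling each parameter with its default value is an assumption for illustration, not a description of what the ADF UI does internally.

```python
import json


def build_execute_pipeline_activity(activity_name, child_pipeline_name, child_parameters):
    """Build an Execute Pipeline activity whose parameters mirror the child pipeline.

    child_parameters follows the shape of a pipeline's "parameters" section:
    {"ParamName": {"type": "String", "defaultValue": "..."}}
    """
    return {
        "name": activity_name,
        "type": "ExecutePipeline",
        "typeProperties": {
            "pipeline": {
                "referenceName": child_pipeline_name,
                "type": "PipelineReference",
            },
            # Every parameter defined on the child pipeline is listed.
            # Assumption: pre-fill with the default value, or None if absent.
            "parameters": {
                name: definition.get("defaultValue")
                for name, definition in child_parameters.items()
            },
            "waitOnCompletion": True,
        },
    }


if __name__ == "__main__":
    # Hypothetical parameters defined on the child pipeline.
    child_parameters = {
        "SourceFolder": {"type": "String", "defaultValue": "input"},
        "RunDate": {"type": "String"},
    }
    activity = build_execute_pipeline_activity("RunChild", "PL_Child", child_parameters)
    print(json.dumps(activity, indent=2))
```

Because the parameter list is derived from the child pipeline instead of being typed by hand, a renamed or newly added parameter cannot be forgotten in the calling pipeline.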

General Tab moved to new Properties Pane

The General tab has moved to the right side of the canvas.

ADF Pane General

To edit your properties, click on the pane icon located in the top-right corner of the canvas.

ADF Pane Properties

So these were some nice and useful additions to Azure Data Factory. I am very happy with them. What do you think?

Azure Data Factory: How to assign a Customer Managed Key

Customer key

With this new functionality you can add extra security to your Azure Data Factory environment. Previously, the data was encrypted with a randomly generated key from Microsoft; now you can use the customer-managed key feature, also known as Bring Your Own Key (BYOK). If you use the customer-managed key functionality, the data is encrypted with your key in combination with the ADF system key. You can create your own key or have it generated by the Azure Key Vault API.

Be careful: this new feature can only be enabled on an empty Azure Data Factory environment. Make sure your Azure Data Factory and Azure Key Vault are in the same Azure Active Directory tenant and in the same region. If you use an Azure Landing Zone consisting of different subscriptions, this is also possible, as long as the services exist in the same region.

Follow the steps below to enable this new feature.

I assume that you already have an existing Azure Key Vault. If not, you will have to create one first. You can read how to do that here.
With an existing Azure Key Vault, it is important that you enable the options Soft Delete and Purge protection.

Enable Soft Delete and Purge protection

Purge option

If you want to enable this via PowerShell, use the following commands:

[code lang="ps"]
# Enable Soft Delete on the existing Key Vault
($resource = Get-AzResource -ResourceId (Get-AzKeyVault -VaultName 'YOURKEYVAULTNAME').ResourceId).Properties | Add-Member -MemberType 'NoteProperty' -Name 'enableSoftDelete' -Value 'true'
Set-AzResource -ResourceId $resource.ResourceId -Properties $resource.Properties

# Enable Purge protection on the same Key Vault
($resource = Get-AzResource -ResourceId (Get-AzKeyVault -VaultName 'YOURKEYVAULTNAME').ResourceId).Properties | Add-Member -MemberType 'NoteProperty' -Name 'enablePurgeProtection' -Value 'true'
Set-AzResource -ResourceId $resource.ResourceId -Properties $resource.Properties
[/code]

Define Access policy

The next step is to grant your Data Factory access to Azure Key Vault. Enable the following key permissions: Get, Unwrap Key, and Wrap Key.

ADF Policy

Search for your Data Factory instance and select the correct one:

ADF Principal

Create Key

Once you have done that, it’s time to create your key. Keep in mind that only RSA 2048-bit keys are supported by Azure Data Factory encryption.

Create Keys

A very important step: your key name must consist of letters only. KEYADFNAMECUSTOMER will work, but KEY-ADFNAME-CUSTOMER will not, and you will get an error in your Azure Data Factory instance. It took me a while to figure this out, so this can save you a lot of time.
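
If you create keys from a script, a quick check on the name before you create the key can save a failed attempt. The sketch below is only an illustration of the letters-only rule described above. The function name is hypothetical, and the rule itself comes from my own trial and error with Azure Data Factory, so treat it as an assumption and not as an official Key Vault naming rule.

```python
import re

# Assumption: letters only, as observed in the text above.
# This is not taken from official Key Vault documentation.
_LETTERS_ONLY = re.compile(r"[A-Za-z]+")


def is_safe_adf_key_name(name: str) -> bool:
    """Return True when the key name consists of ASCII letters only."""
    return _LETTERS_ONLY.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("KEYADFNAMECUSTOMER", "KEY-ADFNAME-CUSTOMER"):
        print(candidate, is_safe_adf_key_name(candidate))
```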

After your key is created, copy the Key Identifier.

Assign Customer Key

The last step in this article is to assign the key to your Azure Data Factory Instance.

CMK

Customer key ADF

Paste the copied Key Identifier into your Azure Data Factory instance and save.

Errors

If you get the error “Invalid key Vault URL”:

- Check that Soft Delete and Purge protection are enabled on your Key Vault.

- Check that your key name consists only of letters.

- Check that you granted your Data Factory access to Azure Key Vault.

- Check that Azure Data Factory and Azure Key Vault are in the same Azure Active Directory tenant and in the same region.
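
The checklist above can also be written down as a small preflight check that you run against the values you have collected from the portal. This is a minimal sketch: the `CmkSetup` class, its field names and the region strings are all hypothetical, the code does not call Azure in any way, and it only compares the Data Factory and Key Vault regions (the Azure Active Directory item is not modelled).

```python
from dataclasses import dataclass


@dataclass
class CmkSetup:
    """Hypothetical snapshot of the settings the checklist asks about."""
    soft_delete_enabled: bool
    purge_protection_enabled: bool
    key_name: str
    key_permissions: set  # key permissions granted to the Data Factory
    data_factory_region: str
    key_vault_region: str


# Get, Unwrap Key and Wrap Key, compared case-insensitively.
REQUIRED_KEY_PERMISSIONS = {"get", "unwrapkey", "wrapkey"}


def find_problems(setup: CmkSetup) -> list:
    """Return one message per checklist item that is not satisfied."""
    problems = []
    if not (setup.soft_delete_enabled and setup.purge_protection_enabled):
        problems.append("Enable Soft Delete and Purge protection on the Key Vault.")
    if not (setup.key_name.isascii() and setup.key_name.isalpha()):
        problems.append("Use a key name that consists only of letters.")
    granted = {permission.lower() for permission in setup.key_permissions}
    missing = REQUIRED_KEY_PERMISSIONS - granted
    if missing:
        problems.append(
            "Grant the Data Factory these key permissions: " + ", ".join(sorted(missing))
        )
    if setup.data_factory_region.lower() != setup.key_vault_region.lower():
        problems.append("Put the Data Factory and the Key Vault in the same region.")
    return problems


if __name__ == "__main__":
    setup = CmkSetup(True, False, "KEY-ADFNAME-CUSTOMER", {"Get"}, "westeurope", "northeurope")
    for problem in find_problems(setup):
        print("-", problem)
```

An empty list means none of the known causes applies, so the error probably has another origin.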

 

If you still have errors, please send me a message and I will try to help you out.

Hopefully, this article has helped you to secure your environment.

SQL SERVER KONFERENZ 2020

Date: 4-5 March

Location: KongressCenter Darmstadt

Speaker Dinner

The speaker dinner on Tuesday evening was held in Restaurant Sitte. And what do you eat when you are in Germany? A delicious schnitzel with baked potatoes. It was a nice evening, which we ended in the hotel bar.

Event

The event started on Wednesday with a keynote in which many speakers participated. After the keynote it was time for the sessions. The agenda was very well filled, with 30 sessions on Wednesday and 36 sessions on Thursday, 6 parallel tracks and a wide variety of topics (a mix of English and German sessions). On both days lunch was taken care of to perfection. Wednesday evening ended with a Monster Party.

Thank you

Thanks again to the organisation and the crew, and of course all sponsors because without sponsors it is almost impossible to organise such a big event. As usual, it was another top event and I am already looking forward to next year.

The presentations of SQL SERVER KONFERENZ 2020 and sample code can be found here.

Speaking at SQLBits in London (postponed to September 2020)

SQLBits 2020

SQLBits is the largest Microsoft Data Platform conference in Europe, taking place between 29th September and 3rd October 2020 at ExCeL London.

Proud to be speaking

I am very proud and happy that one of my sessions was selected for SQLBits. It’s not the first time I will go to SQLBits, but it is the first time I will speak there. It will be great to see so many familiar faces, and also to get to know new people from the #sqlfamily. I will be speaking on Friday, October 2nd.

The complete schedule can be found here. Tickets are still available here.

My Session

Session Title:

Azure Key Vault, Azure Dev Ops and Data Factory how do these Azure Services work perfectly together!

Session Details

Can we store our connection strings, Blob Storage keys or other secret values somewhere other than in Azure Data Factory (ADF)? Yes, you can! You can store these valuable secrets in Azure Key Vault (AKV). But how can we achieve this in ADF? And finally, how do we deploy our Data Factories in Azure DevOps to Test, Acceptance and Production environments with these secrets? Can this be set up dynamically? During this session I will answer all of these questions. You will learn how to set up your Azure Key Vault, connect these secrets in ADF and finally deploy these secrets dynamically in Azure DevOps. As you can see, there is a lot to talk about during this session.

Will I see you at ExCeL London?

 

 
