Speaking(Virtual) at SQL Saturday #963 Denmark

by Erwin | Sep 10, 2020 | Events, SQL Saturday

SQL Saturday 963 Denmark

PASS SQLSaturday is a free training event for professionals who use the Microsoft data platform. These community events offer content across data management, cloud and hybrid architecture, analytics, business intelligence, AI, and more.

My first virtual event

I like to interact during my session, so I’m curious if that will work. Last week I recorded my session for SQL Bits and that is quite strange when you look back. I am really looking forward to it, my session starts at 14:30.

The complete schedule can be found here. Are there is still time to register!!

My Session

Session Title:

Azure Key Vault, Azure Dev Ops and Data Factory how do these Azure Services work perfectly together!

Session Details

Can we store our Connectionstrings or BlobStorageKeys or other Secretvalues somewhere else then in Azure Data Factory(ADF)? Yes you can! You can store these valuable secrets in Azure Key Vault(AKV). But how can we achieve this in ADF? And finally how do we deploy our DataFactories in Azure Dev Ops to Test, Acceptance and Production environments with these Secrets ? Can this be setup dynamically? During this session I will give answers on all of these questions. You will learn how to setup your Azure Key Vault, connect these secrets in ADF and finally deploy these secrets dynamically in Azure Dev Ops. As you can see a lot to talk about during this session.

Do I see you on Saturday 26th of September?

Speaking(Virtual) at SQL Saturday #963 Denmark

SQL Saturday 963 Denmark PASS SQLSaturday is a free training event for professionals who use the Microsoft data platform. These community events offer content across data management, cloud and hybrid architecture, analytics, business intelligence, AI, and more. My…

My Virtual session DataWeekender 4.2

DataWeekender 4.2 This Saturday I’ve joined the Van and Spoke at DataWeekender Azure Purview I presented a session on Azure Purview Microsoft’s answer to Data Governance and Data Lineage You can find my slides below on Slideshare: Data weekender4.2 azure purview erwin…

Data:Scotland

Data: Scotland 2022 Microsoft Purview Scotland’s Data Community Conference happened this year again in Glasgow. This years event was happening in a sunny Glasgow, more then 400 attendees and more then 50 sessions. It was great to see so many people live again. I…

My Virtual Session Scottish Summit 2021

Scottisch Summit 2021 This Saturday I’ve been speaking during Scottisch Summit 2021. It was my first Summit, but is was a great event, with more than 400 sessions covering the full Microsoft Stack in 7 different language English, Spanish, German, French, Italian,…

Data Toboggan January 2022 (Video)

Recording of my session during the Data Toboggan January 2022 conference. Dealing with different roles in Azure Synapse Analytics

My First Blog Post SQLSatHolland 790

Yeah, m y first blog is LIVE After a good talk with Reza Rad from RADACAD during SQLSatHolland, I decided to start my first blog! Knowledge sharing is very important, it gives me a lot of energy. But it also gives others people in the community energies to pick up new…

Techoroma NL 2019

Azure Key Vault, Azure Dev Ops and Azure Data Factory how do these Azure Services work perfectly together! Some impressions during Techorama NL 2019.

My Session at TechoramaNL 2019

TechoramaNL 2019 Date: 2 th October Location: Pathé Ede From September 30 th to October 2 nd TechoramaNL was held in the Netherlands for the 2nd time. TechoramaNL is the largest deep knowledge IT conference in the Netherlands with 1500 participants and 120 sessions in…

My Virtual session at Data Toboggan

An inaugural event specializing on Azure Synapse Analytics Data Toboggan This Saturday I’ve been speaking during Data Toboggan an inaugural event specializing on Azure Synapse Analytics. 12 Hours of sessions with amazing speakers. Azure Purview I presented a session…

Data Saturday Stockholm

Data Saturday Stockholm Bra så alla tillbaka personligen This means great so everyone back in person. Last weekend I had the honor to speak at Data Saturday in Stockholm. About one of my favourite topics Azure Synapse Analytics. It was nice to see everyone in person…

Parameterize Linked Services in ADF

by Erwin | Jul 9, 2020 | Azure, Azure Data Factory, Azure Synapse Analytics

Azure

by Erwin | Jul 9, 2020

Parameterize Linked Services

For my Azure Data Factory solution I wanted to Parameterize properties in my Linked Services. Not all properties are Parameterized by default through the UI. But there's another way to achieve this.

Linked Service

Open your existing Linked Services.

In this situation I want to Parameterize my FTP connection so that I can change the Host name based on a Azure Key Vault Secret.

By default is this not possible through the UI but in the Bottom of your Linked Service there is a Advanced box

If you enable this box you can start building your own connection, but also create your own Parameters for this connection.

How to start:

As a base we will use the default code or our connection

{
    "name": "LS_FTP_SOURCE",
    "properties": {
        "annotations": [
            "stage: none",
            "scenario: demo",
            "environments: dvlm"
        ],
        "type": "FtpServer",
        "typeProperties": {
            "host": "ftp.erwindekreuk.com",
            "port": 21,
            "enableSsl": true,
            "enableServerCertificateValidation": true,
            "authenticationType": "Basic",
            "userName": "ftp_down",
            "password": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "LS_AKV_OXGN",
                    "type": "LinkedServiceReference"
                },
                "secretName": "secretname"
            }
        }
    },
    "type": "Microsoft.DataFactory/factories/linkedservices"
}

Copy this code to your advanced box and enable the option Specify dynamic contents in JSON format/

Now you can start adding new parameters.

If you want to Parameterize your HOST name connection you have to add in the top of the code a new Parameter, under the type of your connection

    "properties": {
        "type": "FtpServer",
        "parameters": {
            "ConnectionKeyvaultSecret": {
                "type": "string"
            }

After you have done this, you need to specify for which properties you want to use this parameter. In my case I want to read the parameter form my Azure Key Vault for my HOST propertie.

The JSON code below will now use above parameter as an input.

        "typeProperties": {
            "host": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "LS_AKV_OXGN",
                    "type": "LinkedServiceReference"
                },
                "secretName": {
                    "value": "@linkedService().ConnectionKeyvaultSecret",
                    "type": "Expression"
                }
            }

Save your connection and you will see that your UI is changed and that you have to define all your setting through the Advanced Editor.

If you test your connection you will now see that you have to fill in a parameter.

And now you can create parameters of every TypeProperties within your connection.

The code below will create Parameters for your Host, Username and Password entries with Azure Key Vault enabled. For the authenticationType you have to choose between Basic and Anonymous. But can also at this to your Azure Key Vault.

{
    "name": "LS_FTP_SOURCE",
    "properties": {
        "type": "FtpServer",
        "parameters": {
            "ConnectionKeyvaultSecret": {
                "type": "string"
            },
            "UsernameKeyvaultSecret": {
                "type": "string"
            },
            "PasswordKeyvaultSecret": {
                "type": "string"
            },
            "authenticationType": {
                "type": "string"
            }
        },
        "annotations": [ ],
        "typeProperties": {
            "host": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "LS_AKV_OXGN",
                    "type": "LinkedServiceReference"
                },
                "secretName": {
                    "value": "@linkedService().ConnectionKeyvaultSecret",
                    "type": "Expression"
                }
            },
            "port": 21,
            "enableSsl": false,
            "enableServerCertificateValidation": false,
            "authenticationType": "@linkedService().authenticationType",
            "userName": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "LS_AKV_OXGN",
                    "type": "LinkedServiceReference"
                },
                "secretName": {
                    "value": "@linkedService().UsernameKeyvaultSecret",
                    "type": "Expression"
                }
            },
            "password": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "LS_AKV_OXGN",
                    "type": "LinkedServiceReference"
                },
                "secretName": {
                    "value": "@linkedService().PasswordKeyvaultSecret",
                    "type": "Expression"
                }
            }
        }
    }
}

Thanks for reading my blog post and have fun with Parameterization of your Linked Services in ADF.

Latest FMD Releases

Feel free to leave a comment

Azure Data Factory and Azure Synapse Analytics Naming Conventions

by Erwin | Jul 5, 2020 | Azure, Azure Data Factory, Azure Synapse Analytics

Naming Conventions

More and more projects are using Azure Data Factory and Azure Synapse Analytics, the more important it is to apply a correct and standard naming convention. When using standard naming conventions you create recognizable results across different projects, but you also create clarity for your colleagues. In addition to that, it is easier to add these projects to other services such as Managed Services, Azure DevOps, etc etc, because standards are used.

To start with these naming conventions, I have made a list of suggestions with most common Linked Services. The list is not exhaustive, but it does provide guidance for new Linked Services.

There are a few standard naming conventions that apply to all elements in Azure Data Factory and in Azure Synapse Analytics.

*Names are case insensitive (not case sensitive). For that reason I’m only using CAPITALS.
*Maximum number of characters in a table name: 260.
All object names must begin with a letter, number or underscore (_).
Following characters are not allowed: “.”, “+”, “?”, “/”, “<”, ”>”,”*”,”%”,”&”,”:”,””

These rules are also defined on the following link

This post has been updated on Feb 2nd, 2023 with the latest connectors.

Azure
	Abbreviation	Linked Service	Dataset
Azure Blob Storage	ABLB_	LS_ABLB_	DS_ABLB_
Azure Cosmos DB SQL API	ACSA_	LS_ACSA_	DS_ACSA_
Azure Cosmos DB MongDB API	ACMA_	LS_ACMA_	DS_ACMA_
Azure Data Explorer	ADEX_	LS_ADEX_	DS_ADEX_
Azure Data Lake Storage Gen1	ADLS_	LS_ADLS_	DS_ADLS_
Azure Data Lake Storage Gen2	ADLS_	LS_ADLS_	DS_ADLS_
Azure Database for MariaDB	AMDB_	LS_AMDB_	DS_AMDB_
Azure Database for MySQL	AMYS_	LS_AMYS_	DS_AMYS_
Azure Database for PostgreSQL	APOS_	LS_APOS_	DS_APOS_
Azure File Storage	AFIL_	LS_AFIL_	DS_AFIL_
Azure Search	ASER_	LS_ASER_	DS_ASER_
Azure SQL Database	ASQL_	LS_ASQL_	DS_ASQL_
Azure SQL Database Managed Instance	ASQM_	LS_ASQM_	DS_ASQM_
Azure Synapse Analytics (formerly Azure SQL DW)	ASDW_	LS_ASDW_	DS_ASDW_
Azure Table Storage	ATBL_	LS_ATBL_	DS_ATBL_
Azure DataBricks	ADBR_	LS_ADBR_	DS_ADBR_
Azure Cognitive Search	ACGS_	LS_ACGS	DS_ACGS_
Azure Synapse Analytics	ASA_	LS_ASA	DS_ASA
Azure Cognitive Service	ACG_	LS_ACG_	N/A

Database
	Abbreviation	Linked Service	Dataset
SQL Server	MSQL_	LS_SQL_	DS_SQL_
Oracle	ORAC_	LS_ORAC_	DS_ORAC_
Oracle Eloqua	ORAE_	LS_ORAE_	DS_ORAE_
Oracle Responsys	ORAR_	LS_ORAR_	DS_ORAR_
Oracle Service Cloud	ORSC_	LS_ORSC_	DS_ORSC_
MySQL	MYSQ_	LS_MYSQ_	DS_MYSQ_
DB2	DB2_	LS_DB2_	DS_DB2_
Teradata	TDAT_	LS_TDAT_	DS_TDAT_
PostgreSQL	POST_	LS_POST_	DS_POST_
Sybase	SYBA_	LS_SYBA_	DS_SYBA_
Cassandra	CASS_	LS_CASS_	DS_CASS_
MongoDB	MONG_	LS_MONG_	DS_MONG_
Amazon Redshift	ARED_	LS_ARED_	DS_ARED_
SAP Business Warehouse	SAPW_	LS_SAPW_	DS_SAPW_
SAP Cloud for Customer (C4C)	SAPC_	LS_SAPC_	DS_SAPC_
SAP Table	SAPT_	LS_SAPT	DS_SAPT_
SAP HANA	HANA_	LS_HANA_	DS_HANA_
Drill	DRILL_	LS_DRILL_	DS_DRILL_
Google BigQuery	GBQ_	LS_GBQ_	DS_GBQ_
Greenplum	GRPL_	LS_GRPL_	DS_GRPL_
HBase	HBAS_	LS_HBAS_	DS_HBAS_
Hive	HIVE_	LS_HIVE_	DS_HIVE_
Apache Impala	IMPA_	LS_IMPA_	DS_IMPA_
Informix	INMI_	LS_INMI_	DS_INMI_
MariaDB	MDB_	LS_MDB_	DS_MDB_
Microsoft Access	MACS_	LS_MACS_	DS_MACS_
Netezza	NETZ_	LS_NETZ_	DS_NETZ_
Phoenix	PHNX_	LS_PHNX_	DS_PHNX_
Presto (Preview)	PRST_	LS_PRST_	DS_PRST_
Spark	SPRK_	LS_SPRK_	DS_SPRK_
Vertica	VERT_	LS_VERT_	DS_VERT_
Snowflake	SNWF_	LS_SNWF_	DS_SNWF_
MongoDB Atlas	MONG_ATLAS_	LS_MONG_ATLAS_	DS_MONG_ATLAS_
Amazon RDS for Oracle	RDSORAC_	LS_RDSORAC_	DS_RDSORAC_
Amazon RDS for SQL Server	RDSSQL_	LS_RDSSQL_	DS_RDSSQL_

Files
	Abbreviation	Linked Service	Dataset
File System	FILE_	LS_FILE_	DS_FILE_
HDFS	HDFS_	LS_HDFS_	DS_HDFS_
Amazon S3	AMS3_	LS_AMS3_	DS_AMS3_
FTP	FTP_	LS_FTP_	DS_FTP_
SFTP	SFTP_	LS_SFTP_	DS_SFTP_
Google Cloud Storage	GCS_	LS_GCS_	DS_GCS_
Oracle Cloud Storage	OCS_	LS_OCS_	DS_OCS_
Amazon S3 Compatible Storage	SMS3C_	LS_SMS3C_	DS_SMS3C_

Generic
	Abbreviation	Linked Service	Dataset
Generic ODBC	ODBC_	LS_ODBC_	DS_ODBC_
Generic OData	ODAT_	LS_ODAT_	DS_ODAT_
Generic REST	REST_	LS_REST_	DS_REST_
Generic HTTP	HTTP_	LS_HTTP_	DS_HTTP_

Services and Apps
	Abbreviation	Linked Service	Dataset
Salesforce	SAFC_	LS_SAFC_	DS_SAFC_
Salesforce Service Cloud	SAFCSC_	LS_SAFCSC_	DS_SAFCSC_
Salesforce Marketing Cloud	SAFOMC_	LS_SAFOMC_	DS_SAFOMC_
GitHub	GITH_	LS_GITH_	DS_GITH_
Jira	JIRA_	LS_JIRA_	DS_JIRA_
Web Table (table from HTML)	WEBT_	LS_WEBT_	DS_WEBT_
Amazon Marketplace Web Service	AMSMWS_	LS_AMSMWS_	DS_AMSMWS_
Xero	XERO_	LS_XERO_	DS_XERO_
SharePoint Online List	SHAREPOINT_	LS_SHAREPOINT_	DS_SHAREPOINT_
ServiceNow	SERVICENOW_	LS_SERVICENOW_	DS_SERVICENOW_
Dynamics (Microsoft Dataverse)	DATAVERSE_	LS_DATAVERSE_	DS_DATAVERSE__
Dynamics 365	D365_	LS_D365_	DS_D365_
Dynamics AX	DAX_	LS_DAX_	DS_DAX_
Dynamics CRM	DCRM_	LS_DCRM_	DS_DCRM_
Microsoft 365	M365_	LS_M365_	Ds_M365__
SAP Cloud for Customer (C4C)	SAPC4C_	LS_SAPC4C_	DS_LS_SAPC4C_
SAP ECC	SAPE_	LS_SAPE_	DS_SAPE_

If your connector is not described(mostly connectors which are in Preview), please let me know. For more details for all the different connectors, check the connector overview

Pipeline

Even for Pipeline you can define naming conventions. I think the most important thing is that you always start your pipeline with PL_ followed by a Logic Name for you. You can for example use:

TRANS: Pipeline with transformations

SSIS: Pipeline with SSIS Packages

DATA: Pipeline with DataMovements

COPY: Pipeline with Copy Activities

Divers

NB: Notebook

DF: Mapping Dataflows

SQL: SQL Scripts

KQL: KQL Scripts

JOB: Spark job definition

Once again these naming conventions are just suggestions. The most important thing is that you start using naming conventions and that you use the folder structure within the Pipelines (categories). Like the picture below as an example.

If you have suggestions just let me know by leaving a comment below.

Exploring Azure Synapse Analytics Studio

by Erwin | Jun 16, 2020 | Azure Synapse Analytics

Azure Synapse Workspace Settings

In my previous article, I walked you through “how to create your Azure Synapse Analytics Workspace”. It’s now time to explore the brand new Synapse Studio.

Most configuration and settings can be done through the Synapse Studio. In your Workspace you need to set the SQL Active Directory Admin, like you have to do for a Logical Server.

SQL Active Directory Admin

Firewall

Change the IP-address to you own IP-address or to one of your employer if you work from the office. Make sure that you enable the option “Allow Azure service and Resources to access this Workspace” is enabled. Every trusted Azure Service or Resource can connect to this Workspace. Not all Public Preview Azure Services or Resources are Trusted yet.

Private Endpoint Connections

Define your Private Endpoint Connections to your Services to use a Private IP-address form your Virtual Network. More details on how to setup a Private Endpoint Connection can be found here.

Launch Azure Synapse Analytics Studio

After you have opened the portal, the following screen will appear. Personally I am very charmed of this brand new Portal, you now get 1 place where you can access all your data. But also an integration with your Power BI reporting. But more about that later. Let’s walk through each tab below.

Data

The data Tab is Divided into 2 different parts Linked and Workspace

Linked

All Dataset likes you’re used to create in Azure Data Factory are stored here.

Your can now directly browse files within your Azure Data Lake Storage.

But you can also connect to External Data.

Workspace

Here you will find all your databases which you have created with Sparks, SQL on Demand or SQL Pools. How to create these database I will explain later in another article.

Develop

The Develop tab, is the location, where your SQL Scripts, Notebooks, Dataflows, Spark Job Definitions and Power BI Reports are stored. In a later stage you can commit your work to Azure Dev Ops or GitHub.

Orchestrate

What do we see here , nothing more then you were used to see in Azure Data Factory except the addition of the Synapse Activities.

Monitor

In the monitor tab you will find similar things to ADF except for the SQL requests and the Apache Spark Applications

Orchestration:	Pipeline runs	Overview of all Executed Pipelines
	Trigger Runs	Overview of all Executed Trigger Runs
	Integration runtimes	Overview of all created Integration Runtimes
Activities:	Apache Spark applications	Monitor your Apache Spark Executions
	SQL requests	Monitor your SQL on Demand or SQL Pool queries

SQL request

All running(currently) SQl request for your SQL on Demand and your SQL Pools

Apache Spark Applications

All Apache Spark request.

A detailed explanation of the Apache Spark Application monitor can be found here.

Management HUB

The Synapse Analytics Management HUB offers the following options:

Analytic Pools:	SQL pools	Here you can manage(Scale up or Down) your previously created SQL pools or create new ones.
	Apache Spark pools	Create multiple instances of Spark pools depending on the workload requirements. Once you have your Instance created you also change Auto Scaling, Node Size and Number of Nodes from here.
External Connections:	Linked Services	Create and manage connections to different services, same as in Azure Data Factory
Orchestration:	Triggers	Create and manage Triggers for your Pipelines, same as in Azure Data Factory
	Integration runtimes	Create and manage your different types of Integration Runtime: Azure: execute workloads between Azure services or Azure Data Factory Mapping Data Flows. Self-Hosted: execute workloads between on-premises environments and Azure. Azure-SSIS: execute SQL Server Integration Services packages in Azure Data Factory.
Security:	Access control	Azure Synapse Analytics comes with role-based access control. The available roles: Workspace admin Apache Spark admin SQL admin
	Managed Private Endpoint	Private link enables you to access Azure services from your Azure VNet securely. More details on Managed Private Endpoints can be found here

At the beginning of this article, I indicated that I was very charmed of this new Portal. Microsoft has ensured that we can now approach almost all our services seamlessly from 1 Portal and there will be much more to come in the future. And as you can see you don’t have to use the SQL Pools which is a reasonably expensive solution for customers who use a lot of data. But you can use Azure Synapse Analytics for almost every customer. I would say if you haven’t started trying Azure Synapse Analytics yet, start today and see how you can help your customers with this. If there are any questions I would like to hear them.

Thank you for reading.

Azure Synapse Analytics

by Erwin | Jun 15, 2020 | Azure, Azure Synapse Analytics

Azure

by Erwin | Jun 15, 2020

Azure Synapse Analytics

Insights for all

Azure Synapse provides a breathtaking view of your data across data warehouses and big data analytics systems. Bringing these two worlds together into a single service is challenging as it requires unifying similar concepts that operate differently in each world such as security, privacy, and performance. With Azure Synapse, this seamless unification of data warehousing and big data not only simplifies a business’s analytics platform, but also breaks down silos that exist today because of teams, data, and skills. (source Azure blog)

Azure Synapse Analytics Workspace

During Ignite 2019 we already saw the first announcement about Azure Synapse Analytics. The first Public Preview was announced during Build 2020.

Immediately after Build 2020, I started playing and exploring with Azure Synapse Analytics Workspace.
Fortunately, I was off for a few days and was able to use this free time to dive a little bit into Azure Synapse.

A few days later during the Analytics in a Day workshops that I gave for my employer InSpark in collaboration with Microsoft, I immediately took the time to give a Live demo. I found the inspiration for this Live demo during a YouTube session presented by Simon Whiteley.

For many participants it is more imaginative, to walk through the product Live than to tell a story via PowerPoint Slides.

Upcoming Articles

In the coming days I will try to write a number of articles so that you become more familiar with the various possibilities of Azure Synapse Analytics.

I have the following articles in mind:

✅ Creating your Azure Synapse Analytics Workspace

✅ Exploring the new Azure Synapse Analytics Studio

✅ Creating an Apache Spark Pool

✅ Creating a SQL Pool

✅ Integration with Power BI

And if you have more subject which needs to be explained feel free to leave them in the comments.

Happy reading!

Latest FMD Releases

Feel free to leave a comment

« Older Entries

Next Entries »

SQL Saturday 963 Denmark

PASS SQLSaturday is a free training event for professionals who use the Microsoft data platform. These community events offer content across data management, cloud and hybrid architecture, analytics, business intelligence, AI, and more.

My first virtual event

My Session

Azure

Parameterize Linked Services

For my Azure Data Factory solution I wanted to Parameterize properties in my Linked Services. Not all properties are Parameterized by default through the UI. But there's another way to achieve this.

Linked Service

Latest FMD Releases

Feel free to leave a comment

Azure

Database

Files

Generic

Services and Apps

Pipeline

Divers

Azure Synapse Workspace Settings

SQL Active Directory Admin

Firewall

Private Endpoint Connections

Launch Azure Synapse Analytics Studio

Data

Linked

Workspace

Develop

Orchestrate

Monitor

SQL request

Apache Spark Applications

Management HUB

Azure

Azure Synapse Analytics

Insights for all

Azure Synapse Analytics Workspace

Upcoming Articles

Latest FMD Releases

Feel free to leave a comment

Categories