Parameterize Linked Services in ADF

by Erwin | Jul 9, 2020 | Azure, Azure Data Factory, Azure Synapse Analytics

Azure Synapse

by Erwin | Jul 9, 2020

Parameterize Linked Services

For my Azure Data Factory solution I wanted to Parameterize properties in my Linked Services. Not all properties are Parameterized by default through the UI. But there's another way to achieve this.

Linked Service

Open your existing Linked Services.

In this situation I want to Parameterize my FTP connection so that I can change the Host name based on a Azure Key Vault Secret.

By default is this not possible through the UI but in the Bottom of your Linked Service there is a Advanced box

If you enable this box you can start building your own connection, but also create your own Parameters for this connection.

How to start:

As a base we will use the default code or our connection

{
    "name": "LS_FTP_SOURCE",
    "properties": {
        "annotations": [
            "stage: none",
            "scenario: demo",
            "environments: dvlm"
        ],
        "type": "FtpServer",
        "typeProperties": {
            "host": "ftp.erwindekreuk.com",
            "port": 21,
            "enableSsl": true,
            "enableServerCertificateValidation": true,
            "authenticationType": "Basic",
            "userName": "ftp_down",
            "password": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "LS_AKV_OXGN",
                    "type": "LinkedServiceReference"
                },
                "secretName": "secretname"
            }
        }
    },
    "type": "Microsoft.DataFactory/factories/linkedservices"
}

Copy this code to your advanced box and enable the option Specify dynamic contents in JSON format/

Now you can start adding new parameters.

If you want to Parameterize your HOST name connection you have to add in the top of the code a new Parameter, under the type of your connection

    "properties": {
        "type": "FtpServer",
        "parameters": {
            "ConnectionKeyvaultSecret": {
                "type": "string"
            }

After you have done this, you need to specify for which properties you want to use this parameter. In my case I want to read the parameter form my Azure Key Vault for my HOST propertie.

The JSON code below will now use above parameter as an input.

        "typeProperties": {
            "host": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "LS_AKV_OXGN",
                    "type": "LinkedServiceReference"
                },
                "secretName": {
                    "value": "@linkedService().ConnectionKeyvaultSecret",
                    "type": "Expression"
                }
            }

Save your connection and you will see that your UI is changed and that you have to define all your setting through the Advanced Editor.

If you test your connection you will now see that you have to fill in a parameter.

And now you can create parameters of every TypeProperties within your connection.

The code below will create Parameters for your Host, Username and Password entries with Azure Key Vault enabled. For the authenticationType you have to choose between Basic and Anonymous. But can also at this to your Azure Key Vault.

{
    "name": "LS_FTP_SOURCE",
    "properties": {
        "type": "FtpServer",
        "parameters": {
            "ConnectionKeyvaultSecret": {
                "type": "string"
            },
            "UsernameKeyvaultSecret": {
                "type": "string"
            },
            "PasswordKeyvaultSecret": {
                "type": "string"
            },
            "authenticationType": {
                "type": "string"
            }
        },
        "annotations": [ ],
        "typeProperties": {
            "host": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "LS_AKV_OXGN",
                    "type": "LinkedServiceReference"
                },
                "secretName": {
                    "value": "@linkedService().ConnectionKeyvaultSecret",
                    "type": "Expression"
                }
            },
            "port": 21,
            "enableSsl": false,
            "enableServerCertificateValidation": false,
            "authenticationType": "@linkedService().authenticationType",
            "userName": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "LS_AKV_OXGN",
                    "type": "LinkedServiceReference"
                },
                "secretName": {
                    "value": "@linkedService().UsernameKeyvaultSecret",
                    "type": "Expression"
                }
            },
            "password": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "LS_AKV_OXGN",
                    "type": "LinkedServiceReference"
                },
                "secretName": {
                    "value": "@linkedService().PasswordKeyvaultSecret",
                    "type": "Expression"
                }
            }
        }
    }
}

Thanks for reading my blog post and have fun with Parameterization of your Linked Services in ADF.

Latest FMD Releases

Feel free to leave a comment

Azure Data Factory and Azure Synapse Analytics Naming Conventions

by Erwin | Jul 5, 2020 | Azure, Azure Data Factory, Azure Synapse Analytics

Naming Conventions

More and more projects are using Azure Data Factory and Azure Synapse Analytics, the more important it is to apply a correct and standard naming convention. When using standard naming conventions you create recognizable results across different projects, but you also create clarity for your colleagues. In addition to that, it is easier to add these projects to other services such as Managed Services, Azure DevOps, etc etc, because standards are used.

To start with these naming conventions, I have made a list of suggestions with most common Linked Services. The list is not exhaustive, but it does provide guidance for new Linked Services.

There are a few standard naming conventions that apply to all elements in Azure Data Factory and in Azure Synapse Analytics.

*Names are case insensitive (not case sensitive). For that reason I’m only using CAPITALS.
*Maximum number of characters in a table name: 260.
All object names must begin with a letter, number or underscore (_).
Following characters are not allowed: “.”, “+”, “?”, “/”, “<”, ”>”,”*”,”%”,”&”,”:”,””

These rules are also defined on the following link

This post has been updated on Feb 2nd, 2023 with the latest connectors.

Azure
	Abbreviation	Linked Service	Dataset
Azure Blob Storage	ABLB_	LS_ABLB_	DS_ABLB_
Azure Cosmos DB SQL API	ACSA_	LS_ACSA_	DS_ACSA_
Azure Cosmos DB MongDB API	ACMA_	LS_ACMA_	DS_ACMA_
Azure Data Explorer	ADEX_	LS_ADEX_	DS_ADEX_
Azure Data Lake Storage Gen1	ADLS_	LS_ADLS_	DS_ADLS_
Azure Data Lake Storage Gen2	ADLS_	LS_ADLS_	DS_ADLS_
Azure Database for MariaDB	AMDB_	LS_AMDB_	DS_AMDB_
Azure Database for MySQL	AMYS_	LS_AMYS_	DS_AMYS_
Azure Database for PostgreSQL	APOS_	LS_APOS_	DS_APOS_
Azure File Storage	AFIL_	LS_AFIL_	DS_AFIL_
Azure Search	ASER_	LS_ASER_	DS_ASER_
Azure SQL Database	ASQL_	LS_ASQL_	DS_ASQL_
Azure SQL Database Managed Instance	ASQM_	LS_ASQM_	DS_ASQM_
Azure Synapse Analytics (formerly Azure SQL DW)	ASDW_	LS_ASDW_	DS_ASDW_
Azure Table Storage	ATBL_	LS_ATBL_	DS_ATBL_
Azure DataBricks	ADBR_	LS_ADBR_	DS_ADBR_
Azure Cognitive Search	ACGS_	LS_ACGS	DS_ACGS_
Azure Synapse Analytics	ASA_	LS_ASA	DS_ASA
Azure Cognitive Service	ACG_	LS_ACG_	N/A

Database
	Abbreviation	Linked Service	Dataset
SQL Server	MSQL_	LS_SQL_	DS_SQL_
Oracle	ORAC_	LS_ORAC_	DS_ORAC_
Oracle Eloqua	ORAE_	LS_ORAE_	DS_ORAE_
Oracle Responsys	ORAR_	LS_ORAR_	DS_ORAR_
Oracle Service Cloud	ORSC_	LS_ORSC_	DS_ORSC_
MySQL	MYSQ_	LS_MYSQ_	DS_MYSQ_
DB2	DB2_	LS_DB2_	DS_DB2_
Teradata	TDAT_	LS_TDAT_	DS_TDAT_
PostgreSQL	POST_	LS_POST_	DS_POST_
Sybase	SYBA_	LS_SYBA_	DS_SYBA_
Cassandra	CASS_	LS_CASS_	DS_CASS_
MongoDB	MONG_	LS_MONG_	DS_MONG_
Amazon Redshift	ARED_	LS_ARED_	DS_ARED_
SAP Business Warehouse	SAPW_	LS_SAPW_	DS_SAPW_
SAP Cloud for Customer (C4C)	SAPC_	LS_SAPC_	DS_SAPC_
SAP Table	SAPT_	LS_SAPT	DS_SAPT_
SAP HANA	HANA_	LS_HANA_	DS_HANA_
Drill	DRILL_	LS_DRILL_	DS_DRILL_
Google BigQuery	GBQ_	LS_GBQ_	DS_GBQ_
Greenplum	GRPL_	LS_GRPL_	DS_GRPL_
HBase	HBAS_	LS_HBAS_	DS_HBAS_
Hive	HIVE_	LS_HIVE_	DS_HIVE_
Apache Impala	IMPA_	LS_IMPA_	DS_IMPA_
Informix	INMI_	LS_INMI_	DS_INMI_
MariaDB	MDB_	LS_MDB_	DS_MDB_
Microsoft Access	MACS_	LS_MACS_	DS_MACS_
Netezza	NETZ_	LS_NETZ_	DS_NETZ_
Phoenix	PHNX_	LS_PHNX_	DS_PHNX_
Presto (Preview)	PRST_	LS_PRST_	DS_PRST_
Spark	SPRK_	LS_SPRK_	DS_SPRK_
Vertica	VERT_	LS_VERT_	DS_VERT_
Snowflake	SNWF_	LS_SNWF_	DS_SNWF_
MongoDB Atlas	MONG_ATLAS_	LS_MONG_ATLAS_	DS_MONG_ATLAS_
Amazon RDS for Oracle	RDSORAC_	LS_RDSORAC_	DS_RDSORAC_
Amazon RDS for SQL Server	RDSSQL_	LS_RDSSQL_	DS_RDSSQL_

Files
	Abbreviation	Linked Service	Dataset
File System	FILE_	LS_FILE_	DS_FILE_
HDFS	HDFS_	LS_HDFS_	DS_HDFS_
Amazon S3	AMS3_	LS_AMS3_	DS_AMS3_
FTP	FTP_	LS_FTP_	DS_FTP_
SFTP	SFTP_	LS_SFTP_	DS_SFTP_
Google Cloud Storage	GCS_	LS_GCS_	DS_GCS_
Oracle Cloud Storage	OCS_	LS_OCS_	DS_OCS_
Amazon S3 Compatible Storage	SMS3C_	LS_SMS3C_	DS_SMS3C_

Generic
	Abbreviation	Linked Service	Dataset
Generic ODBC	ODBC_	LS_ODBC_	DS_ODBC_
Generic OData	ODAT_	LS_ODAT_	DS_ODAT_
Generic REST	REST_	LS_REST_	DS_REST_
Generic HTTP	HTTP_	LS_HTTP_	DS_HTTP_

Services and Apps
	Abbreviation	Linked Service	Dataset
Salesforce	SAFC_	LS_SAFC_	DS_SAFC_
Salesforce Service Cloud	SAFCSC_	LS_SAFCSC_	DS_SAFCSC_
Salesforce Marketing Cloud	SAFOMC_	LS_SAFOMC_	DS_SAFOMC_
GitHub	GITH_	LS_GITH_	DS_GITH_
Jira	JIRA_	LS_JIRA_	DS_JIRA_
Web Table (table from HTML)	WEBT_	LS_WEBT_	DS_WEBT_
Amazon Marketplace Web Service	AMSMWS_	LS_AMSMWS_	DS_AMSMWS_
Xero	XERO_	LS_XERO_	DS_XERO_
SharePoint Online List	SHAREPOINT_	LS_SHAREPOINT_	DS_SHAREPOINT_
ServiceNow	SERVICENOW_	LS_SERVICENOW_	DS_SERVICENOW_
Dynamics (Microsoft Dataverse)	DATAVERSE_	LS_DATAVERSE_	DS_DATAVERSE__
Dynamics 365	D365_	LS_D365_	DS_D365_
Dynamics AX	DAX_	LS_DAX_	DS_DAX_
Dynamics CRM	DCRM_	LS_DCRM_	DS_DCRM_
Microsoft 365	M365_	LS_M365_	Ds_M365__
SAP Cloud for Customer (C4C)	SAPC4C_	LS_SAPC4C_	DS_LS_SAPC4C_
SAP ECC	SAPE_	LS_SAPE_	DS_SAPE_

If your connector is not described(mostly connectors which are in Preview), please let me know. For more details for all the different connectors, check the connector overview

Pipeline

Even for Pipeline you can define naming conventions. I think the most important thing is that you always start your pipeline with PL_ followed by a Logic Name for you. You can for example use:

TRANS: Pipeline with transformations

SSIS: Pipeline with SSIS Packages

DATA: Pipeline with DataMovements

COPY: Pipeline with Copy Activities

Divers

NB: Notebook

DF: Mapping Dataflows

SQL: SQL Scripts

KQL: KQL Scripts

JOB: Spark job definition

Once again these naming conventions are just suggestions. The most important thing is that you start using naming conventions and that you use the folder structure within the Pipelines (categories). Like the picture below as an example.

If you have suggestions just let me know by leaving a comment below.

Azure Data Factory: New functionalities and features

by Erwin | May 22, 2020 | Azure Data Factory

Azure Synapse

by Erwin | May 22, 2020

New functionalities and features

Last week, a number of great new functionalities and features were added within Azure Data Factory. I would like to take you in some details in the blog below:

Customer key

With this new functionality you can add extra security to your Azure Data Factory environment. Where the data was first encrypted with a randomly generated key from Microsoft, you can now use the customer-managed key feature. With this Bring Your Own Key (BYOK) you can add extra security to your Azure Data Factory environment. If you use the customer-managed key functionality, the data will be encrypted in combination with the ADF system key. You can create your own key or have it generated by the Azure Key Vault API.

You can read more in this Article which I wrote.

Pipeline Consumption Report

Last week the Azure Data Factory added the Pipeline Consumption Report.

The report can be used for your Triggered runs, just go to your Triggered runs and click on the new Icon.

The consumption of the selected Pipeline will be displayed. The data shown is only from this Pipeline and not from other Pipelines fired by this Pipeline. Would be a nice addition if the report shows the aggregation of the complete Triggered Run.

For your debug run, click on right site of your Output pane:

The ADF consumption report is only surfacing Azure Data Factory related units. There may be additional units billed from other services that you are using and accessing which are not accounted for here including Azure SQL Database, Synapse Analytics, CosmosDB, ADLS, etc. More detailed can be found here.

Parameters from Execute Pipeline Activity

When calling a Pipeline you first had to add the parameters yourself, now they are automatically taken over from the Pipeline you select. Very handy and saves time again if you use a lot of parameters.

Define a Parameter in one of your Pipelines:

Create another Pipeline and add the Execute Pipeline activity. On the settings tab where you have to select the Pipeline you want to execute, you will discover that the option to add Manually the parameters is not there anymore. But, all the Parameters you had defined in your Pipeline are directly shown. Very handy and it reduces errors.

Old Situation:

New Situation:

General Tab moved to new Properties Pane

Your General tab is now moved to the right site of the Canvas.

To edit it your properties, click on the pane icon located in the top-right corner of the canvas.

So these were some nice and useful addition to Azure Data Factory. I am very happy with it and what do you think?

Latest Posts

Feel free to leave a comment

Azure Data Factory: How to assign a Customer Managed Key

by Erwin | May 21, 2020 | Azure Data Factory, Azure KeyVault

Customer key

With this new functionality you can add extra security to your Azure Data Factory environment. Where the data was first encrypted with a randomly generated key from Microsoft, you can now use the customer-managed key feature. With this Bring Your Own Key (BYOK) you can add extra security to your Azure Data Factory environment. If you use the customer-managed key functionality, the data will be encrypted in combination with the ADF system key. You can create your own key or have it generated by the Azure Key Vault API

Be careful, this new feature can only be enabled on an empty Azure Data Factory environment. Make sure your Azure Active Directory, Azure Data Factory and Azure KeyVault are all in the same region. If you use an Azure Landing Zone consisting of different subscriptions, this is also possible, as long as the services exist in the same region.

Please follow the steps below how to enable this new feature:

I assume that you already have an existing Azure KeyVault. If not, you will have to create one first. You can read how to do that here.
With an existing Azure KeyVault, it is important that you enable the options Soft Deletes and Purge protection.

Enable Soft Deletes and Purge protection

If you want to enable this via Powershell use the following command:

[code lang="ps"] ($resource = Get-AzResource -ResourceId (Get-AzKeyVault -VaultName 'YOURKEYVAULTNAME').ResourceId).Properties | Add-Member -MemberType 'NoteProperty' -Name 'enableSoftDelete' -Value 'true'

Set-AzResource -resourceid $resource.ResourceId -Properties $resource.Properties

($resource = Get-AzResource -ResourceId (Get-AzKeyVault -VaultName 'YOURKEYVAULTNAME').ResourceId).Properties | Add-Member -MemberType 'NoteProperty' -Name 'enablePurgeProtection' -Value 'true'

Set-AzResource -resourceid $resource.ResourceId -Properties $resource.Properties

Define Access policy

The next step is to enable your Grant Data Factory access to Azure Key Vault, you have to enable the following permissions: Get, Unwrap Key, and Wrap Key

Search for Data Factory Instance and Select the correct one:

Create KEY

Once you have done that it’s time to create your Keys. Keep in mind that only RSA 2048-bit keys are supported by Azure Data Factory encryption.

Very important step your key name must be in only letters. KEYADFNAMECUSTOMER will work, but KEY-ADFNAME-CUSTOMER isn’t and you will get an error in your Azure Data Factory Instance. It took me a while to figure this out. So it can saves you a lot of time.

After your KEY is created, copy the Key Identifier.

Assign Customer Key

The last step in this article is to assign the key to your Azure Data Factory Instance.

CMK

Paste the selected key in your Azure Data Factory Instance and save.

Errors

If your get an error “Invalid key Vault URL”

-Check if the Soft Deletes and Purge protection on your Key Vault is set.

-Check if your Key consists only of letters.

-Check if you enabled your Grant Data Factory access to Azure Key Vault.

-Check if Azure DataFactory, Azure KeyVault and your Azure Active Directory are in the same region.

If you still have errors, please send me a message and I will try to help you out.

Hopefully, this article has helped you to secure your environment.

ADF: Get Metadata Activity stopped working

by Erwin | Jul 1, 2019 | Azure, Azure Data Factory

Azure Synapse

by Erwin | Jul 1, 2019

Meta Data Activity

Today my pipelines in Azure Data Factory (ADF) suddenly stopped working. The output structure was not found. Quit strange while these pipelines have been running for weeks.

Invalid Template

After debugging my Pipeline I found out the output.structure couldn't be evaluated.

Field List

I don't know when this option was added, but if you open the Get Metadata Activity you will find under Dataset the option Field List

The only thing I needed to do get my previous output back was adding structure as an argument. The structure will take care of the Metadata of my table and will be used in the Copy Data Activity.

After rerunning my Pipeline the error was gone.

I changed all my pipelines and the processes where running fine again. I took me bit to find out what the exact error was.

Hopefully it will save you some time after reading this blog.

More about the Get Metadata Activity can be found here .

Feel free to leave a comment

Parameterize Linked Services in ADF

Azure Synapse

Parameterize Linked Services

For my Azure Data Factory solution I wanted to Parameterize properties in my Linked Services. Not all properties are Parameterized by default through the UI. But there's another way to achieve this.

Linked Service

Latest FMD Releases

Feel free to leave a comment

Azure Data Factory and Azure Synapse Analytics Naming Conventions

Azure

Database

Files

Generic

Services and Apps

Pipeline

Divers

Azure Data Factory: New functionalities and features

Azure Synapse

New functionalities and features

Last week, a number of great new functionalities and features were added within Azure Data Factory. I would like to take you in some details in the blog below:

Customer key

Pipeline Consumption Report

Parameters from Execute Pipeline Activity

General Tab moved to new Properties Pane

Latest Posts

Categories

Feel free to leave a comment

Azure Data Factory: How to assign a Customer Managed Key

Customer key

Enable Soft Deletes and Purge protection

Define Access policy

Create KEY

Assign Customer Key

ADF: Get Metadata Activity stopped working

Azure Synapse

Meta Data Activity

Today my pipelines in Azure Data Factory (ADF) suddenly stopped working. The output structure was not found. Quit strange while these pipelines have been running for weeks.

Invalid Template

Field List

Feel free to leave a comment

Categories