ERWIN & BUSINESS ANALYTICS

Azure Data Factory Naming Conventions

by Erwin | Apr 4, 2019


As more and more projects use Azure Data Factory and Azure Synapse Analytics, it becomes increasingly important to apply a correct, standard naming convention. Standard naming conventions create recognizable results across different projects and bring clarity for your colleagues. They also make it easier to hand these projects over to other services such as Managed Services, Azure DevOps, etc., because standards are used.

To get started with these naming conventions, I have made a list of suggestions for the most common Linked Services. The list is not exhaustive, but it does provide guidance for new Linked Services.

There are a few standard naming conventions that apply to all elements in Azure Data Factory and in Azure Synapse Analytics.

* Names are case-insensitive (not case-sensitive). For that reason I’m only using CAPITALS.

* Maximum number of characters in a table name: 260.

* All object names must begin with a letter, number or underscore (_).

* The following characters are not allowed: ".", "+", "?", "/", "<", ">", "*", "%", "&", ":", "\"
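The rules above can be sketched as a small validator. This is a hypothetical helper for illustration, not part of any official SDK:

```python
import re

# Limits and forbidden characters taken from the rules listed above.
MAX_LENGTH = 260
FORBIDDEN = set('.+?/<>*%&:\\')

def is_valid_adf_name(name: str) -> bool:
    """Check a proposed object name against the naming rules above."""
    if not name or len(name) > MAX_LENGTH:
        return False
    # Must begin with a letter, number or underscore
    if not re.match(r'^[A-Za-z0-9_]', name):
        return False
    # None of the forbidden characters may appear anywhere in the name
    return not any(ch in FORBIDDEN for ch in name)

print(is_valid_adf_name('LS_ABLB_Sales'))   # True
print(is_valid_adf_name('DS.ABLB.Sales'))   # False: contains '.'
```

A check like this can run in a CI pipeline before deployment, so a badly named object never reaches the factory.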

These rules are also defined in the official Azure documentation.

Azure

     
| Service | Abbreviation | Linked Service | Dataset |
|---|---|---|---|
| Azure Blob Storage | ABLB_ | LS_ABLB_ | DS_ABLB_ |
| Azure Cosmos DB SQL API | ACSA_ | LS_ACSA_ | DS_ACSA_ |
| Azure Cosmos DB MongoDB API | ACMA_ | LS_ACMA_ | DS_ACMA_ |
| Azure Data Explorer | ADEX_ | LS_ADEX_ | DS_ADEX_ |
| Azure Data Lake Storage Gen1 | ADLS_ | LS_ADLS_ | DS_ADLS_ |
| Azure Data Lake Storage Gen2 | ADLS_ | LS_ADLS_ | DS_ADLS_ |
| Azure Database for MariaDB | AMDB_ | LS_AMDB_ | DS_AMDB_ |
| Azure Database for MySQL | AMYS_ | LS_AMYS_ | DS_AMYS_ |
| Azure Database for PostgreSQL | APOS_ | LS_APOS_ | DS_APOS_ |
| Azure File Storage | AFIL_ | LS_AFIL_ | DS_AFIL_ |
| Azure Search | ASER_ | LS_ASER_ | DS_ASER_ |
| Azure SQL Database | ASQL_ | LS_ASQL_ | DS_ASQL_ |
| Azure SQL Database Managed Instance | ASQM_ | LS_ASQM_ | DS_ASQM_ |
| Azure Synapse Analytics (formerly Azure SQL DW) | ASDW_ | LS_ASDW_ | DS_ASDW_ |
| Azure Table Storage | ATBL_ | LS_ATBL_ | DS_ATBL_ |
| Azure Databricks | ADBR_ | LS_ADBR_ | DS_ADBR_ |
| Azure Cognitive Search | ACGS_ | LS_ACGS_ | DS_ACGS_ |

Database

     
| Service | Abbreviation | Linked Service | Dataset |
|---|---|---|---|
| SQL Server | MSQL_ | LS_SQL_ | DS_SQL_ |
| Oracle | ORAC_ | LS_ORAC_ | DS_ORAC_ |
| Oracle Eloqua | ORAE_ | LS_ORAE_ | DS_ORAE_ |
| Oracle Responsys | ORAR_ | LS_ORAR_ | DS_ORAR_ |
| Oracle Service Cloud | ORSC_ | LS_ORSC_ | DS_ORSC_ |
| MySQL | MYSQ_ | LS_MYSQ_ | DS_MYSQ_ |
| DB2 | DB2_ | LS_DB2_ | DS_DB2_ |
| Teradata | TDAT_ | LS_TDAT_ | DS_TDAT_ |
| PostgreSQL | POST_ | LS_POST_ | DS_POST_ |
| Sybase | SYBA_ | LS_SYBA_ | DS_SYBA_ |
| Cassandra | CASS_ | LS_CASS_ | DS_CASS_ |
| MongoDB | MONG_ | LS_MONG_ | DS_MONG_ |
| Amazon Redshift | ARED_ | LS_ARED_ | DS_ARED_ |
| SAP Business Warehouse | SAPW_ | LS_SAPW_ | DS_SAPW_ |
| SAP ECC | SAPE_ | LS_SAPE_ | DS_SAPE_ |
| SAP Cloud for Customer (C4C) | SAPC_ | LS_SAPC_ | DS_SAPC_ |
| SAP Table | SAPT_ | LS_SAPT_ | DS_SAPT_ |
| SAP HANA | HANA_ | LS_HANA_ | DS_HANA_ |
| Drill | DRILL_ | LS_DRILL_ | DS_DRILL_ |
| Google BigQuery | GBQ_ | LS_GBQ_ | DS_GBQ_ |
| Greenplum | GRPL_ | LS_GRPL_ | DS_GRPL_ |
| HBase | HBAS_ | LS_HBAS_ | DS_HBAS_ |
| Hive | HIVE_ | LS_HIVE_ | DS_HIVE_ |
| Apache Impala | IMPA_ | LS_IMPA_ | DS_IMPA_ |
| Informix | INMI_ | LS_INMI_ | DS_INMI_ |
| MariaDB | MDB_ | LS_MDB_ | DS_MDB_ |
| Microsoft Access | MACS_ | LS_MACS_ | DS_MACS_ |
| Netezza | NETZ_ | LS_NETZ_ | DS_NETZ_ |
| Phoenix | PHNX_ | LS_PHNX_ | DS_PHNX_ |
| Presto (Preview) | PRST_ | LS_PRST_ | DS_PRST_ |
| Spark | SPRK_ | LS_SPRK_ | DS_SPRK_ |
| Vertica | VERT_ | LS_VERT_ | DS_VERT_ |
| Snowflake | SNWF_ | LS_SNWF_ | DS_SNWF_ |

Files

     
| Service | Abbreviation | Linked Service | Dataset |
|---|---|---|---|
| File System | FILE_ | LS_FILE_ | DS_FILE_ |
| HDFS | HDFS_ | LS_HDFS_ | DS_HDFS_ |
| Amazon S3 | AMS3_ | LS_AMS3_ | DS_AMS3_ |
| FTP | FTP_ | LS_FTP_ | DS_FTP_ |
| SFTP | SFTP_ | LS_SFTP_ | DS_SFTP_ |
| Google Cloud Storage | GCS_ | LS_GCS_ | DS_GCS_ |

Miscellaneous

     
| Service | Abbreviation | Linked Service | Dataset |
|---|---|---|---|
| Salesforce | SAFC_ | LS_SAFC_ | DS_SAFC_ |
| Generic ODBC | ODBC_ | LS_ODBC_ | DS_ODBC_ |
| Generic OData | ODAT_ | LS_ODAT_ | DS_ODAT_ |
| Web Table (table from HTML) | WEBT_ | LS_WEBT_ | DS_WEBT_ |
| REST | REST_ | LS_REST_ | DS_REST_ |
| HTTP | HTTP_ | LS_HTTP_ | DS_HTTP_ |

 

Pipeline

Even for pipelines you can define naming conventions. The most important thing is that you always start your pipeline name with PL_, followed by a logical name of your choosing. For example:

* TRANS: pipeline with transformations

* SSIS: pipeline with SSIS packages

* DATA: pipeline with data movements

* COPY: pipeline with Copy Activities
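Put together, the conventions above could be sketched as a small naming helper. This is a hypothetical function for illustration (the prefixes and abbreviations are the ones suggested in this post, not an official API):

```python
# Prefixes suggested in this post for each object type.
PREFIXES = {'linked_service': 'LS', 'dataset': 'DS', 'pipeline': 'PL'}

def build_name(kind: str, abbreviation: str, logical_name: str) -> str:
    """Compose a name such as LS_ABLB_SALES from its parts.

    Names are upper-cased because ADF names are case-insensitive,
    as noted at the top of this post.
    """
    prefix = PREFIXES[kind]
    return f"{prefix}_{abbreviation}_{logical_name}".upper()

print(build_name('linked_service', 'ABLB', 'Sales'))  # LS_ABLB_SALES
print(build_name('pipeline', 'COPY', 'DailyLoad'))    # PL_COPY_DAILYLOAD
```

Generating names from one helper like this keeps the convention consistent across projects by construction.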

 

Once again, these naming conventions are just suggestions. The most important thing is that you start using naming conventions and that you use the folder structure within the Pipelines (categories).

If you have any suggestions, feel free to leave a comment below.

9 Comments

  1. Robbin h

    And how to define triggers ;)?

    • Erwin

      Hi Robbin, my triggers are always named Daily/Weekly/Monthly followed by the name of the pipeline. This way you keep it clear. I never link multiple pipelines to the same trigger.

  2. Koen

    I’m wondering, Erwin, why do you use abbreviations?

    I’ve found that modern systems don’t really have a limit on the number of characters. In Ye Olden Days those limits enforced abbreviations; modern systems don’t have this limitation.
    I’ve also found that the context-sensitivity of abbreviations means they make reading and interpreting the names more difficult.

    Wouldn’t it make sense, in the interest of legibility, to use the full name of just about everything?

    What is the advantage of
    LS_ABLB
    over
    LinkedService_AzureBlobStorage
    ?

    I can see a point in abbreviating LinkedService to LS; I mean; it should be clear from the context that this is a linked service. But ABLB, to me, is a lot harder to read and interpret than AzureBlobStorage. The result is that doing maintenance will be more difficult on the former than on the latter.

    • Erwin

      Dear Koen,

      These naming conventions are more of a guideline. I use them this way to at least ensure that all Linked Services / Datasets / Pipelines are built in a consistent way, but also so that you can validate them with test scripts in Azure DevOps.
      The most important thing about this blog is that you do it the same everywhere; if you use a different name for this, that is of course no problem. Most abbreviations are easily translatable to the correct Azure service, especially if you work with it on a daily basis.
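The kind of validation mentioned here could be sketched as a small check over an exported Data Factory ARM template. The template shape shown and the mapping of resource types to prefixes are assumptions for illustration, not an official schema:

```python
# Assumed mapping of ARM resource types to the prefixes from this post.
PREFIX_BY_TYPE = {
    'Microsoft.DataFactory/factories/linkedservices': 'LS_',
    'Microsoft.DataFactory/factories/datasets': 'DS_',
    'Microsoft.DataFactory/factories/pipelines': 'PL_',
}

def check_names(template: dict) -> list:
    """Return (name, type) pairs that break the naming convention."""
    violations = []
    for resource in template.get('resources', []):
        prefix = PREFIX_BY_TYPE.get(resource.get('type'))
        if prefix is None:
            continue  # not an object type we validate
        # ARM resource names look like "factoryname/objectname"
        object_name = resource['name'].split('/')[-1]
        if not object_name.startswith(prefix):
            violations.append((object_name, resource['type']))
    return violations

# A tiny hand-made template fragment for demonstration.
template = {'resources': [
    {'type': 'Microsoft.DataFactory/factories/datasets',
     'name': 'myfactory/DS_ABLB_SALES'},
    {'type': 'Microsoft.DataFactory/factories/pipelines',
     'name': 'myfactory/CopyDaily'},
]}
print(check_names(template))  # [('CopyDaily', 'Microsoft.DataFactory/factories/pipelines')]
```

Run as a test step in an Azure DevOps pipeline, a check like this fails the build when an object does not follow the convention.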

  3. Manuel

    Excellent article!
    How do you suggest defining folder structures with pipelines? I’m thinking of two general areas, Staging and DWH, categorized by project/database?

    • Erwin

      Happy to hear that, Manuel. Your suggestion is a good one.

      Currently I’m using the structure below. You can still create a new sub-folder per project, but you probably also have shared resources across your different projects, so my advice is to add a Shared Project folder as well.

      * 01.Datalake
        * 01.Control
        * 02.Command
        * 03.Execute
      * 02.Deltalake
        * 01.Control
        * 02.Command
        * 03.Execute
      * 03.DataStore
        * 01.Control
        * 02.Command
        * 03.Execute
      * Work In Progress

      Let me know your findings.

  4. Hennie

    Very well written Erwin! Keep them coming. I miss Snowflake 😉 LS_SNOW_?

    • Erwin

      Good one Hennie, I need to take some time to review all the new sources.


