
Azure Data Factory Naming Conventions

Naming Conventions

As more and more projects use Azure Data Factory, it becomes increasingly important to apply a correct naming convention. Naming conventions create recognizable results across different projects, and they also create clarity for your colleagues. In addition, they make it easier to add these projects to other services such as Managed Services, Azure DevOps, etc., because standards are used.

To get started with these naming conventions, I have made a list of suggestions for the most common Linked Services. The list is not exhaustive, but it does provide guidance for new Linked Services.

There are a few standard naming conventions that apply to all elements in Azure Data Factory.

* Names are case-insensitive (not case-sensitive). For that reason, I only use CAPITALS.

* Maximum number of characters in a table name: 260.

* All object names must begin with a letter, a number, or an underscore (_).

* The following characters are not allowed: ".", "+", "?", "/", "<", ">", "*", "%", "&", ":", "\"

These rules are also defined in the official Azure documentation.
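To make these rules concrete, here is a minimal Python sketch that checks a proposed object name against the rules listed above. The function name and rule set are my own reading of that list, not an official SDK check:

```python
import re

# Characters that are not allowed in ADF object names (from the list above).
FORBIDDEN = set('.+?/<>*%&:\\')

def is_valid_adf_name(name: str, max_length: int = 260) -> bool:
    """Check a proposed ADF object name against the rules listed above."""
    if not name or len(name) > max_length:
        return False
    # Must begin with a letter, a number, or an underscore.
    if not re.match(r'[A-Za-z0-9_]', name[0]):
        return False
    # None of the forbidden characters may appear anywhere in the name.
    return not any(ch in FORBIDDEN for ch in name)

print(is_valid_adf_name('LS_ABLB_MYSTORAGE'))  # True
print(is_valid_adf_name('LS.ABLB.MYSTORAGE'))  # False: '.' is not allowed
```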

     
| Service | Abbreviation | Linked Service | Dataset |
|---|---|---|---|
| Azure Blob Storage | ABLB_ | LS_ABLB_ | DS_ABLB_ |
| Azure Cosmos DB SQL API | ACSA_ | LS_ACSA_ | DS_ACSA_ |
| Azure Cosmos DB MongoDB API | ACMA_ | LS_ACMA_ | DS_ACMA_ |
| Azure Data Explorer | ADEX_ | LS_ADEX_ | DS_ADEX_ |
| Azure Data Lake Storage Gen1 | ADLS_ | LS_ADLS_ | DS_ADLS_ |
| Azure Data Lake Storage Gen2 | ADLS_ | LS_ADLS_ | DS_ADLS_ |
| Azure Database for MariaDB | AMDB_ | LS_AMDB_ | DS_AMDB_ |
| Azure Database for MySQL | AMYS_ | LS_AMYS_ | DS_AMYS_ |
| Azure Database for PostgreSQL | APOS_ | LS_APOS_ | DS_APOS_ |
| Azure File Storage | AFIL_ | LS_AFIL_ | DS_AFIL_ |
| Azure Search | ASER_ | LS_ASER_ | DS_ASER_ |
| Azure SQL Database | ASQL_ | LS_ASQL_ | DS_ASQL_ |
| Azure SQL Database Managed Instance | ASQM_ | LS_ASQM_ | DS_ASQM_ |
| Azure Synapse Analytics (formerly Azure SQL DW) | ASDW_ | LS_ASDW_ | DS_ASDW_ |
| Azure Table Storage | ATBL_ | LS_ATBL_ | DS_ATBL_ |
| Azure Databricks | ADBR_ | LS_ADBR_ | DS_ADBR_ |
| Azure Cognitive Search | ACGS_ | LS_ACGS_ | DS_ACGS_ |

Database

     
| Service | Abbreviation | Linked Service | Dataset |
|---|---|---|---|
| SQL Server | MSQL_ | LS_MSQL_ | DS_MSQL_ |
| Oracle | ORAC_ | LS_ORAC_ | DS_ORAC_ |
| Oracle Eloqua | ORAE_ | LS_ORAE_ | DS_ORAE_ |
| Oracle Responsys | ORAR_ | LS_ORAR_ | DS_ORAR_ |
| Oracle Service Cloud | ORSC_ | LS_ORSC_ | DS_ORSC_ |
| MySQL | MYSQ_ | LS_MYSQ_ | DS_MYSQ_ |
| DB2 | DB2_ | LS_DB2_ | DS_DB2_ |
| Teradata | TDAT_ | LS_TDAT_ | DS_TDAT_ |
| PostgreSQL | POST_ | LS_POST_ | DS_POST_ |
| Sybase | SYBA_ | LS_SYBA_ | DS_SYBA_ |
| Cassandra | CASS_ | LS_CASS_ | DS_CASS_ |
| MongoDB | MONG_ | LS_MONG_ | DS_MONG_ |
| Amazon Redshift | ARED_ | LS_ARED_ | DS_ARED_ |
| SAP Business Warehouse | SAPW_ | LS_SAPW_ | DS_SAPW_ |
| SAP ECC | SAPE_ | LS_SAPE_ | DS_SAPE_ |
| SAP Cloud for Customer (C4C) | SAPC_ | LS_SAPC_ | DS_SAPC_ |
| SAP Table | SAPT_ | LS_SAPT_ | DS_SAPT_ |
| SAP HANA | HANA_ | LS_HANA_ | DS_HANA_ |
| Drill | DRILL_ | LS_DRILL_ | DS_DRILL_ |
| Google BigQuery | GBQ_ | LS_GBQ_ | DS_GBQ_ |
| Greenplum | GRPL_ | LS_GRPL_ | DS_GRPL_ |
| HBase | HBAS_ | LS_HBAS_ | DS_HBAS_ |
| Hive | HIVE_ | LS_HIVE_ | DS_HIVE_ |
| Apache Impala | IMPA_ | LS_IMPA_ | DS_IMPA_ |
| Informix | INMI_ | LS_INMI_ | DS_INMI_ |
| MariaDB | MDB_ | LS_MDB_ | DS_MDB_ |
| Microsoft Access | MACS_ | LS_MACS_ | DS_MACS_ |
| Netezza | NETZ_ | LS_NETZ_ | DS_NETZ_ |
| Phoenix | PHNX_ | LS_PHNX_ | DS_PHNX_ |
| Presto (Preview) | PRST_ | LS_PRST_ | DS_PRST_ |
| Spark | SPRK_ | LS_SPRK_ | DS_SPRK_ |
| Vertica | VERT_ | LS_VERT_ | DS_VERT_ |

Files

     
| Service | Abbreviation | Linked Service | Dataset |
|---|---|---|---|
| File System | FILE_ | LS_FILE_ | DS_FILE_ |
| HDFS | HDFS_ | LS_HDFS_ | DS_HDFS_ |
| Amazon S3 | AMS3_ | LS_AMS3_ | DS_AMS3_ |
| FTP | FTP_ | LS_FTP_ | DS_FTP_ |
| SFTP | SFTP_ | LS_SFTP_ | DS_SFTP_ |
| Google Cloud Storage | GCS_ | LS_GCS_ | DS_GCS_ |

Miscellaneous

     
| Service | Abbreviation | Linked Service | Dataset |
|---|---|---|---|
| Salesforce | SAFC_ | LS_SAFC_ | DS_SAFC_ |
| Generic ODBC | ODBC_ | LS_ODBC_ | DS_ODBC_ |
| Generic OData | ODAT_ | LS_ODAT_ | DS_ODAT_ |
| Web Table (table from HTML) | WEBT_ | LS_WEBT_ | DS_WEBT_ |
| REST | REST_ | LS_REST_ | DS_REST_ |
| HTTP | HTTP_ | LS_HTTP_ | DS_HTTP_ |
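To show how the prefixes from the tables above compose into full names, here is a minimal sketch. The mapping excerpt and helper functions are hypothetical, purely to illustrate the LS_<ABBR>_<NAME> / DS_<ABBR>_<NAME> pattern:

```python
# A small excerpt of the abbreviation tables above (hypothetical helper).
ABBREVIATIONS = {
    'Azure Blob Storage': 'ABLB',
    'Azure SQL Database': 'ASQL',
    'SQL Server': 'MSQL',
    'Amazon S3': 'AMS3',
}

def linked_service_name(service: str, suffix: str) -> str:
    """Compose a Linked Service name: LS_<ABBREVIATION>_<SUFFIX>, in capitals."""
    return f"LS_{ABBREVIATIONS[service]}_{suffix}".upper()

def dataset_name(service: str, suffix: str) -> str:
    """Compose a Dataset name: DS_<ABBREVIATION>_<SUFFIX>, in capitals."""
    return f"DS_{ABBREVIATIONS[service]}_{suffix}".upper()

print(linked_service_name('Azure Blob Storage', 'SalesLanding'))  # LS_ABLB_SALESLANDING
print(dataset_name('Azure SQL Database', 'DimCustomer'))          # DS_ASQL_DIMCUSTOMER
```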

 

Pipeline

You can define naming conventions for Pipelines as well. The most important thing is that you always start your pipeline name with PL_, followed by a logical name of your choosing; a short naming sketch follows the list below. You can for example use:

* TRANS: pipeline with transformations

* SSIS: pipeline with SSIS packages

* DATA: pipeline with data movements

* COPY: pipeline with Copy activities
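As a sketch of what such pipeline names could look like, the snippet below checks names against the PL_<CATEGORY>_<LOGICAL NAME> pattern. The categories come from the list above; the example names are hypothetical:

```python
import re

# Pipeline categories suggested above (extend with your own).
CATEGORIES = ('TRANS', 'SSIS', 'DATA', 'COPY')

# Expected shape: PL_<CATEGORY>_<LOGICAL NAME>, all in capitals.
PIPELINE_PATTERN = re.compile(rf"^PL_({'|'.join(CATEGORIES)})_[A-Z0-9_]+$")

for name in ('PL_COPY_SALES_TO_DATALAKE', 'PL_TRANS_DIM_CUSTOMER', 'CopySales'):
    print(name, '->', bool(PIPELINE_PATTERN.match(name)))
```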

 

Once again, these naming conventions are just suggestions. The most important thing is that you start using naming conventions in the first place, and that you use the folder structure within the Pipelines (categories).

If you have suggestions just let me know by leaving a comment below.


7 Comments

  1. Robbin h

    And how to define triggers ;)?

    • Erwin

Hi Robbin, my triggers are always named Daily/Weekly/Monthly followed by the name of the Pipeline, for example Daily_PL_COPY_SALES for a pipeline called PL_COPY_SALES. This way you keep it clear. I never link multiple Pipelines to the same Trigger.

  2. Koen

    I’m wondering, Erwin, why do you use abbreviations?

I’ve found that modern systems don’t really have a limit on the number of characters. In Ye Olden Days those limits forced abbreviations; modern systems don’t have this limitation.
    I’ve also found that the context-sensitivity of abbreviations means they make reading and interpreting the names more difficult.

    Wouldn’t it make sense, in the interest of legibility, to use the full name of just about everything?

    What is the advantage of
    LS_ABLB
    over
    LinkedService_AzureBlobStorage
    ?

I can see a point in abbreviating LinkedService to LS; I mean, it should be clear from the context that this is a linked service. But ABLB, to me, is a lot harder to read and interpret than AzureBlobStorage. The result is that doing maintenance will be more difficult on the former than on the latter.

    • Erwin

      Dear Koen,

These naming conventions are more of a guideline. I use them in this way to at least ensure that all Linked Services / Datasets / Pipelines are built in a consistent way, but also so that you can validate them with test scripts in Azure DevOps (a sketch of such a check follows below).
The most important thing about this blog is that you do it the same way everywhere; if you use a different name for this, that is of course no problem. Most abbreviations are easily translatable to the correct Azure service, especially if you work with it on a daily basis.
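To illustrate the kind of test script mentioned here, below is a minimal sketch. It assumes the Data Factory is connected to a Git repository, where resources are stored as JSON files in folders such as pipeline/, dataset/ and linkedService/; the folder names, paths, and prefixes are assumptions, so adapt them to your own repository:

```python
import json
from pathlib import Path

# Expected prefix per resource folder (folder names/prefixes are assumptions).
PREFIXES = {'linkedService': 'LS_', 'dataset': 'DS_', 'pipeline': 'PL_'}

def naming_violations(repo_root: str) -> list:
    """Return names of ADF resources in a Git repo that miss their prefix."""
    violations = []
    for folder, prefix in PREFIXES.items():
        for file in Path(repo_root, folder).glob('*.json'):
            name = json.loads(file.read_text())['name']
            if not name.startswith(prefix):
                violations.append(name)
    return violations

if __name__ == '__main__':
    bad = naming_violations('.')
    # Fail the DevOps build step when any resource violates the convention.
    assert not bad, f'Naming violations found: {bad}'
```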

  3. Manuel

    Excellent article!
How do you suggest defining folder structures for pipelines? I’m thinking of two general areas, Staging and DWH, categorized by project/database?

    • Erwin

Happy to hear that, Manuel. Your suggestion is a good one.

Currently I’m using the structure below, and you can still create a new subfolder per project. But you probably also have shared resources across your different projects, so my advice is to add a Shared Project folder as well.

01.Datalake
    01.Control
    02.Command
    03.Execute
02.Deltalake
    01.Control
    02.Command
    03.Execute
03.DataStore
    01.Control
    02.Command
    03.Execute
Work In Progress

Let me know your findings.


