Comparative Analysis of Linked Services in Azure Data Factory and Azure Synapse Analytics

Azure Data Factory (ADF) and Azure Synapse Analytics (ASA) both utilize linked services to establish connections to external data sources, compute resources, and other Azure services. While they share similarities, there are key differences in their implementation, use cases, and integration capabilities.

When you create linked services, some of them look similar but serve slightly different purposes and have some key differences: for example, “Databricks” and “Databricks Delta Lake”, or “REST” and “HTTP”. Below are side-by-side comparisons of the differences.

Linked Service to Databricks vs. Linked Service to Databricks Delta Lake

Here’s a side-by-side comparison between Databricks and Databricks Delta Lake Linked Services in Azure Data Factory (ADF):

Key differences and when to use which:

  • Databricks Linked Service is for connecting to the compute environment (jobs, notebooks) of Databricks.
  • Databricks Delta Lake Linked Service is for connecting directly to Delta Lake data storage (tables/files).
| Feature | Databricks Linked Service | Databricks Delta Lake Linked Service |
| --- | --- | --- |
| Purpose | Connect to an Azure Databricks workspace to run jobs or notebooks. | Connect to Delta Lake tables within Azure Databricks. |
| Primary Use Case | Run notebooks, Python/Scala/Spark scripts, and perform data processing tasks on Databricks. | Read/write data from/to Delta Lake tables for data ingestion or extraction. |
| Connection Type | Connects to the compute environment of Databricks (notebooks, clusters, jobs). | Connects to data stored in Delta Lake format (structured data files). |
| Data Storage | Not focused on specific data formats; used for executing Databricks jobs. | Specifically used for interacting with Delta Lake tables (backed by Parquet files). |
| ACID Transactions | Does not inherently support ACID transactions (although Databricks jobs can handle them in notebooks). | Delta Lake natively supports ACID transactions (insert, update, delete). |
| Common Activities | Running Databricks notebooks; submitting Spark jobs; data transformation using PySpark, Scala, etc. | Reading from or writing to Delta Lake; ingesting or querying large datasets with Delta Lake's ACID support. |
| Input/Output | Input/output via Databricks notebooks, clusters, or jobs. | Input/output via Delta Lake tables/files (with versioning and schema enforcement). |
| Data Processing | Focus on data processing (ETL/ELT) using Databricks compute power. | Focus on data management within the Delta Lake storage layer, including handling updates and deletes. |
| When to Use | When you need to orchestrate and run Databricks jobs for data processing. | When you need to read or write data stored specifically in Delta Lake, or when managing big data with ACID properties. |
| Integration in ADF Pipelines | Execute Databricks notebook activities or custom scripts in ADF pipelines. | Access Delta Lake as a data source/destination in ADF pipelines. |
| Supported Formats | Any format, depending on the jobs or scripts running in Databricks. | Primarily the Delta Lake format (which is based on Parquet). |
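To make the difference concrete, here is a minimal sketch of what each linked service definition might look like in ADF JSON. The names, workspace URL, cluster ID, and token are placeholders, not real resources; in practice the access token would usually come from a Key Vault reference rather than an inline secret.

```json
{
  "name": "LS_AzureDatabricks",
  "properties": {
    "type": "AzureDatabricks",
    "typeProperties": {
      "domain": "https://adb-1234567890123456.7.azuredatabricks.net",
      "existingClusterId": "0123-456789-abcde123",
      "accessToken": {
        "type": "SecureString",
        "value": "<databricks-personal-access-token>"
      }
    }
  }
}
```

```json
{
  "name": "LS_AzureDatabricksDeltaLake",
  "properties": {
    "type": "AzureDatabricksDeltaLake",
    "typeProperties": {
      "domain": "https://adb-1234567890123456.7.azuredatabricks.net",
      "clusterId": "0123-456789-abcde123",
      "accessToken": {
        "type": "SecureString",
        "value": "<databricks-personal-access-token>"
      }
    }
  }
}
```

Note that the two definitions look almost identical; the difference is the `type` (`AzureDatabricks` vs. `AzureDatabricksDeltaLake`), which determines whether the connection is used to run compute workloads or to read/write Delta Lake tables in Copy activities.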

REST Linked Service and HTTP Linked Service

In Azure Data Factory (ADF), both the REST and HTTP linked services are used to connect to external services, but they serve different purposes and have distinct configurations.

When to use which?

  • REST Linked Service: Use it when working with APIs that require advanced authentication, return paginated JSON data, or have dynamic query/header needs.
  • HTTP Linked Service: Use it for simpler tasks like downloading files from a public or basic-authenticated HTTP server.
| Feature | REST Linked Service | HTTP Linked Service |
| --- | --- | --- |
| Purpose | Interact with RESTful APIs | General-purpose HTTP access |
| Authentication Methods | AAD, Service Principal, etc. | Basic, Anonymous |
| Pagination Support | Yes | No |
| Dynamic Headers/Params | Yes | Limited |
| File Access | No | Yes |
| Data Format | JSON | File or raw data |
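As an illustration, here are hedged sketches of the two linked service definitions in ADF JSON. The URLs, service principal values, and names are placeholders I've invented for the example; the REST sketch uses service principal authentication, while the HTTP sketch uses anonymous access to a file server.

```json
{
  "name": "LS_RestService",
  "properties": {
    "type": "RestService",
    "typeProperties": {
      "url": "https://api.example.com/v1",
      "enableServerCertificateValidation": true,
      "authenticationType": "AadServicePrincipal",
      "servicePrincipalId": "<application-id>",
      "servicePrincipalKey": {
        "type": "SecureString",
        "value": "<client-secret>"
      },
      "tenant": "<tenant-id>",
      "aadResourceId": "<aad-resource-uri>"
    }
  }
}
```

```json
{
  "name": "LS_HttpServer",
  "properties": {
    "type": "HttpServer",
    "typeProperties": {
      "url": "https://files.example.com",
      "authenticationType": "Anonymous",
      "enableServerCertificateValidation": true
    }
  }
}
```

The richer `typeProperties` on the REST side reflect the table above: the REST connector carries the authentication machinery needed for secured APIs, while the HTTP connector only needs a base URL and a simple authentication type for file-style downloads.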

Please do not hesitate to contact me at William . chen @ mainri.ca if you have any questions.

(remove all spaces from the email address 😊)