Azure Data Factory (ADF) and Azure Synapse Analytics (ASA) both utilize linked services to establish connections to external data sources, compute resources, and other Azure services. While they share similarities, there are key differences in their implementation, use cases, and integration capabilities.
When you create linked services, some of them look similar but serve slightly different purposes, for example “Databricks” and “Databricks Delta Lake”, or “REST” and “HTTP”. The side-by-side comparisons below highlight the differences.
Databricks Linked Service and Databricks Delta Lake Linked Service
Here’s a side-by-side comparison between Databricks and Databricks Delta Lake Linked Services in Azure Data Factory (ADF):
Key differences and when to use which:
- Databricks Linked Service is for connecting to the compute environment (jobs, notebooks) of Databricks.
- Databricks Delta Lake Linked Service is for connecting directly to Delta Lake data storage (tables/files).
Feature | Databricks Linked Service | Databricks Delta Lake Linked Service |
---|---|---|
Purpose | Connect to an Azure Databricks workspace to run jobs or notebooks. | Connect to Delta Lake tables within Azure Databricks. |
Primary Use Case | Run notebooks, Python/Scala/Spark scripts, and perform data processing tasks on Databricks. | Read/write data from/to Delta Lake tables for data ingestion or extraction. |
Connection Type | Connects to the compute environment of Databricks (notebooks, clusters, jobs). | Connects to data stored in Delta Lake format (structured data files). |
Data Storage | Not focused on specific data formats; used for executing Databricks jobs. | Specifically used for interacting with Delta Lake tables (backed by Parquet files). |
ACID Transactions | Does not inherently support ACID transactions (although Databricks jobs can handle them in notebooks). | Delta Lake supports ACID transactions (insert, update, delete) natively. |
Common Activities | – Running Databricks notebooks. – Submitting Spark jobs. – Data transformation using PySpark, Scala, etc. | – Reading from or writing to Delta Lake. – Ingesting or querying large datasets with Delta Lake’s ACID support. |
Input/Output | Input/output via Databricks notebooks, clusters, or jobs. | Input/output via Delta Lake tables/files (with versioning and schema enforcement). |
Data Processing | Focus on data processing (ETL/ELT) using Databricks compute power. | Focus on data management within Delta Lake storage layer, including handling updates and deletes. |
When to Use | – When you need to orchestrate and run Databricks jobs for data processing. | – When you need to read or write data specifically stored in Delta Lake. – When managing big data with ACID properties. |
Integration in ADF Pipelines | Execute Databricks notebook activities or custom scripts in ADF pipelines. | Access Delta Lake as a data source/destination in ADF pipelines. |
Supported Formats | Any format depending on the jobs or scripts running in Databricks. | Primarily deals with Delta Lake format (which is based on Parquet). |
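To make the difference concrete, here is a minimal sketch of how the two linked services might look as JSON definitions, built as Python dictionaries. The linked service names, the workspace URL, and the token and cluster placeholders are assumptions for illustration only. The `type` values and property names follow the connector JSON as I understand it, so please verify them against the connector documentation before relying on them.

```python
import json


def secure_string(value: str) -> dict:
    """Wrap a secret in the SecureString shape used in linked service JSON."""
    return {"type": "SecureString", "value": value}


# Placeholder values (assumptions, not real resources).
WORKSPACE_URL = "https://adb-0000000000000000.0.azuredatabricks.net"
ACCESS_TOKEN = "<access token>"
CLUSTER_ID = "<cluster id>"

# Compute-oriented: referenced by activities that run notebooks or jobs.
databricks_compute_ls = {
    "name": "ls_databricks_compute",  # hypothetical name
    "properties": {
        "type": "AzureDatabricks",
        "typeProperties": {
            "domain": WORKSPACE_URL,
            "accessToken": secure_string(ACCESS_TOKEN),
            "existingClusterId": CLUSTER_ID,
        },
    },
}

# Data-oriented: referenced by datasets that read or write Delta tables.
databricks_delta_ls = {
    "name": "ls_databricks_delta_lake",  # hypothetical name
    "properties": {
        "type": "AzureDatabricksDeltaLake",
        "typeProperties": {
            "domain": WORKSPACE_URL,
            "accessToken": secure_string(ACCESS_TOKEN),
            "clusterId": CLUSTER_ID,
        },
    },
}

for linked_service in (databricks_compute_ls, databricks_delta_ls):
    print(json.dumps(linked_service, indent=2))
```

Both definitions point at the same workspace. What changes is the `type` and what the pipeline does with it: the first is a place to run code, the second is a place to read and write data.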
REST Linked Service and HTTP Linked Service
In Azure Data Factory (ADF), both the REST and HTTP linked services are used to connect to external services, but they serve different purposes and have distinct configurations.
When to use which?
- REST Linked Service: Use it when working with APIs that require advanced authentication, return paginated JSON data, or have dynamic query/header needs.
- HTTP Linked Service: Use it for simpler tasks like downloading files from a public or basic-authenticated HTTP server.
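If "paginated JSON data" is unfamiliar, the sketch below shows the idea in plain Python: keep following a "next page" link until there is none. This is only an illustration of the concept, not how ADF implements it. The `nextLink` field, the URLs, and the in-memory `FAKE_PAGES` data are all made up for the example. In ADF you would normally configure this through pagination rules on the REST source rather than write code.

```python
from typing import Callable, Iterator, Optional

# Fake paginated API: each "URL" maps to one JSON page (hypothetical data).
FAKE_PAGES = {
    "/items?page=1": {"value": [1, 2], "nextLink": "/items?page=2"},
    "/items?page=2": {"value": [3, 4], "nextLink": "/items?page=3"},
    "/items?page=3": {"value": [5], "nextLink": None},
}


def fetch_page(url: str) -> dict:
    """Stand-in for an HTTP GET that returns parsed JSON."""
    return FAKE_PAGES[url]


def read_all(first_url: str,
             fetch: Callable[[str], dict] = fetch_page) -> Iterator[int]:
    """Yield every item, following nextLink until it is missing or empty."""
    url: Optional[str] = first_url
    while url:
        page = fetch(url)
        yield from page["value"]
        url = page.get("nextLink")


items = list(read_all("/items?page=1"))
print(items)  # [1, 2, 3, 4, 5]
```

A plain HTTP download, by contrast, is a single request and a single response, which is why the HTTP linked service fits file downloads better than multi-page APIs.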
Feature | REST Linked Service | HTTP Linked Service |
---|---|---|
Purpose | Interact with RESTful APIs | General-purpose HTTP access |
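Authentication Methods | Anonymous, Basic, Microsoft Entra ID (AAD) service principal, OAuth2 client credential, managed identity | Anonymous, Basic, Digest, Windows, client certificate |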
Pagination Support | Yes | No |
Dynamic Headers/Params | Yes | Limited |
File Access | No | Yes |
Data Format | JSON | File or raw data |
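As a rough sketch, the two linked services might be defined as follows, again written as Python dictionaries that serialize to the JSON definition. The names, the `example.com` URLs, and all credential placeholders are assumptions for illustration. The `type` values and property names reflect my understanding of the connector JSON and should be checked against the connector documentation.

```python
import json

# REST linked service using service principal authentication (placeholders).
rest_ls = {
    "name": "ls_rest_api",  # hypothetical name
    "properties": {
        "type": "RestService",
        "typeProperties": {
            "url": "https://api.example.com/v1/",
            "enableServerCertificateValidation": True,
            "authenticationType": "AadServicePrincipal",
            "servicePrincipalId": "<application (client) id>",
            "servicePrincipalKey": {
                "type": "SecureString",
                "value": "<client secret>",
            },
            "tenant": "<tenant id>",
            "aadResourceId": "<resource the token is requested for>",
        },
    },
}

# HTTP linked service using Basic authentication (placeholders).
http_ls = {
    "name": "ls_http_files",  # hypothetical name
    "properties": {
        "type": "HttpServer",
        "typeProperties": {
            "url": "https://files.example.com/",
            "authenticationType": "Basic",
            "userName": "<user name>",
            "password": {"type": "SecureString", "value": "<password>"},
        },
    },
}

for linked_service in (rest_ls, http_ls):
    print(json.dumps(linked_service, indent=2))
```

In a real factory, secrets like the client secret and password would normally come from Azure Key Vault rather than being written inline.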
If you have any questions, please do not hesitate to contact me at William . chen @ mainri.ca
(remove all spaces from the email address 😊)