Azure Data Factory and Synapse Analytics

Azure Data Factory and Synapse Analytics

Create External Data Sources in Synapse Serverless SQL

Step by step using Synapse Studio UI or Script to create External Data Source more …

ADF or ASA lookup filter Modified date query for SharePoint Online List

This article is focused on ADF or ASA lookup activity filter modified date, type, is Current version or not etc. query for SharePoint Online List. more …

Metadata driven full solution for Azure Data Factory or Synapse incrementally copy data from SharePoint Online sink to ADSL Gen2

Provide a fully Metadata driven solution about using Azure Data Factory (ADF) or Synapse Analytics (ASA) incrementally copy multiple data sets one time from SharePoint Online (SPO) then sink them to ADSL Gen2. more …

ADF activities failure vs pipeline failure and pipeline error handling logical mechanism

This article talks when activities failed on a pipeline, the pipeline status will be, failure or success, and error handing logical mechanism. more ...

Comparing the use of wildcards in the Copy Activity of Azure Data Factory with the Get Metadata activity for managing multiple file copies

Both the Copy Activity with wildcards (*.*) and the Get Metadata activity for retrieving a file list serve to work with multiple files copy or move. more …

ADF or ASA lookup Delta Table in Databricks Unity Catalog

Using ADF to look up data from a Delta table in Databricks’ Unity Catalog involves several steps more …

Comparative Analysis of Linked Services in Azure Data Factory and Azure Synapse Analytics

In ADF or AASA, when you create Linked Services, some of them are slimier. But have slightly different purposes and have some key differences. e.g.
“REST Linked Services” & “HTTP Linked Services”
“Linked Services to databricks” & “Linked Services to databricks delta lake

more …

Azure Data Factory or Synapse Copy Activity with File System

In Azure Data Factory (ADF) or Synapse, using Copy Activity with a File System as a source or sink is common when dealing with on-premises more …

Get Metadata activity in ADF or ASA

The Get Metadata activity in Azure Data Factory (ADF) or Azure Synapse Analytics (ASA) is used to retrieve metadata about a file, folder, or database. This activity is particularly useful when you need to dynamically determine properties like file name, size, structure, or existence and use them in subsequent pipeline activities.
more …

Building Slowly Changing Dimensions Type 2 in Azure Data Factory and Synapse

Within the context of enterprise data warehousing, the effective management of historical data is essential for supporting informed business decision-making. Slowly Changing Dimension (SCD) Type 2 is a widely adopted technique for addressing changes in data over time.
more …

Dynamic ETL Mapping in Azure Data Factory/Synapse Analytics: Source-to-Target Case Study Implementation (1)

Scenario: the source schema may have new or missing columns, or the destination schema may have columns with different names or might lack columns to accommodate new incoming source fields.

  • Source: CSV, Schema varies between executions (columns may be missing, reordered, or new).
  • Destination: Database, SQL DB,
  • Requirement: Dynamically handle variations in the source data, map data to a consistent destination schema, and gracefully manage missing columns by assigning default values to them. Add new columns to the target database table if they appear in the incoming source data.
  • Problem: emp_ID and dep_id missed from source data;
    schema name are not exactly same, name <–> emp_Name;
    Target DB table does not have the column “state”

more …

Change Data Capture with Azure Data Factory and Synapse Analytics

ADF new top-level function Chang Data Capture (CDC) is the most effective way to read the source data that has changed since the last time the pipeline ran, rather than always querying an entire dataset on each run.

more …

Data Flow

Azure Data Factory (ADF) Mapping Data Flow offers a variety of transformations to design and implement complex data transformation workflows.

Visual Summary of Categories

CategoryTransformations
Schema ModifiersSelect, Alter Row, Derived Column, Pivot, Unpivot, Surrogate Key
Row ModifiersFilter, Conditional Split, Sort, Window
JoinsJoin, Lookup
AggregationsAggregate
Data QualityAssert, Exists
Set OperationsUnion, Intersect, Except
Flow ControlsCache, Alter Row
Source and SinkSource, Sink
Data ProcessingFlatten, Cross Join
Aggregate Transformation

The Aggregate transformation in Azure Data Factory (ADF) Data Flows is a powerful tool for performing calculations on groups of data. It’s analogous to the GROUP BY clause in SQL, allowing you to summarize data based on one or more grouping columns.

more …

Alter Row Transformation

The Alter Row Transformation is used to set row-level policies for data being written to a sink. This transformation is particularly useful when you are working with slowly changing dimensions (SCD) or when you need to synchronize data between source and sink systems.

more …

Using Exists Transformation for Data Comparison in Azure Data Factory/Synapse

The Exists transformation is a row filtering transformation that checks whether your data exists in another source or stream. The output stream includes all rows in the left stream that either exist or don’t exist in the right stream.

more …