

This article is part of a series dedicated to dynamic ETL Source-to-Target Mapping (STM) solutions, covering both batch and near real-time use cases. The series will explore various mapping scenarios, including one-to-many, many-to-one, and many-to-many relationships, with implementations provided throughout.
You will need privileges to create and alter the metadata table, and to alter the target table.
Scenario
In this article, I will focus on a scenario where the source schema may have new or missing columns, or where the destination schema may have differently named columns or may lack columns for new incoming source fields.
Requirement:
Dynamically handle source variations, map data to a consistent destination schema, and handle missing columns gracefully: give missing columns a default value, and add new columns to the target DB table when new source fields arrive.
Source:
CSV. The schema varies between executions (columns may be missing, reordered, or new).
Current source column names: name, age, gander, and state.
Destination:
SQL database table.
Column names: emp_id, emp_name, emp_age, gender, dep_id.
Problem:
- emp_id and dep_id are missing from the source data.
- Column names are not exactly the same:
name <—-> emp_name
age <—-> emp_age
gander <—-> gender
- The target DB table does not have the column “state”.
Source data:
name age gander state
Bill 32 M NY
Mary 34 F CA
Tom 23 M FL
Jim 26 M CA
Afton_Taborek 45 F FL
Amélie_Gilker 43 M NY
Target SQL DB table: emp
CREATE TABLE [dbo].[emp](
[emp_id] [nvarchar](50) NULL,
[emp_name] [nvarchar](50) NULL,
[emp_age] [nvarchar](50) NULL,
[gender] [nvarchar](50) NULL,
[dep_id] [nvarchar](50) NULL
)
col_name data_type allow_null
emp_id nvarchar(50) null
emp_name nvarchar(50) null
emp_age nvarchar(50) null
gender nvarchar(50) null
dep_id nvarchar(50) null
Key components and steps of the solution:
- Create a metadata table to hold the STM plan.
- A Get Metadata activity retrieves the source data schema – column names and data types.
- Reset all fields’ active status to False.
- For each incoming source field, look it up in the metadata table and maintain its activity status, field mapping, and target column.
- Retrieve every active field’s mapping plan from the metadata table and generate the complete mapping plan.
- A Copy activity applies the mapping plan through the “Add dynamic content” box of its Mapping tab.
Solution

Step 1: Create a metadata table to hold the mapping plan
CREATE TABLE [dbo].[metadata](
[source_filename] [varchar](max) NULL,
[src_col] [varchar](50) NULL,
[src_dataType] [varchar](50) NULL,
[src_col_createdate] [datetime] DEFAULT getdate() NULL,
[src_col_activity] [bit] NULL,
[destination_schema] [varchar](50) NULL,
[destination_table] [varchar](50) NULL,
[dst_col] [varchar](50) NULL,
[dst_dataType] [varchar](50) NULL,
[dst_createdate] [datetime] DEFAULT getdate() NULL,
[dst_col_activity] [bit] NULL,
[mapping] [varchar](max) NULL
)
The [mapping] column will hold a JSON-style string that indicates which target column this source column maps to. Its pattern is:
{
"source": {
"name": "Field/column name",
"type": "column generalized dataType",
"physicalType":"coming column's Native dataType"
},
"sink": {
"name": "Table column name",
"type": "coming column's generalized dataType",
"physicalType":"column's target database Native dataType"
}
}
“name”
The field’s name in the file, or the column’s name in the DB table.
“type”
Logical Data Type. The abstract or generalized type used by Azure Data Factory (ADF) to interpret data regardless of the underlying system or format.
For example, string, integer, double etc.
“physicalType”
The specific type defined by the database or file system where the data resides.
For example, VARCHAR, NVARCHAR, CHAR, INT, FLOAT, NUMERIC(18,10), TEXT etc. in database
Each column gets a source-to-sink mapping plan of this shape; we will concatenate all the columns’ mapping plans to generate the complete Source-to-Target Mapping (STM) plan.
Step 2: Create mapping plans for known fields
For each known field or column, create a Source-to-Target mapping plan and save it in the “mapping” column of the metadata table as a JSON-style string.
# id field mapping plan
{
"source": {
"name": "id",
"type": "String",
"physicalType": "String"
},
"sink": {
"name": "emp_id",
"type": "String",
"physicalType": "nvarchar(max)"
}
}
# name field mapping plan
{
"source": {
"name": "name",
"type": "String",
"physicalType": "String"
},
"sink": {
"name": "emp_name",
"type": "String",
"physicalType": "nvarchar(max)"
}
}
# age field mapping plan
{
"source": {
"name": "age",
"type": "String",
"physicalType": "String"
},
"sink": {
"name": "emp_age",
"type": "String",
"physicalType": "nvarchar(max)"
}
}
# gander field mapping plan
{
"source": {
"name": "gander",
"type": "String",
"physicalType": "String"
},
"sink": {
"name": "gender",
"type": "String",
"physicalType": "nvarchar(max)"
}
}
# dep_id field mapping plan
{
"source": {
"name": "dep_id",
"type": "String",
"physicalType": "String"
},
"sink": {
"name": "dep_id",
"type": "String",
"physicalType": "nvarchar(max)"
}
}

We will utilize the column mapping plans to generate a comprehensive “copy activity” mapping plan.
For any new or unknown fields that may arise, we will address them in subsequent steps.
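As a sketch, the known plans can be seeded into the metadata table ahead of time (assuming the table from Step 1 and “name.csv” as the source file name; abridged to three of the five fields):
INSERT INTO [dbo].[metadata]
(source_filename, src_col, src_dataType, src_col_activity,
destination_schema, destination_table, dst_col, dst_dataType, dst_col_activity, mapping)
VALUES
('name.csv', 'name', 'String', 0, 'dbo', 'emp', 'emp_name', 'NVARCHAR', 1,
'{"source":{"name":"name","type":"String","physicalType":"String"},"sink":{"name":"emp_name","type":"String","physicalType":"nvarchar(max)"}}'),
('name.csv', 'age', 'String', 0, 'dbo', 'emp', 'emp_age', 'NVARCHAR', 1,
'{"source":{"name":"age","type":"String","physicalType":"String"},"sink":{"name":"emp_age","type":"String","physicalType":"nvarchar(max)"}}'),
('name.csv', 'gander', 'String', 0, 'dbo', 'emp', 'gender', 'NVARCHAR', 1,
'{"source":{"name":"gander","type":"String","physicalType":"String"},"sink":{"name":"gender","type":"String","physicalType":"nvarchar(max)"}}');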
Step 3: Get the source metadata
Create a pipeline and name it “pl_dynamic_source_to_target_mapping”.
Create variables:
- var_sourcename, string
- var_field_name, string
- var_field_type, string
- var_mapping_plan, string
Add a “Get Metadata” activity and set it up. In its Field list, select:
- Item name
- Item type
- Structure

The “Get Metadata” activity returns:
{
"itemName": "name.csv",
"itemType": "File",
"structure": [
{
"name": "name",
"type": "String"
},
{
"name": "age",
"type": "String"
},
{
"name": "gander",
"type": "String"
},
{
"name": "state",
"type": "String"
}
],
"effectiveIntegrationRuntime": "AutoResolveIntegrationRuntime (East US 2)",
"executionDuration": 1,
"durationInQueue": {
"integrationRuntimeQueue": 0
},
"billingReference": {
"activityType": "PipelineActivity",
"billableDuration": [
{
"meterType": "AzureIR",
"duration": 0.016666666666666666,
"unit": "Hours"
}
]
}
}
Step 4: Reset the activity status of all source fields in the metadata table to False
Save the source data name
Since we will process the item’s metadata field by field later, it is convenient to save the source data name in a variable.
Add a “Set variable” activity to save the source data name in the variable “var_sourcename”.
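A sketch of the value expression, assuming the Get Metadata activity keeps its default name “Get Metadata1”:
@activity('Get Metadata1').output.itemName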

Reset all source fields to False
Add a “Lookup” activity to reset the activity status of all source fields in the metadata table to False.
lookup query:
UPDATE metadata SET
src_col_activity = 0
WHERE source_filename = '@{variables('var_sourcename')}';
SELECT 1;

This is one of the important steps: it lets us focus on the incoming source fields only. When we build the complete ETL Source-to-Target mapping plan, only the fields marked active will be used. The trailing SELECT 1 is there because a Lookup activity must return a result set.

Step 5: Process the source data fields with ForEach
Add a “ForEach” activity to the pipeline and iterate over the “structure” array to process the source data fields one by one, as sketched below.
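The ForEach Items setting points at the “structure” array returned in Step 3; a sketch, again assuming the default activity name:
@activity('Get Metadata1').output.structure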

Save the source field name and data type
Inside the ForEach activity, add two “Set variable” activities to save the current field’s name and data type:
ForEach’s @item().name —> var_field_name
ForEach’s @item().type —> var_field_type

Look up source fields in the metadata table
Still inside the ForEach activity, add a “Lookup” activity with a dataset pointing to the metadata table.

Lookup query:
IF NOT EXISTS (
SELECT src_col FROM metadata
WHERE
source_filename = '@{variables('var_sourcename')}'
AND src_col = '@{variables('var_field_name')}'
)
BEGIN
-- New field: alter the target table schema if the column does not exist yet
IF NOT EXISTS ( SELECT 1 FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'emp' AND COLUMN_NAME = '@{variables('var_field_name')}' )
BEGIN
ALTER TABLE emp ADD [@{variables('var_field_name')}] NVARCHAR(MAX);
SELECT 'target altered'; -- return
END
-- Insert the field's metadata and its STM plan
INSERT INTO metadata
(source_filename
, src_col
, src_dataType
, src_col_activity
, destination_schema
, destination_table
, dst_col
, dst_dataType
, dst_col_activity
, mapping)
VALUES
(
'@{variables('var_sourcename')}'
, '@{variables('var_field_name')}'
, '@{variables('var_field_type')}'
, 1
, 'dbo'
, 'emp'
, '@{variables('var_field_name')}'
, 'NVARCHAR'
, 1
, '{
"source": {
"name": "@{variables('var_field_name')}",
"type": "@{variables('var_field_type')}",
"physicalType": "@{variables('var_field_type')}"
},
"sink": {
"name": "@{variables('var_field_name')}",
"type": "@{variables('var_field_type')}",
"physicalType": "nvarchar(max)"
}
}'
);
SELECT 'insert field metadata'; -- return
END
ELSE
BEGIN
UPDATE metadata SET src_col_activity = 1
WHERE source_filename = '@{variables('var_sourcename')}'
AND src_col = '@{variables('var_field_name')}';
SELECT 'this field activated'; -- return
END;
- Check whether the current source field exists in the “metadata” table. If the field’s name is found, it is an existing field: update its activity status to True. If it is not present, it is a new field: insert it into the metadata table and establish a mapping plan specifying its intended destination.
- Check the target table [emp] to verify the column exists. If the column is not present, alter the schema of the target table [emp] to add the new column.
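As an optional sanity check (a sketch, not part of the pipeline), you can confirm the new column landed on the target table:
SELECT COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'emp';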

The target table schema is altered.
Metadata for the new field, “state”, is inserted into the metadata table.

New field mapping plan:
'{
"source": {
"name": "@{variables('var_field_name')}",
"type": "@{variables('var_field_type')}",
"physicalType": "@{variables('var_field_type')}"
},
"sink": {
"name": "@{variables('var_field_name')}",
"type": "@{variables('var_field_type')}",
"physicalType": "nvarchar(max)"
}
}'
Step 6: Generate the complete ETL mapping plan
Add a “Lookup” activity that uses the metadata table dataset. It queries the mapping plans of all active fields from the metadata table and assembles them into the complete STM mapping plan.
Query:
select
concat(
'{"type": "TabularTranslator",
"mappings": ['
, string_agg(mapping, ',')
, '],'
, '"typeConversion": true, "typeConversionSettings": {"allowDataTruncation": false, "treatBooleanAsNumber": false}'
, '}'
) as stm
from metadata
where
[source_filename] = '@{variables('var_sourcename')}'
and [src_col_activity] = 1

Also add a “Set variable” activity to save the STM plan to the variable “var_mapping_plan”:
@activity('lkp generate entire ETL mapping plan').output.firstRow.stm
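For the sample source (name, age, gander, state), “var_mapping_plan” ends up holding a JSON string shaped roughly like this (abridged to two of the four mappings):
{
"type": "TabularTranslator",
"mappings": [
{
"source": { "name": "name", "type": "String", "physicalType": "String" },
"sink": { "name": "emp_name", "type": "String", "physicalType": "nvarchar(max)" }
},
{
"source": { "name": "state", "type": "String", "physicalType": "String" },
"sink": { "name": "state", "type": "String", "physicalType": "nvarchar(max)" }
}
],
"typeConversion": true,
"typeConversionSettings": { "allowDataTruncation": false, "treatBooleanAsNumber": false }
}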

Step 7: Copy source data to target
Having established the dynamic mapping plan, we are now prepared to ingest data from the source and deliver it to the target. All preceding steps were dedicated to the development of the ETL mapping plan.
Copy activity: apply the STM mapping plan
Add a “Copy” activity, using the Source and Sink datasets we built previously.
Switch to the “Mapping” tab, click “Add dynamic content”, and write the expression:
@json(variables('var_mapping_plan'))

Done !!!

Afterword
This article focuses on explaining the underlying logic of dynamic source-to-target mapping through a step-by-step demonstration. To clearly illustrate the workflow and logic flow, four “Set Variable” activities and four pipeline variables are included. However, in a production environment, these are not required.
Having demonstrated dynamic source-to-target mapping with all necessary logic flow steps, this solution provides a foundation that can be extended to other scenarios, such as one-to-many, many-to-one, and many-to-many mappings. Implementations for these scenarios will be provided later.
Please do not hesitate to contact me if you have any questions at William . chen @ mainri.ca
(remove all space from the email account 😊)