infrastructure – Page 2

Configuring Azure Entra ID Authentication in Azure SQL Database

Azure SQL Database can be integrated with Azure Entra ID to provide identity and access management. With this integration, users can sign in to Azure SQL Database using their Azure Entra ID credentials, enabling a centralized and secure way to manage database access.

Register the SQL Server in Azure Entra ID

Enable Azure Entra ID Admin

Azure Portal > find out the SQL Server that you want to register with Azure Entra ID >

Settings > Microsoft Entra ID (Active Directory Admin)

Assign Users/Groups

You can assign Azure Entra ID users or groups to specific roles within the SQL Database, such as db_owner, db_datareader, or db_datawriter.

Then, Click Save to apply the changes.

Configure Azure Entra ID Authentication in Azure SQL Database

Connect to SQL Database using Azure Entra ID

You can connect to your Azure SQL Database using Azure Entra ID by selecting the “Azure Active Directory – Universal with MFA support” authentication method in tools like SQL Server Management Studio (SSMS).

Assign Roles to Azure Entra ID Users

Use a SQL query to assign roles to Azure Entra ID users or groups. For example:

CREATE USER [your_username@yourdomain.com] FROM EXTERNAL PROVIDER;
ALTER ROLE db_datareader ADD MEMBER [your_username@yourdomain.com];

This command creates an Azure Entra ID user in your SQL Database and adds them to the db_datareader role.

Set Up Role-Based Access Control (RBAC)

You can manage permissions through Azure Entra ID roles and assign these roles to your SQL Database resources.

Assign Roles via Azure Portal

Azure portal > your SQL Database > Access control (IAM) > Add role assignment.

Choose the appropriate role, such as “SQL DB Contributor“.

and assign it to the desired Azure Entra ID user or group

Considerations

No Password Management: Since authentication is managed via Azure Entra ID, there’s no need to manage passwords directly within the database.
Integration with Conditional Access: This allows you to enforce compliance requirements, such as requiring MFA or ensuring connections only come from specific locations.

Please do not hesitate to contact me if you have any questions at William . chen @ mainri.ca

(remove all space from the email account 😊)

Using Key Vault services in Azure Ecosystem

Azure Key Vault safeguards encryption keys and secrets like certificates, connection strings, and passwords.

Key vaults define security boundaries for stored secrets. It allows you to securely store service or application credentials like passwords and access keys as secrets. All secrets in your key vault are encrypted with a software key. When you use Key Vault, you no longer need to store security information in your applications. Not having to store security information in applications eliminates the need to make this information part of the code.

What is a secret in Key Vault?

In Key Vault, a secret is a name-value pair of strings. Secret names must be 1-127 characters long, contain only alphanumeric characters and dashes, and must be unique within a vault. A secret value can be any UTF-8 string up to 25 KB in size.

Vault authentication and permissions

Developers usually only need Get and List permissions to a development-environment vault. Some engineers need full permissions to change and add secrets, when necessary.

For apps, often only Get permissions are required. Some apps might require List depending on the way the app is implemented. The app in this module’s exercise requires the List permission because of the technique it uses to read secrets from the vault.

In this article, we will focus on 2 sections, set up secrets in Key Vault and application retrieves secrets that ware saved in Key vault.

Create a Key Vault and store secrets

Creating a vault requires no initial configuration. You can start adding secrets immediately. After you have a vault, you can add and manage secrets from any Azure administrative interface, including the Azure portal, the Azure CLI, and Azure PowerShell. When you set up your application to use the vault, you need to assign the correct permissions to it

Create a Key Vault service

To create Azure Key Vault service, you can follow the steps.

From Azure Portal, search “key Vault”

click “key Vault”

Fill in all properties

Click review + create. That’s all. Quite simple, right?

Create secrets and save in Key Vault

There are two ways to create secret and save in Key vault.

Access control, Identity and Access management (IAM)
Access Policies

Using Access Control (IAM) create a secret

From Key Vault> Access Control (IAM) > Add role Assignment

In most cases, if you create and save secrets in key-vault for your users to use, you only need add the “Key vault secrets user” role assignment.

click “next” select a member or group

Pay attention to here, if your organization has multiple instances of the same services, for example, different teams are independently using different ADF instants, make sure you correctly, accurately add the right service instant to access policies.

Once it’s down, check the access.

Create a Secret

From “Key Vault” > “Object” > “Secrets” > “+ Generate/Import”

Fill in all properties, :Create”

Secrets key and value created That’s all.

If you want to update the secret, simply click the key, follow the UI guide, you will not miss it.

Click the “version” that you want to update. Update the content > apply it.

That’s all.

Using Access Policies create a secret

There is another way “Access Policies” to create a secret.

Select the permissions you want under Key permissions, Secret permissions, and Certificate permissions.

If you create a key secret for users to use in their application or other azure services, usually you give “get” and “list” in the “Secret permissions” enough. Otherwise, check Microsoft official documentation.

Under the Principal selection pane, enter the name of the user, app or service principal in the search field and select the appropriate result.

Using Azure SQL Database as an example

Caution: when you add principal, make sure you select right service instant. Especially you act as a infrastructure administer, your organization has multiple teams that they are independently using different service instants, e.g. different Synapse Workspace. select correct instant. I have been asked to help trouble shotting this scenario a few time. Azure admin says, he has added policies to key-vault, but the use cannot access there yet. that is a funny mistake made, he has added ADF to kay-vault policies, unfortunately, the ADF is NOT team A used, team B is using it. 🙂

Back on the Access policies page, verify that your access policy is listed.

Create secret key and value

We have discussed it above. Need not verbose.

Done!

Using secrets that were saved in Key Vault

Using secrets usually have 2 major scenarios, directly use, or use REST API call to retrieve the saved secret value.

Let’s use Data Factory as an example to discuss.

Scenario 1, directly use it

For example, when you create linked service to connect Azure Sql Database

You have to make sure that Key Vault’s access policies has this ADF access policies, get and list

one more example, System workspaces use key-vault.

Once again, make sure your Synapse Workspace has access policies, “Key Vault Secrets User“, “get” and “List”

Scenario 2, REST API call Key Vault to use secret

Many engineers want to call the key Vault to retrieve the secret value for a certain purpose, e.g. Synapse pipeline to get SharePoint Online list or files that resident in SharePoint Library, you need an application secret value to build the query string. Normally, the application’s secret value is saved in Key Vault. In this case, you have to make a http call to Key value.

Get a specified secret from a given key vault.
The GET operation is applicable to any secret stored in Azure Key Vault. This operation requires the secrets/get permission.

GET {vaultBaseUrl}/secrets/{secret-name}/{secret-version}?api-version=7.4

for this example,

{vaultBaseUrl} https://mainri-key-vault.vault.azure.net

{secret-name} mySecretName

{secret-version} can skipped, if you skipped it, it will use “Latest Version”

Look at the pipeline

This web activity calls Azure Key Vault to retrieve secret value.

URL: https://mainri-key-vault.vault.azure.net/secrets/mySecretName?api-version=7.4

Method: Get

Authentication: Get

Resource: https://vault.azure.net

Caution, ?api-version=7.4 is very important, you have to add it to the URL.

all done.

Please do not hesitate to contact me if you have any questions at William . chen @ mainri.ca

(remove all space from the email account 😊)

Appendix:

MS document Get Secret – Get Secret

Day 10: Workflows in Azure Purview

Workflow in Purview

In Azure Purview, a workflow refers to a set of predefined, automated tasks that streamline and manage certain processes, such as approving data access requests or managing metadata curation.

Workflows in Azure Purview help automate data governance operations and improve collaboration across teams. They can include tasks like approving the registration of new data sources, assigning data stewards, or managing access to certain datasets.

Key Features of Workflows in Azure Purview:

Data Curation and Approval Processes
Workflows can be used to automate the process of curating and approving metadata changes or lineage extraction requests.
Automated Notifications
Users can receive notifications when tasks like data registration or classification have been completed or need approval.
Customizable Steps
Workflows are customizable, allowing we to define multiple stages of approval or validation depending on the organization’s governance policies.
Access Management
We can define workflows to manage how access requests are handled for sensitive data.
Approval Requests
For example, if someone registers a new data source, the workflow might require approval from a Purview Collection Administrator before proceeding.

Next step: Day 11 –

Please do not hesitate to contact me if you have any questions at William . chen @ mainri.ca

(remove all space from the email account 😊)

Day 11: Purview Access Control and Management

This article describes collections and access management for your account in the Microsoft Purview governance portal.

Please do not hesitate to contact me if you have any questions at William . chen @ mainri.ca

(remove all space from the email account 😊)

Next Step: Day 11:

Day 9: Managed attributes in Data Map

With “Managed Attributes” we can add own attributes (groups of attributes) and provide data stewards with the functionality to improve the content of data catalog.

create a new attribute group
create a new attribute
learn more about the available field types
assign and manage attributes for your data assets

Create a new attribute group

Create attribute group if there is no attribute group

Purview studio > Data Map > Managed attributes > New attribute group

Fill in

Create a new attribute

File in those fields
For field group: There are those can be selected

For applicable asset types, many options out of box to be used

Now, new attributes created

In the managed attribute management experience, managed attributes can’t be deleted, only expired. Expired attributes can’t be applied to any assets and are, by default, hidden in the user experience. Once an attribute created, it cannot change. Only mark them as “expired” and create a new, undated one.

Add value for managed attribute

Once a managed attribute has been created, you’ll need to add a value for each of your assets. You can add values to your assets by:

Search for your data asset in the Microsoft Purview Data Catalog
On the overview for your asset, you should see the managed attributes section with all attributes that have values. (You can see attributes without values by using the Show attributes without a value toggle.)
Select the Edit button.

Under Managed attributes, add values for each of your attributes.

If any attributes are Required you will not be able to save until you’ve added a value for that attribute.

Now, managed attribute added

Summary

Managed attribute: A set of user-defined attributes that provide a business or organization level context to an asset. A managed attribute has a name and a value. For example, ‘Department’ is an attribute name and ‘Finance’ is its value. Attribute group: A grouping of managed attributes that allow for easier organization and consumption.

Please do not hesitate to contact me if you have any questions at William . chen @ mainri.ca

(remove all space from the email account 😊)

Next step: Day 10 – Collections access control and management

Day 8 – Data Lineage, Extract SQL, ADF and Synapse Pipeline Lineage

Microsoft Purview provides an overview of data lineage in the Data Catalog. It also details how data systems can integrate with the catalog to capture lineage of data.

Lineage is represented visually to show data moving from source to destination including how the data was transformed. Given the complexity of most enterprise data environments.

Microsoft Purview supports lineage for views and stored procedures from Azure SQL Database. While lineage for views is supported as part of scanning, you will need to turn on the Lineage extraction toggle to extract stored procedure lineage when you’re setting up a scan.

Lineage collection

Metadata collected in Microsoft Purview from enterprise data systems are stitched across to show an end to end data lineage. Data systems that collect lineage into Microsoft Purview are broadly categorized into following three types:

Data processing systems
Data storage systems
Data analytics and reporting systems

Each system supports a different level of lineage scope.

Data estate might include systems doing data extraction, transformation (ETL/ELT systems), analytics, and visualization systems. Each of the systems captures rich static and operational metadata that describes the state and quality of the data within the systems boundary. The goal of lineage in a data catalog is to extract the movement, transformation, and operational metadata from each data system at the lowest grain possible.

The following example is a typical use case of data moving across multiple systems, where the Data Catalog would connect to each of the systems for lineage.

Data Factory copies data from on-prem/raw zone to a landing zone in the cloud.
Data processing systems like Synapse, Databricks would process and transform data from landing zone to Curated zone using notebooks.
Further processing of data into analytical models for optimal query performance and aggregation.
Data visualization systems will consume the datasets and process through their meta model to create a BI Dashboard, ML experiments and so on.

Lineage for SQL DB views

Starting 6/30/24, SQL DB metadata scan will include lineage extraction for views. Only new scans will include the view lineage extraction. Lineage is extracted at all scan levels (L1/L2/L3). In case of an incremental scan, whatever metadata is scanned as part of incremental scan, the corresponding static lineage for tables/views will be extracted.

Prerequisites for setting up a scan with Stored Procedure lineage extraction

<Purview-Account> can access SQL Database and in db_owner group

To check whether the Account Exists in the Database


SELECT name, type_desc
FROM sys.database_principals
WHERE name = 'YourUserName';

Replace ‘YourUserName’ with the actual username you’re checking for.

If the user exists, it will return the name and type (e.g., SQL_USER or WINDOWS_USER).

If it does not exist, create one.

Sign in to Azure SQL Database with your Microsoft Entra account, create a <Purview-account> account and assign db_owner permissions to the Microsoft Purview managed identity.

You can review my previous article Configuring Azure Entra ID Authentication in Azure SQL Database If you are not sure how to enable Azure Entra ID login.


Create user <purview-account> FROM EXTERNAL PROVIDER
GO
EXEC sp_addrolemember 'db_owner', <purview-account> 
GO

replace <purview-account> with the actual purview account name.

Master Key

Check whether master exists or not.

To check if the Database Master Key (DMK) exists or not


SELECT * FROM sys.symmetric_keys
WHERE name = '##MS_DatabaseMasterKey##';Create master key
Go

if the query returns a result, it means the Database Master Key already exists.

If no rows are returned, it means the Database Master Key does not exist, and you may need to create one if required for encryption-related operations.

Create a master key


Create master key
Go

Allow Azure services and resources to access this server

Ensure that Allow Azure services and resources to access this server is enabled under networking/firewall for your Azure SQL resource.

Previously, we have discussed create a scan for Azure SQL Database at Registering Azure SQL Database and Scan in Purview, that scan progress is disabled “Lineage extraction” in that article.

To allow purview extract lineage, we need set to on

Extract Azure Data Factory/Synapse pipeline lineage

When we connect an Azure Data Factory to Microsoft Purview, whenever a supported Azure Data Factory activity is run, metadata about the activity’s source data, output data, and the activity will be automatically ingested into the Microsoft Purview Data Map.

Microsoft Purview captures runtime lineage from the following Azure Data Factory activities:

Copy Data
Data Flow
Execute SSIS Package

If a data source has already been scanned and exists in the data map, the ingestion process will add the lineage information from Azure Data Factory to that existing source. If the source or output doesn’t exist in the data map and is supported by Azure Data Factory lineage Microsoft Purview will automatically add their metadata from Azure Data Factory into the data map under the root collection.

This can be an excellent way to monitor your data estate as users move and transform information using Azure Data Factory.

Connect to Microsoft Purview account in Data Factory

Set up authentication

Data factory’s managed identity is used to authenticate lineage push operations from data factory to Microsoft Purview. Grant the data factory’s managed identity Data Curator role on Microsoft Purview root collection.

Purview > Management > Lineage connections > Data Factory > new

Validation: Purview > Data map > Collection > Root collection > Role assignments >

Check, the ADF is under “data Curators” section. That’s OK

ADF connect to purview

In the ADF studio: Manage -> Microsoft Purview, and select Connect to a Microsoft Purview account

We will see this

Once pipeline successfully runs, activity will be caught, extracted lineage look this.

that’s all for extracting ADF pipeline lineage.

Next step: Day 9 – Managed attributes in Data Map

Please do not hesitate to contact me if you have any questions at William . chen @ mainri.ca

(remove all space from the email account 😊)

Day 7: Permission and Roles, Business Glossary and Collections Access Control in Purview

Role and roles and responsibilities in Purview

When working with a data catalog, different users and roles are involved. The permission concept in an Azure Purview data map is quite simple.

Permissions can be applied at collection level
Permission inheritance is possible (but can be restricted)
As of today, four roles (admins, data source admins, curators and readers) are available

The Microsoft Purview governance portal uses Collections in the Microsoft Purview Data Map to organize and manage access across its sources, assets, and other artifacts.

Collections

A collection is a tool that the Microsoft Purview Data Map uses to group assets, sources, and other artifacts into a hierarchy to manage access control. All accesses to the Microsoft Purview governance portal’s resources are managed from collections in the Microsoft Purview Data Map.

Role

The Microsoft Purview governance portal uses a set of predefined roles to control who can access what within the account.

These roles are currently:

Collection administrator
Data curators
Data readers
Data source administrator
Insights reader
Policy author
Workflow administrator

Detail definition please review Microsoft Purview official documentation.

Here is a summary of roles and responsibilities

Add or move Role assignments

Select the Role assignments tab to see all the roles in a collection. Only a collection admin can manage role assignments.

Type in the textbox to search for users you want to add to the role member. Select X to remove members you don’t want to add.

Restrict inheritance

Collection permissions are inherited automatically from the parent collection. For example, any permissions on the root collection (the collection at the top of the list that has the same name as your account), will be inherited by all collections below it. You can restrict inheritance from a parent collection at any time, using the restrict inherited permissions option.

After restriction, inherited members are removed from the roles expect for collection admin.

Select the Restrict inherited permissions toggle button again to revert.

Business Glossary

A glossary provides vocabulary for business users. It consists of business terms that can be related to each other and allow them to be categorized so that they can be understood in different contexts.

A business glossary is a collection of terms. Each term represents an object in an organization and it’s highly likely that there are multiple terms representing the same object.

There are some out of box templates that have predefined.

Add new glossary

Purview studio > Data Catalog > Glossary > new Glossary

Fill in all properties.

Next step: Day 8 – Data Lineage – Extract SQL, ADF and Synapse Pipeline Lineage

Please do not hesitate to contact me if you have any questions at William . chen @ mainri.ca

(remove all space from the email account 😊)

Day 6: Registering Azure Synapse Analytics workspaces and scan in Microsoft Purview

Previously, we’ve talked about how Purview connect to ADLS and SQL Database, scan in Purview. Today, we focused on Azure Synapse Analytics with Purview.

A comprehensive data analytics solution can include many folders and files in a data lake, and multiple databases that each contain many tables, each with multiple fields. For a data analyst, finding and understanding the data assets associated with a Synapse Analytics workspace can present a significant challenge before any analysis or reporting can even begin.

As we know the Azure Synapse Analytics is a platform for cloud-scale analytics workloads that process data in multiple sources; including:

Relational databases in serverless and dedicated SQL pools
Files in Azure Data Lake Storage Gen2

Microsoft Purview can help in this scenario by cataloging the data assets in a data map, and enabling data stewards to add metadata, categorization, subject matter contact details, and other information that helps data analysts identify and understand data.

Before you scan Synapse workspace, you need Azure Synapse Analytics connects Purview account.

Azure Synapse Analytics connects to Purview account.

Synapse Studio > Manage > External connection > Microsoft Purview

after you click “apply” you will see:

Select “Purview account” tag

Successfully connected with Purview.

To validation, we check what we have in ADLS and SQL Database.

We have in ADLS and Azure SQL Database. There are one table called “dep” in the SQL Database, 3 files in the ADLS.

There is one table in SQL Database:

and there are 3 file related the key word “dep” in ADLS,

using Azure Storage Explore:

Let’s search “dep” the key from Synapse Studio.

Synapse Studio > from the dropdown > select “Purview” > type “dep”

We find out the objects related to the key words – “dep”

A table in SQL Database, 3 files in ADLS.

Great, we successfully connected to Purview.

choose either of them to view in detail

There are so many powerful and interesting functions regarding the “Searching”, “discovering”, we will talk about them late.

Now, let’s switch to Purview studio.

Register Synapse Analytics Workspace

Assuming you have created Collects, we directly jump into register Azure Synapse Analytics Workspace (ASA).

Purview Studio > Data Map > Data Source

After filling in above values, click “register”, you will this

After registering the sources where your data assets are stored, you can scan each source to catalog the assets it contains. You can scan each source interactively, and you can schedule period scans to keep the data map up to date.

You may or may not see this error or alerts:

Read:

“Failed to load serverless databases from Synapse workspace, please give the managed identity of the Microsoft Purview account permissions or apply the correct credential to enumerate serverless databases.”

If you see it, you need create a login account for purview account to connect Serverless SQL:

Create Serverless SQL database login account for Purview

— create a login for purview login to Serverless SQL database


create login [mainri-purview] from external provider;

Synapse Studio > Develop > SQL Script >
select: “connect to Built-in” and use database “master”

Grant purview login account Sysadmin privilege

Add managed identity to the storage account

Then, add managed identity to the storage account.

From Azure portal > storage account > Access Control (IAM)

Select Role assignments tag

Add role assignments

Give the “Storage Account Contributor” role

Then, select “Member” tag:

Select “Managed Identity”, fill in all properties, Find out the purview account

Now, the role assignments added.

If you have dedicated SQL pool, we need repeat these.

Create Serverless SQL database login account for Purview
Grant purview login account Sysadmin privilege

Let’s test the connection

From Purview studio > scan

we got failed alert.

“Failed to validate the resource expansion. Either the expandable resource has no resources or the Purview account’s MSI has not been assigned to the ‘Reader’ role on it.”

Go back to Synapse portal

Azure Portal > Synapse workspace > Access control (IAM) > Add role assignments

add “read” role

Add “managed Identity” member – Purview

Check Purview access,

we can see Mainri-purview assignments – mainri-asa-workspace has “read” role (my Synapse workspace named “mainri-asa-workspace”)

Go to Purview Studio test connection again.

Great! We successful connect to Synapse workspace.

We have gotten access to SQL; we’ve got access to storage account. we have add “read” role assignment to Purview

Alright, we are ready to go – scan.

Scan Synapse workspace

Select a scan rule set

If you like, you are able to add even more new scan rule set at this step.

For this demonstration, we select default scan rule set.

Set a scan trigger

We ca either scan once or schedule and recurring scan on schedule.

Monitoring the scan progress ….

Once the process done, we will see this:

Alright, we have done the Purview for scanning Azure Synapse Workspace. Now, we have those source in our Azure purview.

Next step: Day 7 – Day 7: Permission and Roles, Business Glossary and Collections Access Control in Purview

Please do not hesitate to contact me if you have any questions at William . chen @ mainri.ca

(remove all space from the email account 😊)

Day 5: Registering Azure SQL Database and Scan in Purview

In the previous article, we discussed Managed Identity, registering ADLS, and scanning it in Azure Purview. In this article, I will focus on scanning an Azure SQL Database, including how to register and scan it in Azure Purview. The process differs from that of ADLS. You will be required to provide Azure SQL Database credentials.

We will learn the best practice of storing the Azure SQL Database credentials in an Azure Key Vault and use that Key Vault in Purview.

Azure Key Vault provides a way to securely store credentials, secrets, and other keys, but your code must authenticate to Key Vault to retrieve them.

We must follow these steps to register and scan Azure SQL Database:

Grant your Azure Account access to Key Vault by adding a new access policy. We will have to grant all the secret permissions.
Grant Purview Managed identity access to Key Vault by adding a new access policy. Here we will have to grant Get and List permissions so purview can get(retrieve) and list down all the secrets.
Generate a Secret for SQL Admin in Azure Key Vault. This secret will be used to log in to Azure SQL DB.
Add SQL Credentials (created above) in Purview so we can use the same credential.
Register Azure SQL DB in Microsoft Purview.
Scan Azure SQL Database as a data source with Azure Key Vault Credentials.
Verify that Purview is able to see tables in the Azure SQL database.

Discover and govern Azure SQL Database in Microsoft Purview

This article outlines the process to register an Azure SQL database source in Microsoft Purview. It includes instructions to authenticate and interact with the SQL database.

When you’re scanning Azure SQL Database, Microsoft Purview supports extracting technical metadata from these sources:

Server
Database
Schemas
Tables, including columns
Views, including columns (with lineage extraction enabled, as part of scanning)
Stored procedures (with lineage extraction enabled)
Stored procedure runs (with lineage extraction enabled)

When you’re setting up a scan, you can further scope it after providing the database name by selecting tables and views as needed.

Update firewall settings.

If your database server has a firewall enabled, you need to update the firewall to allow access. Simply show you here.

Azure Portal > SQL Database > Set Server Firewall> Security > Networking

Check “Allow Azure services and resources to access this server”

Key Vault Set up

Create access policies

From Azure Portal > [ your key vault ] > Access Policies

Secret permissions

select all.

Principal

find out “[your purview]”.

Type your purview account to search. For me, I use “mainri-purview” as an example.

Click “next” … create

Create a Secrets

[your key vault] > Objects > Secrets

Generate a secret

We completed the Azure Key vault configuration.

It’s time for configuration of Purview for scanning SQL Database.

Configure authentication for a scan

Azure Purview > Management > Credentials

Manage Key vault connection

Add a new.

If you do not have linked service to your key vault, select “new” create a new one.

Fill in all values, click “Create” , Key-vault connection created.

Then, you will back to “new credential” screen.

Create Credential

Fill in all values, click “create”

Now, the credential created

Setup Scan

Did you remember we have created collections previously?

Go back: Data Map > Data Source >

Create a scan

Full screen

Pay attention here: Lineage extraction (preview)

I choose “off”.

To successfully turn on Lineage extraction, you must do the following:

Provide the db_owner role in Azure SQL Database for Microsoft Purview MSI
Run “create Master Key” in Azure SQL Database (only if not already exists)

Scope your scan

Check to see what Entity you want to scan.

Select a scan rule set.

At here, you are able to add new scan rule set if you like. For me I selected azure purview system default SQL data warehouse scan rule set.

Set a scan trigger

You can either schedule the scan or once. This is very straight forward.

Once the process is complete, you can view detail

Alright, we’ve done the scan SQL Database in Azure Purview.

Next step: Day 6 – R egistering Azure Synapse Analytics workspaces and scan in Microsoft Purview

Please do not hesitate to contact me if you have any questions at William . chen @ mainri.ca

(remove all space from the email account 😊)

Day 4: Registering ADLS Gen2 and Scan in Purview

Before we scan data sources in Azure Purview, we have to register data
resources that to be scanned.

First, we will learn the concept of managed identity and how Azure purview uses it.

Second, we will learn the steps involved in registering ADLS Gen2.

Azure Purview Managed Identity

We will use Azure Purview Managed Identity that is an Azure resource feature found in Azure Active Directory (Azure AD). The managed identities for Azure resources feature is free and there’s no additional cost.

We can use the identity to authenticate to any service that supports Azure AD authentication, including Key Vault, without any credentials in your code. We will use Azure Purview Managed Identity.

Let’s register source data first.

We have to follow these steps to register and scan ADLS Gen 2 account:

Grant the Azure Purview Managed Identity access to ADLS Gen2 so purview can have access to it. Preview managed identity should have storage blob reader permission on ADLS Gen2
Scan ADLS Gen2 with the Purview Managed Identity registered in step 1

Register an Azure Data Lake Storage Gen2 account

Azure Portal > ADLS > Access Control > Add role assignment

> storage BLOB Data

>Select managed Identities

>next > next > review + assign

Now, you can “check access”, Now, you can “check access”. It’s added/

Scan data resource

We have talked about creating collection previously, if you need please review “How Microsoft Purview works – Data Source, Rule Sets, and Classification“

now, it’s time for scanning. from Azure Purview Studio

Scope your scan

You will see “scope your scan”. Now we can see all my data and directory structure on ADLS appear.

Select scan rule set

We have talked the rile sets in last article. You are able to add even more new scan rule set at this step if you like, or use default Azure System default scan rule set.

Set a scan trigger

Click the continue, you can setup trigger to scan, either once or recurring.

We complete the scan configuration. We have chance to review the configuration if ok save and run the scan progress or back to change it.

You can see this by clicking “view detail”

You will see this once the scan progress completed.

Next step: Registering Azure SQL Database and Scan in Purview

Please do not hesitate to contact me if you have any questions at William . chen @ mainri.ca

(remove all space from the email account 😊)