Azure Data Factory/Synapse Pipelines - Email

Azure Data Factory/Synapse pipelines have become one of the most popular cloud-based ETL tools. They offer a data integration and transformation layer along with version control and CI/CD capabilities. To monitor the pipelines proactively, we need an alert mechanism, such as email or Teams notifications. These alerts matter for monitoring pipeline runs: they identify and highlight potential data load failures or anomalies to the concerned teams so that appropriate action can be taken.

Some of the documented and explored options for sending pipeline success/failure notifications are Logic Apps, Log Analytics, and Data Factory Monitor (alerts).

  1. Workflows can be created in logic apps which will capture and send the error messages via email.
  2. Log analytics can be set up on the workspace where the pipelines exist to capture all pipeline run logs and alert rules can be set up to send failures.
  3. Data factory monitor can be used to send alerts when certain events occur on pipeline or activity level.

| Features | Logic Apps | Log Analytics | Data Factory Monitor |
|---|---|---|---|
| Ease of implementation | High. A ready-to-use template is available. | Low. A Log Analytics workspace needs to be set up, and the alerts need to be captured based on error/failure logs. | Moderate. The Alerts & Metrics tab can send emails to action groups when the events configured in the alert occur. |
| Reusability | Yes. Can be reused for multiple workspaces. | No. Duplicate efforts need to be made. | No. Duplicate efforts need to be made. |
| Maintenance and version control | Moderate. No version control. Since the logic is built in the Logic Apps interface, any change requires switching from the pipeline environment to Logic Apps. | Moderate. No version control. Since the logic is built in the Log Analytics workspace, any change requires switching from the pipeline environment to Log Analytics. | Moderate. No version control. Difficult to keep track of changes made to the alert logic, frequencies, thresholds, etc. |
| Cost | Moderate. Additional service costs since a new Azure resource needs to be commissioned. | High. Additional service costs since Azure Monitor needs to be commissioned. | Low. Cost is based on the type and number of alerts sent. |
| Availability | Available for Data Factory and Synapse pipelines. | Available for Data Factory and Synapse pipelines. | Available for Data Factory. Synapse pipelines only have alerts for pipeline run completion, without pipeline and activity granularity or success/failure metrics. |

Microsoft Graph API comes into the picture as a single endpoint that can interact with the Microsoft cloud, including Microsoft 365, Windows, and Enterprise Mobility + Security. The objective can be achieved using a service principal that has permission to send mail from a mailbox.

Microsoft Graph API

The Microsoft Graph API approach has several advantages. It lets the user set up custom and dynamic email content, and since the logic can be developed directly in the pipelines, it can be tightly coupled with the ETL workflow, which allows for easier scheduling. This also means that the notification logic is covered by the same version control as the pipelines.
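As an illustration of what "custom and dynamic email content" can mean, the sketch below builds a subject and a plain-text body from the details of a failed run. It is only a sketch: the function name `build_failure_email` and its parameters are hypothetical, and inside a pipeline the same values would come from pipeline expressions (for example the system variables `@pipeline().Pipeline` and `@pipeline().RunId`) rather than from Python.

```python
def build_failure_email(pipeline_name: str, run_id: str, error_message: str) -> tuple[str, str]:
    """Return (subject, content) for a failure notification.

    All names here are illustrative; they are not part of any Azure SDK.
    """
    subject = f"Pipeline failed: {pipeline_name}"
    content = "\n".join(
        [
            f"Pipeline: {pipeline_name}",
            f"Run ID: {run_id}",
            f"Error: {error_message}",
        ]
    )
    return subject, content


if __name__ == "__main__":
    subject, content = build_failure_email(
        "load_sales", "run-001", "Source file not found"
    )
    print(subject)
    print(content)
```

The returned subject and content map onto the `subject` and `body.content` fields of the message sent through Graph.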

When setting up the permissions for the service principal, the admin can also limit its access to specific mailboxes. This is optional but advised, since otherwise the service principal will have access to all mailboxes in the organization. The only cost associated with this approach is the user license cost for the service principal.

The prerequisite for this approach is an app registration in Azure with permission to send mail. It can be created with the following steps:

1. Go to portal.azure.com and search for App Registrations

2. Click on + New Registration

[Screenshot: App Registration]

3. Fill in the below details and click on Register.

[Screenshot: Register app]

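4. Once the app is registered, copy the Application (client) ID and the Directory (tenant) ID for later use. These are the {client_id} and {tenant_id} values used in the pipeline.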

5. Now create a secret for the application that can be used for authorization in the pipeline.

[Screenshot: Secret Key]

6. Copy the secret value (Note: Client secret values cannot be viewed, except for immediately after creation. Be sure to save the secret when created before leaving the page.)

[Screenshot: Copy the secret value]

7. Lastly, go to the registered application > API Permissions > Add a permission > Microsoft Graph

[Screenshots: Microsoft Graph; Request API]

8. Select Application permissions > Mail.Send > Add permissions.

9. Once added, click on Grant admin consent so that the application permissions will be granted.

To create the pipeline in a Data Factory or Synapse workspace, only two steps are required.

Create two consecutive Web activities, the second running after the first, and fill in the details given below in the Settings tab of each.

  1. Authorization request: Fetch AAD token for service principal
    a. URL: https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token
    b. Method: POST
    c. Body:
        client_id={client_id}
        &scope=https%3A%2F%2Fgraph.microsoft.com%2F.default
        &client_secret={client_secret}
        &grant_type=client_credentials
    d. Authentication: None
    e. Headers:
      i. Host: login.microsoftonline.com
      ii. Content-Type: application/x-www-form-urlencoded

  2. E-mail: Send mail using Graph API
    a. URL: https://graph.microsoft.com/v1.0/users/{sender_email_id}/sendMail
    b. Method: POST
    c. Body:
        {
          "message": {
            "subject": "Test for pipeline 2",
            "body": {
              "contentType": "Text",
              "content": "Sending text from Synapse pipeline."
            },
            "toRecipients": [
              {
                "emailAddress": {
                  "address": "shivani.s@mmjs.co"
                }
              }
            ]
          },
          "saveToSentItems": false
        }
    d. Authentication: None
    e. Headers:
      i. Content-Type: application/json
      ii. Authorization: Bearer {access token}, where {access token} is the access_token value returned by the authorization request
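
For testing the app registration outside a pipeline, the two requests above can be reproduced with a short script. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, the tenant, client, and mailbox values are placeholders, and `send_notification` needs real credentials and network access, so it is defined but not called here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

GRAPH_SCOPE = "https://graph.microsoft.com/.default"


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Request:
    """Build the client-credentials token request (first Web activity)."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "client_id": client_id,
        "scope": GRAPH_SCOPE,
        "client_secret": client_secret,
        "grant_type": "client_credentials",
    }
    # urlencode percent-encodes the scope exactly as in the body shown above.
    return Request(
        url,
        data=urlencode(form).encode("ascii"),
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def build_send_mail_request(
    sender: str, recipients: list[str], subject: str, content: str, access_token: str
) -> Request:
    """Build the sendMail request (second Web activity)."""
    url = f"https://graph.microsoft.com/v1.0/users/{sender}/sendMail"
    payload = {
        "message": {
            "subject": subject,
            "body": {"contentType": "Text", "content": content},
            "toRecipients": [{"emailAddress": {"address": a}} for a in recipients],
        },
        # Sent as a JSON boolean, not the string "false".
        "saveToSentItems": False,
    }
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
    )


def send_notification(tenant_id, client_id, client_secret, sender, recipients, subject, content):
    """Run both requests against the live endpoints and return the HTTP status."""
    with urlopen(build_token_request(tenant_id, client_id, client_secret)) as resp:
        token = json.load(resp)["access_token"]
    request = build_send_mail_request(sender, recipients, subject, content, token)
    with urlopen(request) as resp:
        return resp.status


if __name__ == "__main__":
    # Placeholder values; nothing is sent over the network here.
    req = build_token_request("my-tenant-id", "my-client-id", "my-secret")
    print(req.full_url)
    print(req.data.decode())
```

In the pipeline itself, the token is typically passed from the first Web activity to the second with an expression along the lines of `@concat('Bearer ', activity('Get Token').output.access_token)`, where `Get Token` is a hypothetical activity name. Keeping the client secret in Azure Key Vault rather than in the activity body is worth considering.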

Microsoft Graph API has a wide range of applications beyond sending mail, from handling Outlook calendars and mailboxes to managing Microsoft Teams collaboration.

It can also be used to raise service requests using emails where an alert mail will directly create an enterprise service desk ticket. The ticket will automatically await action from the concerned teams and can be easily tracked for a solution.

References

  1. Azure Data Factory pipelines
    https://learn.microsoft.com/en-us/azure/data-factory/concepts-pipelines-activities?tabs=data-factory
  2. Logic apps notifications
    https://learn.microsoft.com/en-us/azure/data-factory/how-to-send-email
  3. Log analytics notifications
    https://medium.com/@bhavanisgsits/azure-synapse-pipeline-monitoring-and-alerting-76c490386bfc
  4. Data factory monitor – alerts
    https://learn.microsoft.com/en-us/azure/data-factory/monitor-visually#alerts
  5. Web activity
    https://learn.microsoft.com/en-us/azure/data-factory/control-flow-web-activity
  6. Limit access of service principal to certain mailboxes
    https://learn.microsoft.com/en-us/graph/auth-limit-mailbox-access
