ETLs

Azure Synapse Connector

An out-of-the-box connector is available for Azure Synapse. The  Azure Synapse Connector is used to pull the metadata existing in the Azure Synapse database and helps the users to crawl the metadata and profile the sample data and build lineage to view the movement of the crawled data displaying the relationship between the objects and its profile statistics


Crawling:  Crawling is a process of collecting information about data from various data sources like on-premise and cloud databases, Hadoop, visualization software, and file systems. When an OvalEdge crawler connects to a data source, it collects and catalogs all the data elements (i.e., metadata) and stores it in the OvalEdge data repository. OvalEdge crawlers can be scheduled to scan the databases regularly


Data Sources: OvalEdge crawler integrates with various data sources to help the users to extract metadata and build a data catalog. In this document, you can see how to make a connection to your Azure Synapse and crawl the Views, Stored Procedures, and Functions.


Connect to the Data: Before you can crawl and build a data catalog, you must first connect to your data. OvalEdge requires users to configure a separate connection for each data source type. The users must enter the source credentials for each type of connectivity. Once a data connection is made, a simple click of the Crawl button starts the crawling process.                 

 Prerequisites

The following are prerequisites for connecting to the Azure Synapse


The APIs/ drivers used by the connector are given below:

Driver / API

Version

Details

Drivers

Microsoft JDBC driver 8.4.1

Is JDBC 4.2 compliant, uses JDK 8.0

User Permission

An admin/service account for crawling and building lineage. The minimum privileges required are:

Operation

Access Permission

Connection validate

Read

Crawl datasets

Read


Technical Specification

The connector capabilities are shown below:

Crawling

Feature

Supported Objects

Remarks

Crawling

Tables

-

Table Columns

Supported Data Types:

Bigint, Char, Date, Datetime, Datetime2, Decimal (P, S), Float, Int, Money, Nchar, Ntext, Numeric (P, S), Nvarchar, Real, Smalldatetime, Smallint, Smallmoney, Time, Tinyint, Varchar

Views

-

Stored Procedures

-

Functions

-

Profiling

Feature

Supported Objects

Remarks

Table Profiling

Row Count, Columns Count, View Sample data

-

View Profiling

Row Count, Columns Count, View sample data

View is treated as a table for profiling purposes.

Column Profiling

Min, Max, Null Count, Distinct, Top 50 values

-

Full Profiling

Supported

-

Sample Profiling

Supported

-


Lineage Building

Lineage entities

Details

Table Lineage

Supported

Column Lineage

Supported

Lineage Sources

Stored Procedures, functions, triggers, views, SQL queries (from Query Sheet)

Querying 

Operation 

Details

Select

Supported

Insert

Not supported, by default.

Update

Not supported, by default.

Delete

Not supported, by default.

Joins within database

Supported

Joins outside database

Not supported

Aggregations

Supported

Group By

Supported

Order By

Supported


By default, the service account provided for the connector will be used for any query operations. If the service account has to write privileges, then Insert / Update / Delete queries can be executed.

Connection Details

To connect to the Azure Synapse using the OvalEdge application, complete the following steps.

  1. log in to the OvalEdge application
  2. Navigate to Administration > Crawler module.
  3. click on the + icon, and the Manage Connection with Search Connector pop-up window is displayed. 
  4. Select the connection type as Azure Synapse. The Manage Connection with Azure Synapse specific details pop-up window is displayed.

Field Name

Mandatory/Optional

Description

Connection Type

 

Azure Synapse

Authentication

Mandatory

 

License Type

Mandatory

You can choose the License Type.

Connection Name

Mandatory

Enter the name of the connection, the connection name specified in the Connection Name textbox will be a reference to the Azure Synapse database connection in the OvalEdge application.

Server

Mandatory

Database instance URL (on-premises/cloud-based)

Example: 34.54.23.43

Port

Mandatory

 

Database

Mandatory

Name of the database to connect.

Driver

 

JDBC driver name for Azure Synapse server. It will be auto-populated.

Username

Mandatory

User account login credential 

Password

Mandatory

Password

Connection String

 

Azure Synapse connection string. Set the Connection string toggle button to automatic to get the details automatically from the credentials provided. Alternatively, you can manually enter the string.

Plugin Server

 

Provide the server name if you are running this as a plugin.

Plugin Port

 

Provide the port number on which the plugin is running.

Governance Roles

Mandatory

From the dropdown list, select Stewards, Custodian and Owner.

No. of Archived Objects

 

-

Select Bridge

 

Select option NO Bridge if no bridge is available for the connector


Connection Settings 

Crawler


Property

Description

 

Crawler Options

Tables, views & columns

By default, it is selected true

Relationships

Need to select for crawl relationships

Crawler rules

Default includes tables regex has  regex = (.*), which can crawl all the tables

 

Profiler


Property

Description

 

Profile Options

Tables and columns

By default, it is selected true

Profile Rules

Views and columns

By default, it is selected true

Default includes tables regex has  regex = (.*), which can profile all the tables

Profile Rules