An out-of-the-box connector is available for Azure Synapse. The Azure Synapse Connector is used to pull the metadata existing in the Azure Synapse database and helps the users to crawl the metadata and profile the sample data and build lineage to view the movement of the crawled data displaying the relationship between the objects and its profile statistics
Crawling: Crawling is a process of collecting information about data from various data sources like on-premise and cloud databases, Hadoop, visualization software, and file systems. When an OvalEdge crawler connects to a data source, it collects and catalogs all the data elements (i.e., metadata) and stores it in the OvalEdge data repository. OvalEdge crawlers can be scheduled to scan the databases regularly
Data Sources: OvalEdge crawler integrates with various data sources to help the users to extract metadata and build a data catalog. In this document, you can see how to make a connection to your Azure Synapse and crawl the Views, Stored Procedures, and Functions.
Connect to the Data: Before you can crawl and build a data catalog, you must first connect to your data. OvalEdge requires users to configure a separate connection for each data source type. The users must enter the source credentials for each type of connectivity. Once a data connection is made, a simple click of the Crawl button starts the crawling process.
Prerequisites
The following are prerequisites for connecting to the Azure Synapse
The APIs/ drivers used by the connector are given below:
Driver / API |
Version |
Details |
---|---|---|
Drivers |
Microsoft JDBC driver 8.4.1 |
Is JDBC 4.2 compliant, uses JDK 8.0 |
User Permission
An admin/service account for crawling and building lineage. The minimum privileges required are:
Operation |
Access Permission |
---|---|
Connection validate |
Read |
Crawl datasets |
Read |
Technical Specification
The connector capabilities are shown below:
Crawling
Feature |
Supported Objects |
Remarks |
---|---|---|
Crawling |
Tables |
- |
Table Columns |
Supported Data Types: Bigint, Char, Date, Datetime, Datetime2, Decimal (P, S), Float, Int, Money, Nchar, Ntext, Numeric (P, S), Nvarchar, Real, Smalldatetime, Smallint, Smallmoney, Time, Tinyint, Varchar |
|
Views |
- | |
Stored Procedures |
- |
|
Functions |
- |
Profiling
Feature |
Supported Objects |
Remarks |
---|---|---|
Table Profiling |
Row Count, Columns Count, View Sample data |
- |
View Profiling |
Row Count, Columns Count, View sample data |
View is treated as a table for profiling purposes. |
Column Profiling |
Min, Max, Null Count, Distinct, Top 50 values |
- |
Full Profiling |
Supported |
- |
Sample Profiling |
Supported |
- |
Lineage Building
Lineage entities |
Details |
---|---|
Table Lineage |
Supported |
Column Lineage |
Supported |
Lineage Sources |
Stored Procedures, functions, triggers, views, SQL queries (from Query Sheet) |
Querying
Operation |
Details |
---|---|
Select |
Supported |
Insert |
Not supported, by default. |
Update |
Not supported, by default. |
Delete |
Not supported, by default. |
Joins within database |
Supported |
Joins outside database |
Not supported |
Aggregations |
Supported |
Group By |
Supported |
Order By |
Supported |
By default, the service account provided for the connector will be used for any query operations. If the service account has to write privileges, then Insert / Update / Delete queries can be executed.
Connection Details
To connect to the Azure Synapse using the OvalEdge application, complete the following steps.
- log in to the OvalEdge application
- Navigate to Administration > Crawler module.
- click on the + icon, and the Manage Connection with Search Connector pop-up window is displayed.
- Select the connection type as Azure Synapse. The Manage Connection with Azure Synapse specific details pop-up window is displayed.
Field Name |
Mandatory/Optional |
Description |
---|---|---|
Connection Type |
Azure Synapse |
|
Authentication |
Mandatory |
|
License Type |
Mandatory |
You can choose the License Type. |
Connection Name |
Mandatory |
Enter the name of the connection, the connection name specified in the Connection Name textbox will be a reference to the Azure Synapse database connection in the OvalEdge application. |
Server |
Mandatory |
Database instance URL (on-premises/cloud-based) Example: 34.54.23.43 |
Port |
Mandatory |
|
Database |
Mandatory |
Name of the database to connect. |
Driver |
JDBC driver name for Azure Synapse server. It will be auto-populated. |
|
Username |
Mandatory |
User account login credential |
Password |
Mandatory |
Password |
Connection String |
Azure Synapse connection string. Set the Connection string toggle button to automatic to get the details automatically from the credentials provided. Alternatively, you can manually enter the string. |
|
Plugin Server |
Provide the server name if you are running this as a plugin. |
|
Plugin Port |
Provide the port number on which the plugin is running. |
|
Governance Roles |
Mandatory |
From the dropdown list, select Stewards, Custodian and Owner. |
No. of Archived Objects |
- |
|
Select Bridge |
Select option NO Bridge if no bridge is available for the connector |
Connection Settings
Crawler
Property |
Description |
|
Crawler Options |
Tables, views & columns |
By default, it is selected true |
Relationships |
Need to select for crawl relationships |
|
Crawler rules |
Default includes tables regex has regex = (.*), which can crawl all the tables |
Profiler
Property |
Description |
|
Profile Options |
Tables and columns |
By default, it is selected true |
Profile Rules |
Views and columns |
By default, it is selected true |
Default includes tables regex has regex = (.*), which can profile all the tables |
Profile Rules |