SQL Server Connector

An out-of-the-box connector is available for SQL Server databases. The SQL server is a Relational Database Management System (RDBMS). This Connector is used to pull the metadata existing in the SQL server database and helps the users to crawl the metadata and profile the sample data and build lineage to view the movement of the crawled data displaying the relationship between the objects and its profile statistics.

15-2

The connectivity to SQL Server is via the JDBC driver, which is included in the platform. 

The connector currently supports the following versions of SQL Server:

Edition

Version

Comments

Express

15.x

Supported

Web

 

Supported

Standard

 

Supported

Enterprise

 

Verified

Pre-requisites

The following are prerequisites required for the connection of the SQL Server database.

Sl.No

Item List

Description

Microsoft 

JDBC driver

7.4

Is JDBC 4.2 compliant, uses JDK 8.0

Download - JDBC Driver for SQL Server | Microsoft Docs

Note: Latest version is 9.2.

JTDS driver

1.3.1

jTDS - SQL Server and Sybase JDBC driver - Browse Files at SourceForge.net

This driver is used for supporting SQL Server JDBC Authentication on a 

Linux environment.

To use the connector, the following need to be available:

  • An admin/service account with read privileges is required for Crawling and Profiling.
  • JDBC driver is provided by default. In case it needs to be changed, add SQL server client drivers into the OvalEdge Jar path to communicate to the SQL Server database.
    Check the Configuration section for further details on how to add the drivers to the jar path.
  • Install supporting ‘.dll’ for Microsoft JDBC driver. 
    • .dll file must be copied in the apache tomcat, /Catalina/. 
This file will be provided by OvalEdge during installation.

User Permissions


The following are minimum permissions and privileges required for the user to crawl the database objects, and profile sample data. 

OE ADMIN - Superuser.

The following are the minimum permissions required for OvalEdge to validate the SQL Server connection. 

Permission: READ (Meta Read and Data Read)Need to specify minimum (Meta Read)Roles: Crawler Admin 

Super User: OE ADMIN

Note: Navigate to Configuration  > Users & Roles for roles and permission.

The minimum privileges required for a cluster user are

Operation 

Access Permission

Connection Validation

SELECT

Crawl Schemas

SELECT

Crawl Tables

SELECT

Profile Schemas, Tables

SELECT

Lineage Building 

SELECT


Technical Capabilities

Features

Supported Objects

Remarks

Crawling

Tables, Table Column, Views, Function, Stored 

procedures, Function, 

Trigger, Roles, Users,

Permissions, and Usage 

Statistics

Supported Data types:

Bigint, Char, Date, Datetime, Datetime2, Decimal (P, S),

Float, Int, Money, Nchar, Ntext, Numeric (P, S), Nvarchar,

Real, Smalldatetime, Smallint, Smallmoney, Time, Tinyint,

Varchar

Profiling

Table Profiling

The output will be Row count, Columns count, 

and View sample data

Views Profiling

The output will be Row count, Columns count, 

and View sample data

Note: View is treated as a table for profiling purposes.

Column Profiling

The output will be Min, Max, Null count, distinct, top

50 values

Full Profiling

Supported

Sample Profiling

Supported

Lineage Building

Table lineage

Supported

Column lineage

Supported

Lineage Sources

Stored procedures, functions, triggers, views, SQL queries 

(from Query Sheet), query logs

Query Execution

Select

Supported

 

Insert

Not supported, by default.

 

Update

Not supported, by default.

 

Delete

Not supported, by default.

 

Joins within database

Supported

 

Joins outside database

Not supported

 

Aggregations

Supported

 

Group By

Supported

 

Order By

Supported

 

Note: By default, the service account provided for the connector will be used for any query operations. If the service account user is having write privileges, then Insert / Update / Delete queries can be executed.

Connection Details

To connect to the SQL server using the OvalEdge application, complete the following steps.

  1. Login to the OvalEdge application
  2. Navigate to Administration > Connector module.
  3. Click on the + icon, and the Add Connection with Search Connector pop-up window is displayed.
  4. Select the connection type as SQL Server. The Manage Connection with SQL Server specific details pop-up window is displayed.
  5. The fields in the connector form are explained below:

    Field Name

    Mandatory/Optional

    Description

    Connection Type

    Mandatory

    By default, the selected connection type is displayed as SQL Server.

    Authentication Type

    Mandatory

    Select the Authentication type from the drop-down menu. (SQL server authentication)

    License Type

    Mandatory

    Select the license type as Standard/Auto Lineage.

    Connection Name

    Mandatory

    Enter the name of the connection, the connection name specified in the Connection Name textbox will be a reference to the SQL server connection in the OvalEdge application.

    Example: SQL Server Connection1

    Server

    Mandatory

    Enter the fully qualified server name or IP address related to the SQL server. 

    Example: 12.34.56.78

    Port

    Mandatory

    By default, the port number related to the SQL server is displayed. If needed, the port number can be modified.

    Database

    Mandatory

    Enter the name of the database which users want to crawl.

    Driver

    Mandatory

    By default, the driver details associated with the SQL server are displayed, and it is not editable.

    Username

    Mandatory

    By default, the user credentials through which the OvalEdge application is logged in are displayed in the Username textbox related to the SQL server authentication. 

    Password

    Mandatory

    By default, the user credentials through which the OvalEdge application is logged in are displayed in the Password textbox.

    Connection String

    Optional

    Connection string has auto and manual modes. By default, it is set to Auto mode, where connection string details are fetched automatically and displayed in the Connection string textbox. Alternatively, the user can manually enter the string by setting the toggle button to manual mode. 

    Plugin Server

    Optional

    Enter the Server Name if the connection is running as the plugin

    Example: ovaledgejdbcsql.csklygkw3.dxuseast-1rds.amazonnows.com

    Plugin Port

    Optional

    Enter the port number on which the plugin is running.

    Example: 3306

    Default Governance Roles

    Mandatory

    From the dropdown list, select Stewards, Custodian and Owner.

    No of archive objects

    Mandatory

    By default, the number of archive objects is set in the disable mode. To enable the archive option click on the Archive toggle button and enter the number of objects to be archived.

    Select Bridge

    Optional 

    Select option NO Bridge if no bridge is available for connector 

  6. Once after entering the connection details in the required fields, Click on the Save button or Save & Configure to establish the connection and configure the connection settings. Here when you click on the Save & Configure button, the Connection Settings pop-up window is displayed. Where you can configure the connection settings for the selected Connector.

    Note: The Save & Configure button is displayed only for the Connectors for which the settings configuration is required.

  7. Click on the Validate button to validate the connection details. 

    Note: It is up to the user's wish, you can save the connection details first, or you can validate the connection first and then save it. 

     

    Note: An SQL server connection can be authenticated using SQL Server Authentication, Windows Authentication, or Azure Active Directory - Password. The field for the manage connection pop-up window will differ as per the selected authentication type.

    Error Validation Details 

    The following are the possible error messages encountered during the validation. 

    Sl.No

    Error Messages

    Description

    1

    Failed to obtain JDBC Connection; nested exception is java.sql.SQLException: Cannot create PoolableConnectionFactory (Login failed for user 'ovaledge'. ClientConnectionId:2d7324ab-51e3-432e-995c-74a000910446)

    Incorrect Password

    2

    Failed to obtain JDBC Connection; nested exception is java.sql.SQLException: Cannot create PoolableConnectionFactory (Login failed for user 'ovaled'. ClientConnectionId:f06bb2e4-da88-4d18-9dc6-222d67cedee5)

    Invalid User Name

    3

    Failed to obtain JDBC Connection; nested exception is java.sql.SQLException: Cannot create PoolableConnectionFactory (The TCP/IP connection to the host oval-sqlserver.csklygkwz3dx.us-east-1.rd, port 1433 has failed. Error: "oval-sqlserver.csklygkwz3dx.us-east-1.rd. Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall.".)

    Incorrect Server

    Connection Settings

    Once the connection is established successfully, there are various settings provided to fetch and analyze the information from the data source. The connection setting includes Crawler, Profiler, Remote Access, Access Instruction, Lineage, and Others. Based on the connector selected, the option on the setting page will differ. 

    To configure the connector settings, select the SQL server Connection Name from the Crawler Information page and click on the Nine dots option and select the Settings options. The Connection Settings pop-up window is displayed. 

    Crawler

    Crawler setting has various settings tabs for crawling and profiling options. The crawler options are available for all the connections. Based on the selected connection, the options will differ. It is required to enable one of the crawler options.

    Crawler options

    Tables, Views, and Columns: This crawling will discover the tables, views, and Columns and bring them into OvalEdge. This is the Default option for crawling

    Procedures, Functions & Views Source Code: The crawling will discover the procedures, functions, and views source code within the schemas selected and bring them into OvalEdge. 

    Relationship: This crawl job will look at the schemas selected and determine relationships between the tables and bring this information into OvalEdge.

    Crawler Rule

    Include Regex: Enter the specific schema, table, views, and column names that start with, end with, or have middle characters that are included for crawling.

    Exclude Regex: Enter the specific schema, table, views, and column names that start with, end with, or have middle characters that are excluded for crawling.

    Query Log Settings

    The purpose of the query log setting is to fetch the query from the data source. You can configure different options under these settings. It includes :

    (i) Query:

     It will display the query fetched from the backend and make it available to the front end. It fetches (if any) list of queries that have undergone changes based on the criteria given in the look-back period. 

    (ii) Look Back Period: Look back period feature allows the query box to save the queries for those specified days and allows you to fetch queries that were processed in the backend for that number of days ago. For example, the lookback period can be set to 1 day, 2 days, or more. The maximum Look back period is 7 days.

    (iii) Query Type: You can include query types such as Select, Insert, Update, and Delete

    (iv) Exclude Users: In the Query Logs, the administration has the option to exclude the users who are not required because If any user name is given in the excluded user, the query log of the executed query by that particular user is excluded from fetching. 

    (v)Schema: Select the schema to apply query log settings. If you leave this blank, it will fetch the query log based on all the schema available for that database connection.

    Profiler

    The process of gathering statistics and informative summaries about the connected data source(s). Statistics can help assess the data source's quality before using it in an analysis. Profiling is always optional; crawling can be run without also profiling. However, if profiling is desired, there are a few options on how to profile the metadata/data. It includes:

    Profiler Configurations Settings

    Details

    Tables and Columns

    Select the checkbox to profile the tables and columns that are existing in the SQL server database into the OvalEdge.

    Note: By default, the checkbox for Tables and Columns is selected.

    Views and Columns

    Select the checkbox to profile the views and columns that are existing in the SQL server database in the OvalEdge.

    Note: By default, the checkbox for Views and Columns is selected.

    Profile Rules: Include Regex

    Enter the specific table, view, and column names that start with, end with, or have middle characters that are included for profiling.

    Profile Rules: Exclude Regex

    Enter the specific table, view, and column names that start with, end with, or have middle characters that are excluded for profiling.

    To configure the Profile Setting,

    Click on the Edit icon that allows the Admin user to configure the profiler setting for the selected data source. There are many attributes you can specify in the profile settings. 

    The Profiler setting attributes are as follows,

    Parameters

    Description

    Order

    Order number is a sequence in which the profiling is done. 

    Day

    The day of a week in which profiling is set to run.

    Start/End Time

    The start and end time which profiling is set to perform.

    Number of Threads

    A thread is a process where a query is executed on a database to do single or multiple tasks. The number of threads determines the number of parallel queries that are executed on the data source. 

    Profile Type

    There are four main types of data profiling. 

    There are four main types of data profiling. 

    • Sample - For Sample profiling, 

    The profiling is performed based on a given Sample Profile Size. The data on columns (like Min, Max, Distinct, Null Count, etc.) will be different compared to full profiles as we calculate them only on sample size. To execute a sample profile, select the profile type as “Sample” and enter a sample profile size(count of records to be profiled).

    • Auto - For Auto profiling,
      • If the Row Constraint checkbox is selected (Set as True) and if the configured Rowcount Limit (100) is less than the total Table Row Count (1000) then the sample profiling is performed by considering the count mentioned in the Sample Profile Size.
      • If the Row Constraint checkbox is selected (Set as True) and if the configured Rowcount Limit (1000) is greater than the total Table Row Count (100) then the query is executed without considering the count mentioned in the Rowcount Limit. 

    Note:  A profile type set to “Auto” will always depend on the Rowcount limit when the row count constraint must be set as “True”. 

    • Query - For Query profiling,
      • If the entered table row count is less than the Rowcount Limit, then the profiling is executed on the entire table. 
      • If the input table row count exceeds the Rowcount Limit, then the profiling skips execution for those tables to avoid performance issues.
    • Disabled profile type prevents profiling on the selected data source

    Row Count Constraint

    The Row Count Constraint option is applicable only when the Profile Type is selected as Auto.

    • If the Row Constraint checkbox is selected (Set as True) and if the configured Rowcount Limit (100) is less than the total Table Row Count (1000) then the sample profiling is performed by considering the count mentioned in the Sample Profile Size.

    If the Row Constraint checkbox is selected (Set as True) and if the configured Rowcount Limit (1000) is greater than the total Table Row Count (100) then the query is executed without considering the count mentioned in the Rowcount Limit. 

    Row Count Limit

    Enter the maximum number of rows that should be considered for profiling.

    Sample Data Count 

    Enter the total number of rows to see within the table data page in the Catalog.

    Sample Profile Size

    Select this option for considering the total number of rows in profiling.

    Query Timeout

    Enter the number of seconds for the query to run on the remote database before it times out.

    Note: Profiling will be successful if “All” and  “Current Day” are matched. 

    Remote Access

    The Remote Access (Data Access Authorization) is to ensure that the right user is accessing the query sheet and queries in the data catalog. Here, the system validates the user credentials and allows that user to access the query sheet and queries in the data catalog.

    • OvalEdge follows OvalEdge data permissions: When this option is selected, the OvalEdge application uses a service account used for creating the connection in the 
      • Query Sheet page
      • Data Catalog > Table > Data page and 
      • Data Catalog > Code page to fetch the required data.
    • OvalEdge follows Remote system permissions: When this option is selected, then the users are prompted to enter the remote data source credentials in the 
      • Query Sheet page, 
      • Data Catalog > Table > Data page and 
      • Data Catalog > Query page 

    The submitted credentials are used to establish the connection to fetch the required

    data. Here, these credentials will be cached and cleared after 7 days of inactivity.

    Query Policies

    The Query Policies in the Crawler setting is used to provide the right access to the Query sheet functions (Join, Union, SUM or aggregate functions). You can specify the desired roles and permission to deny the usage of the query sheet function. A role that has been denied policy permission will not have access to those functions in the Query Sheet.

    1. Click on the Query Policies tab, the data grid with existing Policies details with Role, Query Type, Access Type and Delete columns are displayed. 
    2. To include new query policies, click on the +Add Query Policy button.  
    3. Select the desired Role, Query Type and Access Type from the respective dropdowns and click on the Save Changes button
      Example: If the user selects the Role as “OE_HRADMIN”, Query Type as “JOIN” and the Access Type as “DENY”, then the users associated with the OE_HRADMIN privileges are restricted to use the JOIN function in the Query Sheet.

    Access Instruction

    It allows the admin user to write the instructions and guide the business user to crawl the data source. Instructions are nothing, but some related information about the data source connection can be provided here. It could have information resources like Link, Images or Videos that help business users of that particular data source connection. 

    For example: When an admin user saves the access instructions and crawls the schema, the saved instruction will appear at the data catalog after clicking on the schema. It's just providing info or guidelines on a specific schema.

    • You can provide the instruction in Crawler > Setting page 
    • Click the Access Instruction tab
    • Enter the instructions 
    Click the Save Changes button. Once you add the access instruction for a specific connection in the crawler settings, it will appear in the connection hierarchy like a database.

    Other


    The Send Metadata Changes Notifications option is provided in the Others tab of the settings page. The purpose is to send notifications on any metadata changes to data owners or stewards. 


    Send Metadata Notification to:

    1. Select whether the notifications for the Data Owner and Data Steward under the specific Roles need to be sent or not.
    2. Select the desired role from the Roles dropdown menu and click on the Save Changes button. The notifications will be successfully sent to select Data Owner and Data Steward.

    FAQs

    1.  What should I know when upgrading my driver?
    The Microsoft JDBC Driver 7.4 supports the JDBC 4.2, and 4.3 (partially) specifications and includes three JAR class libraries in the installation package as follows:

    JAR JDBC Specification JDK Version
    mssql-jdbc-7.4.1.jre8.jar JDBC 4.2 JDK 8.0


    2.  How much does the driver cost?

    The Microsoft JDBC Driver for SQL Server is available at no additional charge.

    3.  Can I use the driver to access Microsoft SQL Server from a Linux computer?
    Yes! You can use the driver to access SQL Server from Linux, Unix, and other non-Windows platforms. For having SQL Server JDBC Authentication on Linux environment, we use jtds 1.3.1 driver (NTLM).

    4.  Which authentication types are supported by the Microsoft JDBC Driver for SQL Server?
    The table below lists available authentication options.

    Platform Authentication
    Non-Windows SQL Server, NTLM, Azure Active Directory Authentication, Pure Java Kerberos 
    Windows SQL Server, Kerberos with NTLM backup, NTLM, Azure Active Directory Authentication, Pure Java Kerberos