MySQL

MySQL Connector

The MySQL is a relational database management system (RDBMS) that consists of a multithreaded SQL server and supports different back ends, several different client programs and libraries, administrative tools, and a wide range of application programming interfaces (APIs). 

In OvalEdge application, MySQL Connector allows users to crawl and profile the datasets like Tables, Columns, Views, Procedures, and Synonyms and helps in building the lineage.

mySQLdigram-1

Connector Capabilities

The connectivity to the MySQL Connector is established JDBC driver 

Driver / API 

Version

Description

JDBC 

8.0.25

https://mvnrepository.com/artifact/mysql/mysql-connector-java/5.1.29

Technical Specification

A technical specification for the MySQL connector contains information about the Crawler, Profiler, and Query Execution and also information about supported objects, supported data types, and user permissions.

Crawling

Features

Supported Objects

Supported Data Types 

Minimum User Permission

Crawling

Tables, Views and Columns. 

Int, TinyInt, Smallint, Mediumint, Bigint, Float, Double, Decimal, Date, Datetime, Timestamp,Time, Varchar, Char, 

Select

Functions, Procedures, and Synonyms

 

View Definition

Supported Objects Supported Data Types
Tables, Table Columns, Views, Stored Procedures, Functions, Roles, Users, Permissions, Usage Statistics INT, TINYINT, SMALLINT, MEDIUMINT, BIGINT, FLOAT, DOUBLE, DECIMAL, DATE, DATETIME, TIMESTAMP, TIME, YEAR, CHAR, VARCHAR

Profiling

Features

Supported Objects

Minimum User  Permission

Profiling

Tables, Views, and Columns

 

Query Execution 

Features

Supported Objects

Minimum User Permission

Query Execution

No Object should be impacted. 

Select

Lineage Building

Lineage Entities Details

Table lineage

Supported

Column lineage

Supported

Lineage Sources

Stored procedures, functions, views, query logs

Querying

Operation Details

Select

Supported

Insert

Supported provided its service account have the permission

Update

Supported provided its service account have the permission

Delete

Supported provided its service account have the permission

Joins within database

Supported

Joins outside database

Not supported

Aggregations

Supported

Group By

Supported

Order By

Supported

By default the service account provided for the connector will be used for any query operations. If the service account has write privileges, then Insert / Update / Delete queries can be executed.

Pre-requisites

To use the connector, the following need to be available:

  • Connection details as specified in the following section should be available.
  • An admin / service account with read privileges.
  • JDBC driver is provided by default. In case it needs to be changed, add MySQL client drivers into the OvalEdge Jar path to communicate with the MySQL database. 

Check the Configuration section for further details on how to add the drivers to the jar path.

Connection Details

Pre-requisites

To use the MySQL connector, the details specified in the following section should be available.    

  •   By default the service account provided for the connector will be used for any query operations.   If the service account has write privileges, then Insert/Update/Delete queries can be executed.           
  • The minimum privileges required are

    Operation 

    Minimum Access Permission For Service Account

    Connection Validation

    SELECT, and USAGE

    Crawling

    Select, Usage , Reference and Execution 

    Profiling 

    No permission are required to profile

    Query execution

    Execution

Add Connection

To connect to the MySQL database using the OvalEdge application, complete the following steps.

  1. Login to the OvalEdge application. click on the Administration > Connectors module. The Connectors Information page is displayed.
  2. To add a new connection, click on the +AddNewConnector icon. A manage connection pop-up is displayed to select a connector. Search for, or select a MySQL connector. The manage connection form is displayed where information for setting up a connection is provided. 

    mysql-connector-4

   3. The fields in the connector form are explained below.  

Field Name

Description

Connection Type

By default the selected connection type is displayed as MySQL. If required, the connection type can be changed and depending on the connector selected, fields are displayed accordingly.

License Type

Select the license type as Auto Lineage.


License Type: 

In a license type, the permissions are specified based on the requirements of the customer. User has option to select the license type as Standard or Auto Lineage.  The connector license is categorized into:

 

(i) Standard: The standard connectors may not have Auto Lineage functionality. It will not build the lineage for the selected database.


(ii) Auto Lineage: Additionally, auto lineage connectors have Auto Lineage functionality. It will build the lineage for the selected database.


See, License Types for more information. 

Connection Name

The connection name specified in the Connection Name textbox will be a reference to the MySQL database connection in the OvalEdge application.


Enter the name of the connection

Example: MySQL_accountsdb

Environment

From the dropdown, select the desired environment, such as Production, QA, and Prod. 


Environment: The purpose of the environment field is to help users to understand that the new connector is established in an environment available at the development, production, and QA.

Server

Enter the fully qualified server name or IP address related to the MySQL database 


Example:oval-MySql.xxxyyyytosss2dxy.us-east-1.rds.amazonaws.com

Port

By default, port number 3306, related to the MySQL database, is displayed. If needed, a new port number can be provided.

Database

Enter the name of the database.


It is a user-specified database name to crawl the metadata of selected connectors.

Driver

By default, the driver details associated with the MySQL database are displayed and it is non-editable.

Username

By default the username of the OvalEdge application is displayed. If required, users can manually enter the users credential of the OvalEdge application. 

Password

By default, the OvalEdge application password is displayed.  If required, users can manually enter the users credential of the OvalEdge application. 

Connection String

A connection string is used to configure the MySQL connection object. MySQL connections are specified by key-value pairs in a connection string.


Connection string has auto and manual mode. 

By default, it is set to Auto mode where connection string details are fetched automatically and displayed in the connection string textbox. Alternatively, the user can manually enter the string by setting the toggle button to manual mode. 


Format: jdbc:MySql://{server}:3306/{sid}

Example: jdbc:MySql://oval.cvbnxxcvvwcraues.us-east-2.MySql.amazonaws.com:3306;database=customers

Plug-in Server

Enter the server name if you are running this as a plugin.

Plug-in Port

The port number on which the plugin is running.

Default Governance Roles

The admin will select a specific user or a team from the governance roles (Steward, Custodian, Owner) that get assigned to the data asset. 


The dropdown list displays all the configured users (single user or a team) as per the configurations made in the Security > Governance Roles page.  

No of archive objects

By default, the number of archive objects is set to disable mode. Click on the Archive toggle button and enter the number of objects you wish to archive.


No. of archive objects:  It is the count of the number of last modifications made in the metadata data of a Remote/source. 


For example, if you update the count as 4 in the ‘No. of archive object’ field, and then the connection is crawled. It will provide the last 4 changes that occurred in the remote/source of the connector. You can observe these changes in the ‘version’ column of the ‘Metadata Changes’ module. 

Select Bridge

The Bridge ID will be shown in the Bridge dropdown menu when bridges are configured and added, or it will be displayed as "NO BRIDGE". 


Bridge: To collect data from both on-premises and cloud sources, data administrators need to configure Cloud Bridge, by specifying a data source connection, and unique parameters.

   4. Click on the Save button to establish the connection. (OR)

   5. Directly click on the Save & Configure button to configure and set up the connection. It will display the connection settings pop-up window to configure the connection.

Note: The Save & Configure button is displayed only for the Connectors for which the settings configuration is required.

    6. Click on the Validate button once the entered connection details are validated. 

    Note: It is up to the user’s discretion whether to save the connection details first or validate the connection first and then save it. 

     Error Validation Details

    Sl.No

    Error Message(s)

    Description

    1

    Failed to establish a connection, Please check the credentials

    Invalid User Name, and in case of the wrong password.

    Connector Settings

    Once the connection is established successfully, there are various settings provided to fetch and analyze the information from the data source. The connection setting includes Crawler, Profiler, Remote Access, Access Instruction, Lineage, and Others. Based on the connector selected the option in the setting page will differ. 

    To configure the connector settings, select the MySQL Connection Name from the Crawler Information page and click on the Nine dots option and select the Settings options. The Connection Settings pop-up window is displayed.

    Crawlers

    Crawler setting will help to get the information of the datasource based on the crawler setting provided. Crawler setting has various settings options including Crawler options, Crawler Rules, Query Log Setting.   

    Profiler

    Crawler Settings 

    Descriptions

    Crawler options

    • Tables, Views, and Columns: This crawling will discover the tables, views, and Columns from MySQL database and bring them into OvalEdge. This is the default option for crawling.
    • Procedures, Functions & Views Source Code: The crawling will discover the procedures, functions, views source code within the schemas selected, and bring them into OvalEdge. 
    • Relationship: This crawl job will look at the schemas selected and determine relationships between the tables and bring this information into OvalEdge.

    Crawler Rules

    Include Regex: Enter the specific schema, table, views, and column names that start with, end with, or have middle characters that are included for crawling.

    Exclude Regex: Enter the specific schema, table, views, and column names that start with, end with, or have middle characters that are excluded for crawling.

    Query Log Settings

    A purpose of query log setting is to fetch the query from the datasource. You can configure different options under these settings. It includes :

    (i) Query:

     It will display the query fetched from the backend and make it available to the front end. It fetches (if any) list of queries that have undergone changes based on the criteria given in the look-back period. 

    (ii) Look Back Period: Look back period feature allows the query box to save the queries for those specified days and allows you to fetch queries that were processed in the backend for that number of days ago. For example, the lookback period can be set to 1 day, 2 days, or more. The maximum Look back period is 7 days.

    (iii) Query Type: You can include query types such as Select, Insert, Update, and Delete

    (iv) Exclude Users: In the Query Logs, the administration has the option to exclude the users who are not required because If any user name is given in the excluded user, the query log of the executed query by that particular user is excluded from fetching. 

    (v)Schema: Select the schema to apply query log settings. If you leave this blank, it will fetch the query log based on all the schema available for that database connection.

    Profiler

    The process of gathering statistics and informative summaries about the connected data source(s). Statistics can help assess the data source's quality before using it in an analysis. Profiling is always optional; crawling can be run without also profiling. However, if profiling is desired, there are a few options on how to profile the metadata/data. It includes: 

    Profiler options:

    Profiler Options

    Details

    Tables and Columns

    Select the checkbox to profile the tables and columns that are existing in the MySQL database into the OvalEdge.


    Note: By default, the checkbox for Tables and Columns is selected.

    Views and Columns

    Select the checkbox to profile the views and columns that are existing in the MySQL database into the OvalEdge.


    Note: By default, the checkbox for Views and Columns is selected.

    Profiler Rule: The Profile rules provide the user with more refined results. The rules are written via basic regular expressions against the data source. 

    Profiler Rules

    Details

    Profile Rules: Include Regex

    Enter the specific table, view, and column names that start with, end with, or have middle characters that are included for profiling.

    Profile Rules: Exclude Regex

    Enter the specific table, view, and column names that start with, end with, or have middle characters that are excluded for profiling.

    Profiler Settings:

    The attributes are as follows, 

    Parameters

    Description

    Order

    Order number is a sequence in which the profiling is done. 

    Day

    The day of a week in which profiling is set to run.

    Start/End Time

    The start and end time which profiling is set to perform.

    Number of Threads

    A thread is a process where a query is executed on a database to do single or multiple tasks. The number of threads determines the number of parallel queries that are executed on the data source. 

    Profile Type

    There are four main types of data profiling. 


    • Sample - For Sample profiling, 

    The profiling is performed based on a given Sample Profile Size. The data on columns (like Min, Max, Distinct, Null Count, etc.) will be different compared to full profiles as we calculate them only on sample size. To execute a sample profile, select the profile type as “Sample” and enter a sample profile size(count of records to be profiled).

    • Auto - For Auto profiling,
      • If the Row Constraint checkbox is selected (Set as True) and if the configured Rowcount Limit (100) is less than the total Table Row Count (1000) then the sample profiling is performed by considering the count mentioned in the Sample Profile Size.
      • If the Row Constraint checkbox is selected (Set as True) and if the configured Rowcount Limit (1000) is greater than the total Table Row Count (100) then the query is executed without considering the count mentioned in the Rowcount Limit. 

    Note:  A profile type set to “Auto” will always depend on the Rowcount limit when the row count constraint must be set as “True”. 

    • Query - For Query profiling,
      • If the entered table row count is less than the Rowcount Limit, then the profiling is executed on the entire table. 
      • If the input table row count exceeds the Rowcount Limit, then the profiling skips execution for those tables to avoid performance issues.
    • Disabled profile type prevents profiling on the selected data source

    Row Count Constraint

    The Row Count Constraint option is applicable only when the Profile Type is selected as Auto.

    • If the Row Constraint checkbox is selected (Set as True) and if the configured Rowcount Limit (100) is less than the total Table Row Count (1000) then the sample profiling is performed by considering the count mentioned in the Sample Profile Size.

    If the Row Constraint checkbox is selected (Set as True) and if the configured Rowcount Limit (1000) is greater than the total Table Row Count (100) then the query is executed without considering the count mentioned in the Rowcount Limit. 

    Row Count Limit

    Enter the maximum number of rows that should be considered for profiling.

    Sample Data Count 

    Enter the total number of rows to see within the table data page in the Catalog.

    Sample Profile Size

    Select this option for considering the total number of rows in profiling.

    Query Timeout

    Enter the number of seconds for the query to run on the remote database before it times out.

    Crawler/Profiler

    Once the setting for profiling has been configured, go back to the Crawler screen and click “Crawl/Profile” to begin the profiling. 

    Remote Access

    The Remote Access (Data Access Authorization) is to ensure that the right user is accessing the query sheet and queries in the data catalog. Here, the system validates the user credentials and allows that user to access the query sheet and queries in the data catalog.

    • OvalEdge follows OvalEdge data permissions: When this option is selected, the OvalEdge application uses a service account used for creating the connection in the 
      • Query Sheet page
      • Data Catalog > Table > Data page and 
      • Data Catalog > Code page to fetch the required data.
    • OvalEdge follows Remote system permissions: When this option is selected, then the users are prompted to enter the remote data source credentials in the 
      • Query Sheet page, 
      • Data Catalog > Table > Data page and 
      • Data Catalog > Query page 

    The submitted credentials are used to establish the connection to fetch the required data. Here, these credentials will be cached and cleared after 7 days of inactivity.

    Query Policies

    The Query Policies in the Crawler setting is used to provide the right access to the Query sheet functions (Join, Union, SUM or aggregate functions). You can specify the desired roles and permission to deny the usage of the query sheet function. A role that has been denied policy permission will not have access to those functions in the Query Sheet.

    1. Click on the Query Policies tab, the data grid with existing Policies details with Role, Query Type, Access Type and Delete columns are displayed. 
    2. To include new query policies, click on the +Add Query Policy button.  
    3. Select the desired Role, Query Type, and Access Type from the respective dropdowns and click on the Save Changes button

    Example: If the user selects the Role as “OE_HRADMIN”, Query Type as “JOIN” and the Access Type as “DENY”, then the users associated with the OE_HRADMIN privileges are restricted to use the JOIN function in the Query Sheet page.

    QueryPolicy

    Access Instruction

    It allows the  admin user to write the instructions and guide the business user to crawl the data source. Instructions are nothing, but some related information about the data source connection can be provided here. It could have the information resources like links, Images, or Videos that help business users of that particular data source connection. 

    For example: When an admin user saves the access instructions and crawls the schema, the saved instruction will appear in the data catalog after clicking on the schema. It's just providing info or guidelines on a specific schema.

    • You can provide the instruction in Connectors > Setting page 
    • Click on the Access Instruction tab
    • Enter the instructions 
    Click the Save Changes button. Once you add the access instruction for a specific connection in the crawler settings, it will appear in the connection hierarchy like a database.

    Other

    The Send Metadata Changes Notifications option is provided in the Others tab of the settings page. The purpose is to send notifications on any metadata changes to data owners or stewards. 

    Other-1

    Send Metadata Notification to:

    1. Select whether the notifications for the Data Owner and Data Steward under the specific Roles need to be sent or not.
    Select the desired role from the Roles dropdown menu and click on the Save Changes button. The notifications will be successfully sent to select Data Owner and Data Steward.

    Additional Information

     

    Parameters

    Description

    Security

    This is regarding the technical user's SSL Certificate I AM Roles set in the backend. 

    Proxy

    This connector supports the proxy configuration.