MySQL Connector
The MySQL is a relational database management system (RDBMS) that consists of a multithreaded SQL server and supports different back ends, several different client programs and libraries, administrative tools, and a wide range of application programming interfaces (APIs).
In OvalEdge application, MySQL Connector allows users to crawl and profile the datasets like Tables, Columns, Views, Procedures, and Synonyms and helps in building the lineage.
Connector Capabilities
The connectivity to the MySQL Connector is established JDBC driver
Driver / API |
Version |
Description |
---|---|---|
JDBC |
8.0.25 |
https://mvnrepository.com/artifact/mysql/mysql-connector-java/5.1.29 |
Technical Specification
A technical specification for the MySQL connector contains information about the Crawler, Profiler, and Query Execution and also information about supported objects, supported data types, and user permissions.
Crawling
Features |
Supported Objects |
Supported Data Types |
Minimum User Permission |
---|---|---|---|
Crawling |
Tables, Views and Columns. |
Int, TinyInt, Smallint, Mediumint, Bigint, Float, Double, Decimal, Date, Datetime, Timestamp,Time, Varchar, Char, |
Select |
Functions, Procedures, and Synonyms |
View Definition |
Supported Objects | Supported Data Types |
Tables, Table Columns, Views, Stored Procedures, Functions, Roles, Users, Permissions, Usage Statistics | INT, TINYINT, SMALLINT, MEDIUMINT, BIGINT, FLOAT, DOUBLE, DECIMAL, DATE, DATETIME, TIMESTAMP, TIME, YEAR, CHAR, VARCHAR |
Profiling
Features |
Supported Objects |
Minimum User Permission |
---|---|---|
Profiling |
Tables, Views, and Columns |
Query Execution
Features |
Supported Objects |
Minimum User Permission |
---|---|---|
Query Execution |
No Object should be impacted. |
Select |
Lineage Building
Lineage Entities | Details |
Table lineage |
Supported |
Column lineage |
Supported |
Lineage Sources |
Stored procedures, functions, views, query logs |
Querying
Operation | Details |
Select |
Supported |
Insert |
Supported provided its service account have the permission |
Update |
Supported provided its service account have the permission |
Delete |
Supported provided its service account have the permission |
Joins within database |
Supported |
Joins outside database |
Not supported |
Aggregations |
Supported |
Group By |
Supported |
Order By |
Supported |
By default the service account provided for the connector will be used for any query operations. If the service account has write privileges, then Insert / Update / Delete queries can be executed.
Pre-requisites
To use the connector, the following need to be available:
- Connection details as specified in the following section should be available.
- An admin / service account with read privileges.
- JDBC driver is provided by default. In case it needs to be changed, add MySQL client drivers into the OvalEdge Jar path to communicate with the MySQL database.
Check the Configuration section for further details on how to add the drivers to the jar path.
Connection Details
Pre-requisites
To use the MySQL connector, the details specified in the following section should be available.
- By default the service account provided for the connector will be used for any query operations. If the service account has write privileges, then Insert/Update/Delete queries can be executed.
- The minimum privileges required are
Operation
Minimum Access Permission For Service Account
Connection Validation
SELECT, and USAGE
Crawling
Select, Usage , Reference and Execution
Profiling
No permission are required to profile
Query execution
Execution
Add Connection
To connect to the MySQL database using the OvalEdge application, complete the following steps.
- Login to the OvalEdge application. click on the Administration > Connectors module. The Connectors Information page is displayed.
- To add a new connection, click on the +AddNewConnector icon. A manage connection pop-up is displayed to select a connector. Search for, or select a MySQL connector. The manage connection form is displayed where information for setting up a connection is provided.
3. The fields in the connector form are explained below.
Field Name |
Description |
---|---|
Connection Type |
By default the selected connection type is displayed as MySQL. If required, the connection type can be changed and depending on the connector selected, fields are displayed accordingly. |
License Type |
Select the license type as Auto Lineage. License Type: In a license type, the permissions are specified based on the requirements of the customer. User has option to select the license type as Standard or Auto Lineage. The connector license is categorized into:
(i) Standard: The standard connectors may not have Auto Lineage functionality. It will not build the lineage for the selected database. (ii) Auto Lineage: Additionally, auto lineage connectors have Auto Lineage functionality. It will build the lineage for the selected database. See, License Types for more information. |
Connection Name |
The connection name specified in the Connection Name textbox will be a reference to the MySQL database connection in the OvalEdge application. Enter the name of the connection Example: MySQL_accountsdb |
Environment |
From the dropdown, select the desired environment, such as Production, QA, and Prod. Environment: The purpose of the environment field is to help users to understand that the new connector is established in an environment available at the development, production, and QA. |
Server |
Enter the fully qualified server name or IP address related to the MySQL database Example:oval-MySql.xxxyyyytosss2dxy.us-east-1.rds.amazonaws.com |
Port |
By default, port number 3306, related to the MySQL database, is displayed. If needed, a new port number can be provided. |
Database |
Enter the name of the database. It is a user-specified database name to crawl the metadata of selected connectors. |
Driver |
By default, the driver details associated with the MySQL database are displayed and it is non-editable. |
Username |
By default the username of the OvalEdge application is displayed. If required, users can manually enter the users credential of the OvalEdge application. |
Password |
By default, the OvalEdge application password is displayed. If required, users can manually enter the users credential of the OvalEdge application. |
Connection String |
A connection string is used to configure the MySQL connection object. MySQL connections are specified by key-value pairs in a connection string. Connection string has auto and manual mode. By default, it is set to Auto mode where connection string details are fetched automatically and displayed in the connection string textbox. Alternatively, the user can manually enter the string by setting the toggle button to manual mode. Format: jdbc:MySql://{server}:3306/{sid} Example: jdbc:MySql://oval.cvbnxxcvvwcraues.us-east-2.MySql.amazonaws.com:3306;database=customers |
Plug-in Server |
Enter the server name if you are running this as a plugin. |
Plug-in Port |
The port number on which the plugin is running. |
Default Governance Roles |
The admin will select a specific user or a team from the governance roles (Steward, Custodian, Owner) that get assigned to the data asset. The dropdown list displays all the configured users (single user or a team) as per the configurations made in the Security > Governance Roles page. |
No of archive objects |
By default, the number of archive objects is set to disable mode. Click on the Archive toggle button and enter the number of objects you wish to archive. No. of archive objects: It is the count of the number of last modifications made in the metadata data of a Remote/source. For example, if you update the count as 4 in the ‘No. of archive object’ field, and then the connection is crawled. It will provide the last 4 changes that occurred in the remote/source of the connector. You can observe these changes in the ‘version’ column of the ‘Metadata Changes’ module. |
Select Bridge |
The Bridge ID will be shown in the Bridge dropdown menu when bridges are configured and added, or it will be displayed as "NO BRIDGE". Bridge: To collect data from both on-premises and cloud sources, data administrators need to configure Cloud Bridge, by specifying a data source connection, and unique parameters. |
4. Click on the Save button to establish the connection. (OR)
5. Directly click on the Save & Configure button to configure and set up the connection. It will display the connection settings pop-up window to configure the connection.
Note: The Save & Configure button is displayed only for the Connectors for which the settings configuration is required.
6. Click on the Validate button once the entered connection details are validated.
Note: It is up to the user’s discretion whether to save the connection details first or validate the connection first and then save it.
Error Validation Details
Sl.No |
Error Message(s) |
Description |
1 |
Failed to establish a connection, Please check the credentials |
Invalid User Name, and in case of the wrong password. |
Connector Settings
Once the connection is established successfully, there are various settings provided to fetch and analyze the information from the data source. The connection setting includes Crawler, Profiler, Remote Access, Access Instruction, Lineage, and Others. Based on the connector selected the option in the setting page will differ.
To configure the connector settings, select the MySQL Connection Name from the Crawler Information page and click on the Nine dots option and select the Settings options. The Connection Settings pop-up window is displayed.
Crawlers
Crawler setting will help to get the information of the datasource based on the crawler setting provided. Crawler setting has various settings options including Crawler options, Crawler Rules, Query Log Setting.
Crawler Settings |
Descriptions |
---|---|
Crawler options |
|
Crawler Rules |
Include Regex: Enter the specific schema, table, views, and column names that start with, end with, or have middle characters that are included for crawling. |
Exclude Regex: Enter the specific schema, table, views, and column names that start with, end with, or have middle characters that are excluded for crawling. |
|
Query Log Settings |
A purpose of query log setting is to fetch the query from the datasource. You can configure different options under these settings. It includes : (i) Query: It will display the query fetched from the backend and make it available to the front end. It fetches (if any) list of queries that have undergone changes based on the criteria given in the look-back period. (ii) Look Back Period: Look back period feature allows the query box to save the queries for those specified days and allows you to fetch queries that were processed in the backend for that number of days ago. For example, the lookback period can be set to 1 day, 2 days, or more. The maximum Look back period is 7 days. (iii) Query Type: You can include query types such as Select, Insert, Update, and Delete (iv) Exclude Users: In the Query Logs, the administration has the option to exclude the users who are not required because If any user name is given in the excluded user, the query log of the executed query by that particular user is excluded from fetching. (v)Schema: Select the schema to apply query log settings. If you leave this blank, it will fetch the query log based on all the schema available for that database connection. |
Profiler
The process of gathering statistics and informative summaries about the connected data source(s). Statistics can help assess the data source's quality before using it in an analysis. Profiling is always optional; crawling can be run without also profiling. However, if profiling is desired, there are a few options on how to profile the metadata/data. It includes:
Profiler options:
Profiler Options |
Details |
Tables and Columns |
Select the checkbox to profile the tables and columns that are existing in the MySQL database into the OvalEdge. Note: By default, the checkbox for Tables and Columns is selected. |
Views and Columns |
Select the checkbox to profile the views and columns that are existing in the MySQL database into the OvalEdge. Note: By default, the checkbox for Views and Columns is selected. |
Profiler Rule: The Profile rules provide the user with more refined results. The rules are written via basic regular expressions against the data source.
Profiler Rules |
Details |
Profile Rules: Include Regex |
Enter the specific table, view, and column names that start with, end with, or have middle characters that are included for profiling. |
Profile Rules: Exclude Regex |
Enter the specific table, view, and column names that start with, end with, or have middle characters that are excluded for profiling. |
Profiler Settings:
The attributes are as follows,
Parameters |
Description |
---|---|
Order |
Order number is a sequence in which the profiling is done. |
Day |
The day of a week in which profiling is set to run. |
Start/End Time |
The start and end time which profiling is set to perform. |
Number of Threads |
A thread is a process where a query is executed on a database to do single or multiple tasks. The number of threads determines the number of parallel queries that are executed on the data source. |
Profile Type |
There are four main types of data profiling.
The profiling is performed based on a given Sample Profile Size. The data on columns (like Min, Max, Distinct, Null Count, etc.) will be different compared to full profiles as we calculate them only on sample size. To execute a sample profile, select the profile type as “Sample” and enter a sample profile size(count of records to be profiled).
Note: A profile type set to “Auto” will always depend on the Rowcount limit when the row count constraint must be set as “True”.
|
Row Count Constraint |
The Row Count Constraint option is applicable only when the Profile Type is selected as Auto.
If the Row Constraint checkbox is selected (Set as True) and if the configured Rowcount Limit (1000) is greater than the total Table Row Count (100) then the query is executed without considering the count mentioned in the Rowcount Limit. |
Row Count Limit |
Enter the maximum number of rows that should be considered for profiling. |
Sample Data Count |
Enter the total number of rows to see within the table data page in the Catalog. |
Sample Profile Size |
Select this option for considering the total number of rows in profiling. |
Query Timeout |
Enter the number of seconds for the query to run on the remote database before it times out. |
Crawler/Profiler
Once the setting for profiling has been configured, go back to the Crawler screen and click “Crawl/Profile” to begin the profiling.
Remote Access
The Remote Access (Data Access Authorization) is to ensure that the right user is accessing the query sheet and queries in the data catalog. Here, the system validates the user credentials and allows that user to access the query sheet and queries in the data catalog.
- OvalEdge follows OvalEdge data permissions: When this option is selected, the OvalEdge application uses a service account used for creating the connection in the
- Query Sheet page
- Data Catalog > Table > Data page and
- Data Catalog > Code page to fetch the required data.
- OvalEdge follows Remote system permissions: When this option is selected, then the users are prompted to enter the remote data source credentials in the
- Query Sheet page,
- Data Catalog > Table > Data page and
- Data Catalog > Query page
The submitted credentials are used to establish the connection to fetch the required data. Here, these credentials will be cached and cleared after 7 days of inactivity.
Query Policies
The Query Policies in the Crawler setting is used to provide the right access to the Query sheet functions (Join, Union, SUM or aggregate functions). You can specify the desired roles and permission to deny the usage of the query sheet function. A role that has been denied policy permission will not have access to those functions in the Query Sheet.
- Click on the Query Policies tab, the data grid with existing Policies details with Role, Query Type, Access Type and Delete columns are displayed.
- To include new query policies, click on the +Add Query Policy button.
- Select the desired Role, Query Type, and Access Type from the respective dropdowns and click on the Save Changes button
Example: If the user selects the Role as “OE_HRADMIN”, Query Type as “JOIN” and the Access Type as “DENY”, then the users associated with the OE_HRADMIN privileges are restricted to use the JOIN function in the Query Sheet page.
Access Instruction
It allows the admin user to write the instructions and guide the business user to crawl the data source. Instructions are nothing, but some related information about the data source connection can be provided here. It could have the information resources like links, Images, or Videos that help business users of that particular data source connection.
For example: When an admin user saves the access instructions and crawls the schema, the saved instruction will appear in the data catalog after clicking on the schema. It's just providing info or guidelines on a specific schema.
- You can provide the instruction in Connectors > Setting page
- Click on the Access Instruction tab
- Enter the instructions
Other
The Send Metadata Changes Notifications option is provided in the Others tab of the settings page. The purpose is to send notifications on any metadata changes to data owners or stewards.
Send Metadata Notification to:
- Select whether the notifications for the Data Owner and Data Steward under the specific Roles need to be sent or not.
Additional Information
Parameters |
Description |
---|---|
Security |
This is regarding the technical user's SSL Certificate I AM Roles set in the backend. |
Proxy |
This connector supports the proxy configuration. |