NoSQL

Azure Cosmos DB

Azure Cosmos DB is a globally distributed, cloud-based NoSQL database that allows users to manage data across data centers worldwide. The Cosmos DB Instance Connector pulls the metadata that exists in the Cosmos DB database and helps users crawl that metadata and profile the sample data.


OvalEdge uses the Azure SDK to connect to the data source, which allows users to crawl and profile the data objects (Tables, Table Columns, etc.).
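Because Cosmos DB stores schemaless JSON items, it can help to picture how "Table Columns" might be derived from sample items. The following is a minimal sketch under stated assumptions, not the connector's actual implementation: the function name `infer_columns` and the sample items are hypothetical, and only top-level properties are considered.

```python
from collections import defaultdict


def infer_columns(documents):
    """Map each top-level property name to the sorted Python type names seen for it.

    Hypothetical helper for illustration only; this is not OvalEdge code.
    """
    seen = defaultdict(set)
    for doc in documents:
        for key, value in doc.items():
            seen[key].add(type(value).__name__)
    return {name: sorted(types) for name, types in seen.items()}


# Hypothetical sample items from a single container
sample_items = [
    {"id": "1", "name": "Ada", "age": 36},
    {"id": "2", "name": "Grace", "email": None},
]

columns = infer_columns(sample_items)
for name, types in columns.items():
    print(name, types)
```

A property that appears in only some items (such as `email` here) still shows up as a column, which is one reason profiling null counts can be informative for schemaless sources.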


Connector Capabilities

The following is the list of objects and data types the Azure Cosmos DB connector supports.

Functionality | Supported Data Objects
Crawler | Tables, Triggers, Functions, Procedures, Columns
Profiler | Table Profiling: row count, column count, and view sample data; Column Profiling: min, max, null count, distinct count, top 50 values
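To make the column profiling statistics concrete, here is a minimal sketch of how min, max, null count, distinct count, and top values could be computed for one column of sample data. The function name `profile_column` and the sample values are hypothetical; this illustrates the statistics listed above and is not the OvalEdge profiler itself.

```python
from collections import Counter


def profile_column(values, top_n=50):
    """Compute simple profiling statistics for one column of sample values.

    Assumes the non-null values are hashable and mutually comparable.
    """
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    return {
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "null_count": len(values) - len(non_null),
        "distinct": len(counts),
        # Most frequent values first, limited to top_n entries
        "top_values": counts.most_common(top_n),
    }


# Hypothetical sample values for an "age" column
ages = [36, None, 41, 36, 29, None]
stats = profile_column(ages)
print(stats)
```

Null values are excluded from the min, max, and distinct calculations and counted separately, which mirrors how the statistics are listed above.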

Prerequisites

The following are the prerequisites required to establish a connection between the connector and the OvalEdge application. 

Drivers  


Driver | Version | Description
Azure SDK of Cosmos DB | 4.x | SDK provided by Azure to communicate with Cosmos DB.

User Permissions

The minimum privileges required for a service account user to crawl and profile data are as follows:

Operation | Access Permission
Connection validation | Read
Crawl schemas | Read
Crawl tables | Read
Profile schemas and tables | Read

Configure Environment Variables

Configuring environment names enables you to select the appropriate environment from the drop-down list when adding a connector. This allows for consistent crawling of schemas across different environments, such as production (PROD), staging (STG), or temporary environments. It also facilitates schema comparisons and assists in application upgrades by providing a temporary environment that can later be deleted.

Before establishing a connection, it is important to configure the environment names for the specific connector. If your environments have been configured, skip this step. 

Steps to Configure the Environment

  1. Log into the OvalEdge application.
  2. Navigate to Administration > System Settings.
  3. Select the Connector tab.
  4. Find the key name “connector.environment”.
  5. Enter the desired environment values (PROD, STG) in the Value column.
  6. Click ✔ to Save.

Establish a Connection

To connect to the Azure Cosmos DB using the OvalEdge application, complete the following steps:

  1. Log in to the OvalEdge application.
  2. Navigate to Administration >  Connectors.
  3. Click on the + (New Connector) icon.
  4. In the Add Connector pop-up window that is displayed, search for and select the Azure Cosmos DB connector.
  5. The Add Connector pop-up window with the connector-type-specific details is displayed. Enter the relevant information to configure the Azure Cosmos DB connection.
    Note: The asterisk (*) denotes mandatory fields for establishing a connection.

    Field Name

    Description

    Connector Type

    It allows you to select the connector from the drop-down list. By default, ‘Azure Cosmos DB’ is displayed as the selected connector type.

    Credential Manager*

    Select the option from the drop-down list to indicate where you want to save your credentials:

    OE Credential Manager: The Azure Cosmos DB connection is configured with the credentials of the service account in real time when OvalEdge establishes a connection to the Azure Cosmos DB database. Users must manually add the credentials if the OE Credential Manager option is selected.

    HashiCorp: The credentials are stored in the HashiCorp database server and fetched from HashiCorp to OvalEdge.  

    AWS Secrets Manager: The credentials are stored in the AWS Secrets Manager database server; OvalEdge fetches the credentials from the AWS Secrets Manager. 

    Azure Key Vault: Azure Key Vault allows for secure storage and strict access mechanisms of sensitive information such as tokens, passwords, certificates, API keys, and other confidential data.

    For more information on Azure Key Vault, click here.

    For more information on Credential Manager, click here.

    License Add-Ons

    All the connectors will have a Base Connector License by default that allows you to crawl and profile to obtain the metadata and statistical information from a data source. 

    OvalEdge supports various License Add-Ons based on the connector’s functionality requirements.

    • Data Quality: Select the Data Quality Add-On license to identify, report, and resolve the data quality issues for a connector whose data supports data quality (DQ), using DQ rules/functions, anomaly detection, Reports, and more.

    Connector Name*

    Enter a connection name for the Cosmos DB database. The name you specify is a reference for your Cosmos DB database connection in OvalEdge. Example: Cosmos DB Connection DB1

    Connector Environment

    Select the environment configured for the connector from the Connector Environment drop-down list.

    For example, PROD or STG (based on the values configured for connector.environment in the OvalEdge configuration).

    The environment field helps you identify which type of system environment (Production, STG, or QA) the connector is connecting to.

    Note: The Configure Environment Variables section explains the steps to set up environment variables.

    Cosmos DB Connection String*

    Enter the Azure Cosmos DB Connection string.

    Plugin Server

    Enter the server name if you are running this as a plugin.

    Plugin Port

    The port number on which the plugin is running.

    Default Governance Roles

    Steward*

    Select the Steward from the drop-down list options.

    Custodian*

    Select the Custodian from the drop-down list options.

    Owner*

    Select the Owner from the drop-down list options.

    Governance Roles 4, 5, 6*

    Select the respective user from the drop-down options.

    Note: The drop-down list displays all the configurable roles (single user or a team) as per the configurations made in the OvalEdge Security > Governance Roles section.

    Admin Roles

    Integration Admins*

    To add Integration Admin Roles, search for or select one or more roles from the Integration Admin options and then click on the Apply button.
    The responsibility of the Integration Admin includes configuring crawling and profiling settings for the connector, as well as deleting connectors, schemas, or data objects.

    Security and Governance Admins*

    To add Security and Governance Admin roles, search for or select one or more roles from the list and then click on the Apply button.
    The Security and Governance Admin is responsible for:

    • Configuring role permissions for the connector and its associated data objects.
    • Adding admins to set permissions for the connector's roles and associated data objects.
    • Updating governance roles.
    • Creating custom fields.
    • Developing Service Request templates for the connector.
    • Creating Approval workflows for the templates.

    No. of Archive Objects*

    The number of archive objects indicates the number of recent metadata modifications made to a dataset at a remote/source location. The archive objects feature is deactivated by default. However, users may enable it by clicking the Archive toggle button and specifying the number of objects they wish to archive. 

    Select Bridge*

    With the OvalEdge Bridge component, any cloud-hosted server can connect with any on-premise or public cloud data sources without modifying firewall rules. A bridge provides real-time control that makes managing data movement between any source and destination easy. 

    For more information, refer to the Bridge Overview.

  6. After entering all the connection details, select the appropriate button based on your preferences.
    1. Validate: Click on the Validate button to verify the connection details. This ensures that the provided information is accurate and enables successful connection establishment.
    2. Save: Click on the Save button to store the connection details. Once saved, the connection will be added to the Connectors home page for easy access.
    3. Save & Configure: For certain connectors that require additional configuration settings, click the Save & Configure button. This will open the Connection Settings pop-up window, allowing you to configure the necessary settings before saving the connection.
  7. Once the connection is validated and saved, it will be displayed on the Connectors home page.

Note: You can either save the connection details first or validate the connection first and then save it.
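Before entering a value in the Cosmos DB Connection String field, it may help to check that the string is well formed. The sketch below assumes the common Azure Cosmos DB format of semicolon-separated `AccountEndpoint=...;AccountKey=...;` pairs; the function name `parse_cosmos_connection_string`, the example account name, and the placeholder key are hypothetical and are not part of OvalEdge.

```python
def parse_cosmos_connection_string(conn_str):
    """Split a Cosmos DB connection string into its key/value parts.

    Hypothetical pre-check helper; it only validates the shape of the string.
    """
    parts = {}
    for segment in conn_str.split(";"):
        segment = segment.strip()
        if not segment:
            continue  # tolerate a trailing semicolon
        # Split on the first "=" only, because base64 keys can end with "="
        key, sep, value = segment.partition("=")
        if not sep:
            raise ValueError("Malformed segment (expected key=value)")
        parts[key] = value
    missing = {"AccountEndpoint", "AccountKey"} - parts.keys()
    if missing:
        raise ValueError(f"Missing required keys: {sorted(missing)}")
    return parts


# Placeholder values for illustration; not a real account or key
example = (
    "AccountEndpoint=https://example-account.documents.azure.com:443/;"
    "AccountKey=ZmFrZS1rZXk=;"
)
parsed = parse_cosmos_connection_string(example)
print(parsed["AccountEndpoint"])
```

A check like this only catches formatting mistakes. The Validate button remains the way to confirm that the endpoint and key actually work.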

Error Validation Details 

The following are the possible error messages encountered during the validation. 


Sl. No. | Error Message | Description
1 | error_validate_connection | An alert message is displayed when the provided details are incorrect.

Note: If you have issues creating a connection, please contact your assigned OvalEdge Customer Success Management (CSM) team.

Connector Settings 

Once the connection is established successfully, various settings are provided to fetch and analyze the information from the data source.  

The connection settings include Crawler, Profiler, Access Instruction, Business Glossary Settings, and Others.

To view the Connector Settings page:

  1. Go to the Connectors page.
  2. Click the nine-dots icon and select the Settings option.
  3. The Connector Settings page is displayed, where you can view all the connector setting options.
  4. After updating the settings, click on Save Changes. All the settings will be applied to the metadata.

The following is a list of connection settings along with their corresponding descriptions:

Connection Settings | Description
Crawler | Crawler settings are configured to connect to a data source and to collect and catalog all the data elements in the form of metadata.
Profiler | Profiling gathers statistics and informative summaries about the connected data source(s). These statistics can help assess the quality of data sources before using them for analysis. Profiling is always optional; crawling can be run without profiling.
Access Instruction | Access Instruction allows the data owner to instruct others on using the objects in the application.
Business Glossary Settings | The Business Glossary Settings give users flexibility and control over how they view and manage term associations within the context of a business glossary at the connector level.
Others | The Enable/Disable Metadata Change Notifications option controls notifications about metadata changes to the data objects. You can use the toggle button to set the Default Governance Roles (Steward, Owner, Custodian, etc.). Using Roles and Teams, you can select the role and team that receive the notifications of metadata changes.

Note: For more information, refer to the Connector Settings.

Crawling of Schema(s)

The Crawl/Profile option allows you to select specific schemas for the following operations: Crawl, Crawl & Profile, Profile, or Profile Unprofiled. For scheduled crawlers and profilers, the defined run date and time are displayed.

  1. Navigate to the Connectors page and click on the Crawl/Profile button.
    The Select Important Schema For Crawling and Profiling pop-up window is displayed.
  2. Select the required Schema(s).
  3. The list of actions below is displayed in the Action section.
    1. Crawl: Crawls the metadata of the selected schemas.
    2. Crawl & Profile: Crawls the metadata of the selected schemas and profiles the sample data.
    3. Profile: Collects table and column statistics.
    4. Profile Unprofiled: Profiles only the data that has not yet been profiled.
    5. Schedule: Connectors can also be scheduled in advance to run crawling and/or profiling at prescribed times and selected intervals.
      Note: For more information on Scheduling, refer to Scheduling Connector.

Click on the Run button, which gathers all metadata from the connected source and puts it into the OvalEdge Data Catalog.

Additional Information


Question: Is Azure Cosmos DB also available on-premises?

Answer: Azure Cosmos DB is a cloud solution and, therefore, does not have an on-premises version.