Elasticsearch is an open-source, distributed search and analytics engine designed for horizontal scalability, meaning it can easily handle and process large amounts of data across multiple nodes or servers. It is part of the Elastic Stack, which also includes Logstash for data collection and processing, and Kibana for data visualization and exploration.
Elasticsearch is built on top of Apache Lucene, a full-text search engine library. It is commonly used for real-time search scenarios, where organizations need to quickly retrieve and analyze data from large volumes of unstructured or semi-structured data. Elasticsearch is often employed in various applications, such as log and event data analysis, text and document search, and business intelligence.
OvalEdge on the other hand, offers a user-friendly interface that enables connectivity to Elasticsearch. It integrates with Elasticsearch using Rest APIs and crawls the mapping templates.
Connector Capabilities
The connector capabilities are shown below:
Crawling
Feature |
Supported Objects |
---|---|
Crawling |
Index Mapping Template |
Profiling
Feature |
Supported Objects |
Remarks |
---|---|---|
Table Profiling |
Row Count, Columns Count, View Sample data |
- |
View Profiling |
Row Count, Columns Count, View sample data |
View is treated as a table for profiling purposes. |
Column Profiling |
Min, Max, Null Count, Distinct, Top 50 values |
- |
Full Profiling |
Supported |
- |
Sample Profiling |
Supported |
- |
Prerequisites
The following are prerequisites for connecting to the Elasticsearch On-Prem:
Drivers
The APIs/drivers used by the connector are given below:
S.No. |
Driver / API |
---|---|
1 |
Rest API |
Service Account Permissions
A service account is required for crawling. The minimum privileges required are:
Operation |
Access Permission |
---|---|
Connection validate |
Read |
Crawl objects |
Read |
Profile indexes |
Read |
Configuring Environment Variables
Configuring environment names enables you to select the appropriate environment from the drop-down list when adding a connector. This allows for consistent crawling of schemas across different environments, such as production (PROD), staging (STG), or temporary environments. It also facilitates schema comparisons and assists in application upgrades by providing a temporary environment that can be later deleted if needed.
Before establishing a connection, it is important to configure the environment names for the specific connector. If your environments have been configured, skip this step.
Steps to Configure the Environment
- Log into the OvalEdge application.
- Navigate to Administration > System Settings.
- Select the Connector tab.
- Find the key name “connector.environment”.
- Enter the desired environment values (PROD, STG) in the Value column.
- Click ✔ to Save.
Establish a Connection
To connect to the Elasticsearch On-Prem using the OvalEdge application, complete the following steps:
1. Log in to the OvalEdge application.2. Navigate to Administration > Connectors.
3. Click on the + (New Connector) icon, and the Add Connection with Search Connector pop-up window is displayed.
4. Add Connector pop-up window is displayed where you can search for the Elasticsearch On-Prem connector.
5. The Add Connector with Connector Type specific details pop-up window is displayed. Enter the relevant information to configure the Elasticsearch On-Prem connection.
Note: The asterisk (*) denotes mandatory fields required for establishing a connection.
Field Name |
Description |
---|---|
Connector Type |
By default, 'Elastic Search On Premise' is displayed as the selected connector type. If you want you can select the connector from the drop-down list. |
Credential Manager |
Select the option from the drop-down list, where you want to save your credentials. OE Credential Manager: The connection is configured with the basic Username and Password of the service account in real-time when OvalEdge establishes a connection to the Elasticsearch On-Prem. HashiCorp: The credentials are stored in the HashiCorp database server and fetched from HashiCorp to OvalEdge. AWS Secrets Manager: The credentials are stored in the AWS Secrets Manager database server and fetched from the AWS Secrets Manager to OvalEdge. AzureKeyVault: The credentials are stored in the Azure Key Vault database server and fetched from the Azure Key Vault to OvalEdge. Click here to know more. For more information on Credential Manager, refer to Credential Manager. |
License Add Ons |
All the connectors will have a Base Connector License by default that allows you to crawl and profile to obtain the metadata and statistical information from a data source. |
Connector Name* |
Provide a connector name for the Elasticsearch On-Prem data source in OvalEdge. This name will serve as a reference to identify the specific Elasticsearch On-Prem database connection. Example: "ESOnPrem_Connection_test" |
Connector Environment |
The Connector Environment drop-down list allows you to select the environment configured for the connector from the drop-down list. For example, PROD, or STG (based on the configured items in the OvalEdge configuration for the connector.environment). The purpose of the environment field is to help you identify which connector is connecting what type of system environment (Production, STG, or QA). Note: The steps to set up environment variables are explained in the Configuring Environment Variables section. |
Base URL* |
Specify the name of the database instance server IP/URL, which is accessible by the OvalEdge application. Format: http://{ip address}:{port} |
Username* |
Enter the Service Account Username of the Elasticsearch On-Prem. |
Password* |
Enter the password of the Elasticsearch On-Prem data source. |
Steward* |
Select the Steward from the drop-down list options. |
Custodian* |
Select the Custodian from the drop-down list options. |
Owner* |
Select the Owner from the drop-down list options. |
Governance Roles 4, 5, 6* |
Select the respective user from the drop-down options. Note: The drop-down list displays all the configurable roles (single user or a team) as per the configurations made in the OvalEdge Security > Governance Roles section. |
Integration Admins* |
To add Integration Admin Roles, search for or select one or more roles from the Integration Admin options, and then click on the Apply button. |
Security and Governance Admins* |
To add Security and Governance Admin roles, search for or select one or more roles from the list, and then click on the Apply button.
|
No. of Archived Objects* |
The number of archive objects indicates the number of recent metadata modifications made to a dataset at a remote/source location. By default, the archive objects feature is deactivated. However, users may enable it by clicking the Archive toggle button and specifying the number of objects they wish to archive. |
Select Bridge |
With the OvalEdge Bridge component, any cloud-hosted server can connect with any on-premise or public cloud data sources without modifying firewall rules. A bridge provides real-time control that makes it easy to manage data movement between any source and destination. For more information, refer to Bridge Overview. For more information, refer to Bridge Overview |
6. After filling in all the connection details, select the appropriate button based on your preferences.
-
- Validate: Click on the Validate button to verify the connection details. This ensures that the provided information is accurate and enables successful connection establishment.
- Save: Click on the Save button to store the connection details. Once saved, the connection will be added to the Connectors home page for easy access.
- Save & Configure: For certain Connectors that require additional configuration settings. Click on the Save & Configure button. This will open the Connection Settings pop-up window, allowing you to configure the necessary settings before saving the connection.
7. Once the connection is validated and saved, it will be displayed on the Connectors home page.
Note: You can either save the connection details first, or you can validate the connection first and then save it.
Connection Validation Details
S.No |
Error Message(s) |
Description |
1 |
Failed to establish a connection, please check the credentials. |
In case of an invalid username and password. |
Note: If you have any issues creating a connection, please contact your assigned OvalEdge Customer Success Management (CSM) team.
Connector Settings
Once the connection is established successfully, various settings are provided to fetch and analyze the information from the data source. The connection settings include Crawler, Profiler, Access Instruction, Business Glossary Settings, and Notification.
To view the Connector Settings page,
- Go to the Connectors page.
- From the nine dots select the Settings option.
- The Connector Settings page is displayed where you can view all the connector setting options.
- Click on Save Changes. All the settings will be applied to the metadata.
Connection Settings |
Description |
Crawler |
Crawler settings are configured to connect to a data source and collect and catalog all the data elements in the form of metadata. |
Profiler |
The process of gathering statistics and informative summaries about the connected data source(s). Statistics can help assess the quality of data sources before using them for analysis. Profiling is always optional; crawling can be run without profiling. |
Access Instruction |
Access Instruction allows the data owner to instruct others on using the objects in the application. |
Business Glossary Settings |
The Business Glossary Setting provides flexibility and control over how they view and manage term association within the context of a business glossary at the connector level. |
Notification |
The Enable/Disable Metadata Change Notifications option is used to set the change notification about the metadata changes of the data objects.
|
Note: For more information on Connector settings, check the Connector Settings article.
Crawling of Schema(s)
A Crawl/Profile option allows you to select the specific schemas for the following operations: Crawl, Crawl & Profile, Profile, or Profile Unprofiled. For any scheduled crawlers and profilers, the defined run date and time are displayed to set.
1. Navigate to the Connectors page, and click on the Crawl/Profile button.Select Schema For Crawling and Profiling pop-up window is displayed.
Note: By default the schema name is shown as Default schema for every Elasticsearch on-prem connector.
-
- Crawl: It allows the crawling of the metadata of the selected schemas.
- Crawl & Profile: It allows crawling the metadata of the selected schemas and profiles the sample data.
- Profile: It allows the collection of table column statistics.
- Profile Unprofiled: It allows the profiling of data that has not been profiled.
Schedule: Connectors can also be scheduled for crawling and/or profiling in advance to run at prescribed times and selected intervals.
Note: For more information on Scheduling, refer to Scheduling Connector.