This document describes how OvalEdge integrates with Amazon S3 through the Amazon S3 connector. The connector supports metadata management features such as crawling, profiling, and data preview, and it supports secure authentication via Credential Manager.
Overview
Connector Details
Connector Category | Cloud Storage |
---|---|
OvalEdge Release Current Connector Version | 6.3.4 |
Connectivity [How OvalEdge connects to Amazon S3] | AWS S3 SDK |
OvalEdge Releases Supported (Available from) | Release 4.0 |
Connector Features
Feature | Availability |
---|---|
Crawling / Cataloging | ✅ |
Delta Crawling | ❌ |
Profiling | ✅ |
Query Sheet | NA |
Data Preview | ✅ |
Auto Lineage | NA |
Manual Lineage | ✅ |
Secure Authentication via Credential Manager | ✅ |
Data Quality | ✅ |
DAM (Data Access Management) | ✅ |
Bridge | ✅ |
Metadata Mapping
The following objects are crawled from Amazon S3 and mapped to the corresponding UI assets.
Amazon S3 Object | Amazon S3 Attribute | OvalEdge Attribute | OvalEdge Category | OvalEdge Type |
---|---|---|---|---|
Bucket | Bucket | Bucket | Bucket | Bucket |
Folder | Folder | Folder | Folder | Folder |
File | File | File | File | File |
XLSX | File | File | File | XLSX |
XLS | File | File | File | XLS |
CSV | File | File | File | CSV |
TXT | File | File | File | TXT |
PARQUET | File | File | File | PARQUET |
ORC | File | File | File | ORC |
JSON | File | File | File | JSON |
YAML | File | File | File | YAML |
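The mapping above can be sketched as a small lookup. This is an illustration only: the function name `ovaledge_type` is hypothetical, and the assumptions that the type is derived from the file extension and that keys ending in "/" are folders are not confirmed by this document.

```python
from pathlib import PurePosixPath

# OvalEdge Type values taken from the mapping table above.
KNOWN_TYPES = {"XLSX", "XLS", "CSV", "TXT", "PARQUET", "ORC", "JSON", "YAML"}


def ovaledge_type(key: str) -> str:
    """Return the OvalEdge Type for an S3 object key (illustrative only)."""
    # Assumption: a key ending in "/" represents a folder.
    if key.endswith("/"):
        return "Folder"
    # Assumption: the type is derived from the file extension.
    ext = PurePosixPath(key).suffix.lstrip(".").upper()
    # Extensions not listed in the table fall back to the generic File type.
    return ext if ext in KNOWN_TYPES else "File"
```

For example, `ovaledge_type("sales/2024/orders.csv")` falls into the CSV row of the table, while a key with an unlisted extension is treated as a generic File.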
Set up a Connection
Prerequisites
The following are the prerequisites to establish a connection:
Service Account User Permissions
Important: It is recommended to use a separate service account to establish the connection to the data source, configured with the following minimum set of permissions.
Note: Who can provide these permissions? These permissions are typically granted by the Amazon S3 administrator, as users may not have the required access to assign them independently.
Objects | Access Permission |
---|---|
Buckets | ListAllMyBuckets, GetBucketLocation, GetBucketTagging, GetEncryptionConfiguration |
Folder | ListBucket, GetBucketLocation, GetEncryptionConfiguration |
Files | ListBucket, GetBucketLocation, GetEncryptionConfiguration |
Profile | GetObject |
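As a rough sketch, the permissions above could be expressed as an IAM policy document like the one built below. The `s3:` action prefix is the standard IAM form of the permission names listed in the table. The helper name `build_policy`, the bucket name `example-bucket`, and the way resources are scoped per statement are assumptions for illustration; the S3 administrator may scope them differently.

```python
import json


def build_policy(bucket: str) -> dict:
    """Build a minimal policy document for one bucket (hypothetical helper)."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Listing all buckets is account-level, so it needs a wildcard resource.
                "Sid": "ListBuckets",
                "Effect": "Allow",
                "Action": ["s3:ListAllMyBuckets"],
                "Resource": "*",
            },
            {   # Bucket-level actions used for buckets, folders, and files.
                "Sid": "CrawlBucket",
                "Effect": "Allow",
                "Action": [
                    "s3:ListBucket",
                    "s3:GetBucketLocation",
                    "s3:GetBucketTagging",
                    "s3:GetEncryptionConfiguration",
                ],
                "Resource": bucket_arn,
            },
            {   # Object-level read, listed in the table for profiling.
                "Sid": "ProfileObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so it can be reviewed before it is attached.
    print(json.dumps(build_policy("example-bucket"), indent=2))
```

Splitting the statements by resource level keeps the object-level read permission separate, so it can be omitted if profiling is not needed.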
Connection Configuration Steps
Important: Users are required to have the Connector Creator role in order to configure a new connection.
- Log into OvalEdge, go to Administration > Connectors, click + (New Connector), search for Amazon S3, and complete the required parameters.
Note: Fields marked with an asterisk (*) are mandatory for establishing a connection.
Field Name | Description |
---|---|
Connector Type | By default, "Amazon S3" is displayed as the selected connector type. |
Connector Settings | |
Authentication* | The following two types of authentication are supported for Amazon S3: IAM User Authentication and Role Based Authentication. |
Credential Manager* | Select the desired credential manager from the drop-down list. Relevant parameters will be displayed based on your selection. Supported Credential Managers: |
License Add Ons | |
Connector Name* | Enter a unique name for the Amazon S3 connection (Example: "AmazonS3db"). |
Connector Environment | Select the environment (Example: PROD, STG) configured for the connector. |
Access key* | Enter the AWS Access Key ID used to authenticate your IAM user. Note: This field is available only when Authentication is set to "IAM User Authentication". |
Secret key* | Enter the AWS Secret Access Key associated with the Access Key ID. Note: This field is available only when Authentication is set to "IAM User Authentication". |
Cross-Account Role ARN | Enter the ARN (Amazon Resource Name) of the role used for cross-account access. Note: This field is available only when Authentication is set to "Role Based Authentication". |
Filter by tags | Enter one or more tags to narrow down and display only the items associated with those tags. |
Region | Enter the region where your Amazon S3 files or resources are located. |
Default Governance Roles | |
Default Governance Roles* | Select the appropriate users or teams for each governance role from the drop-down list. All users configured in the security settings are available for selection. |
Admin Roles | |
Admin Roles* | Select one or more users from the drop-down list for Integration Admin and Security & Governance Admin. All users configured in the security settings are available for selection. |
No of Archive Objects | |
No Of Archive Objects* | This shows the number of recent metadata changes to a dataset at the source. By default, it is off. To enable it, toggle the Archive button and specify the number of objects to archive. Example: Setting it to 4 retrieves the last four changes, displayed in the 'Version' column of the 'Metadata Changes' module. |
Bridge | |
Select Bridge* | If applicable, select the bridge from the drop-down list. The drop-down list displays all active bridges that have been configured. These bridges facilitate communication between data sources and the system without requiring changes to firewall rules. |
- After entering all connection details, the following actions can be performed:
- Click Validate to verify the connection.
- Click Save to store the connection for future use.
- Click Save & Configure to apply additional settings before saving.
- The saved connection will appear on the Connectors home page.
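Before clicking Validate, it can help to check that the fields tied to the chosen authentication type are filled in. The sketch below is not OvalEdge's validation logic. The helper `find_problems` is hypothetical, and treating Cross-Account Role ARN as required for Role Based Authentication is an assumption, since the field is not marked with an asterisk. The regular expression follows the standard IAM role ARN layout.

```python
import re

# Fields the table ties to each Authentication choice.
# Assumption: the role ARN is effectively required for Role Based Authentication.
REQUIRED_BY_AUTH = {
    "IAM User Authentication": ["Access key", "Secret key"],
    "Role Based Authentication": ["Cross-Account Role ARN"],
}

# Standard IAM role ARN layout: arn:<partition>:iam::<12-digit account ID>:role/<name>
ROLE_ARN = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/[\w+=,.@/-]+$")


def find_problems(auth: str, values: dict) -> list:
    """Return human-readable problems found in the entered values (hypothetical check)."""
    if auth not in REQUIRED_BY_AUTH:
        return [f"Unknown authentication type: {auth}"]
    # Connector Name is mandatory regardless of the authentication type.
    required = ["Connector Name"] + REQUIRED_BY_AUTH[auth]
    problems = [f"{name} is required" for name in required if not values.get(name)]
    arn = values.get("Cross-Account Role ARN")
    if auth == "Role Based Authentication" and arn and not ROLE_ARN.match(arn):
        problems.append("Cross-Account Role ARN is not a valid IAM role ARN")
    return problems
```

An empty list means none of these basic checks failed; the connection can still fail validation for reasons this sketch does not cover, such as missing permissions.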
Manage Connector Operations
Crawl/Profile
Important: To perform crawl and profile operations, users must be assigned the Integration Admin role.
The Crawl/Profile button allows users to select one or more schemas for crawling and profiling.
- Navigate to the Connectors page and click Crawl/Profile.
- Select the schemas to be crawled.
- The Crawl option is selected by default. To perform both operations, select the Crawl & Profile radio button.
- Click Run to collect metadata from the connected source and load it into the Data Catalog.
- After a successful crawl, the information appears in the Data Catalog > Databases/Files/File Columns tab.
The Schedule checkbox allows automated crawling and profiling at defined intervals, from a minute to a year.
- Click the Schedule checkbox to enable the Select Period drop-down.
- Select a time interval for the operation from the drop-down menu.
- Click Schedule to initiate metadata collection from the connected source.
- The system will automatically execute the selected operation (Crawl or Crawl & Profile) at the scheduled time.
Other Operations
The Connectors page provides a centralized view of all configured connectors, along with their health status.
Managing connectors includes:
- Connectors Health: Displays the current status of each connector using a green icon for active connections and a red icon for inactive connections, helping to monitor the connectivity with data sources.
- Viewing: Click the Eye icon next to the connector name to view connector details, including databases, tables, columns, and codes.
Nine Dots Menu Options:
To view, edit, validate, build lineage, configure, or delete connectors, click on the Nine Dots menu.
- Edit Connector: Update and revalidate the data source.
- Validate Connector: Check the connection's integrity.
- Settings: Modify connector settings.
- Crawler: Configure data extraction.
- Access Instructions: Add notes on how data can be accessed.
- Business Glossary Settings: Manage term associations at the connector level.
- Anomaly Detection Settings: Configure anomaly detection preferences at the connector level.
- Others: Configure notification recipients for metadata changes.
- Delete Connector: Remove a connector with confirmation.
Connectivity Troubleshooting
If incorrect parameters are entered, error messages may appear. Ensure all inputs are accurate to resolve these issues. If issues persist, contact the assigned support team.
S.No. | Error Message(s) | Error Description/Resolution |
---|---|---|
1 | Error while validating connection: Please provide valid credentials: The AWS Access Key Id you provided does not exist in our records. (Service: Amazon S3; Status Code: 403; Error Code: InvalidAccessKeyId; Request ID: 73GVA0Y9H15Q5K7G; S3 Extended Request ID: jmNMT5vyMU9kEiT68EgfY6IYRwTdvzSh+51qL/6IzxpguBCYe7e1JOJYLpbHOl1t2mqyKlmArTw=; Proxy: null) | Error Description: Invalid access key. Resolution: Provide a valid access key. |
2 | Error while validating connection: Please provide valid credentials: The request signature we calculated does not match the signature you provided. Check your key and signing method. If you start to see this issue after you upgrade the SDK to 1.12.460 or later, it could be because the bucket provided contains '/'. (Service: Amazon S3; Status Code: 403; Error Code: SignatureDoesNotMatch; Request ID: NWGSQ9BDSZ2A3H5H; S3 Extended Request ID: 319yH7h/x76swRiPpjxjs8KB/6dLrdGHrrAJs9rD2/HgQWudiMCQJMzj1ItUQAJ1zEsVm/YsCbU=; Proxy: null) | Error Description: Invalid secret key. Resolution: Provide a valid secret key. |
3 | Error while validating connection: Exception while fetching AWSCredentialsProvider : User: arn:aws:iam::479930578883:user/connector_testing is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::479930578883:role/airflow_MWAA (Service: AWSSecurityTokenService; Status Code: 403; Error Code: AccessDenied; Request ID: 6bd3e40e-6e9c-43e9-8f51-e631727b6afe; Proxy: null) | Error Description: The AssumeRole permission is missing for role-based authentication. Resolution: Create a policy with AssumeRole permission and assign it to the respective authentication role. |
4 | Error while validating connection: Incorrect Account ID! | Error Description: Invalid account ID. Resolution: Provide a valid account ID. |
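Because the AWS error messages embed an `Error Code:` field, the table above can be read as a lookup from error code to resolution. The following is a minimal sketch of that idea, not part of OvalEdge. The function name `suggest_resolution` is hypothetical, and the resolution strings are taken from the table.

```python
import re

# Resolutions taken from the troubleshooting table above.
KNOWN_CODES = {
    "InvalidAccessKeyId": "Provide a valid access key.",
    "SignatureDoesNotMatch": "Provide a valid secret key.",
}
ASSUME_ROLE_FIX = ("Create a policy with AssumeRole permission and assign it "
                   "to the respective authentication role.")
FALLBACK = "Contact the assigned support team."


def suggest_resolution(message: str) -> str:
    """Map a validation error message to a resolution from the table (hypothetical helper)."""
    # AWS SDK messages carry the code as "Error Code: <Name>;".
    match = re.search(r"Error Code: (\w+)", message)
    code = match.group(1) if match else None
    if code in KNOWN_CODES:
        return KNOWN_CODES[code]
    # AccessDenied has many causes; only the AssumeRole case is documented here.
    if code == "AccessDenied" and "AssumeRole" in message:
        return ASSUME_ROLE_FIX
    # The account ID error carries no AWS error code, so match on its text.
    if "Incorrect Account ID" in message:
        return "Provide a valid account ID."
    return FALLBACK
```

Messages that match none of the documented cases fall through to the support-team fallback, mirroring the guidance given before the table.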
Copyright © 2025, OvalEdge LLC, Peachtree Corners GA USA