Amazon S3

This document describes the OvalEdge connector for Amazon S3. The connector supports metadata management through features such as crawling and data preview, and it authenticates securely through Credential Manager.

Overview

Connector Details

Connector Category: Cloud Storage

OvalEdge Release Current Connector Version: 6.3.4

Connectivity [How OvalEdge connects to Amazon S3]: AWS S3 SDK

OvalEdge Releases Supported (Available from): Release 4.0

Connector Features

Feature Availability
Crawling / Cataloging
Delta Crawling
Profiling
Query Sheet NA
Data Preview
Auto Lineage NA
Manual Lineage
Secure Authentication via Credential Manager
Data Quality
DAM (Data Access Management)
Bridge

Metadata Mapping

The following objects are crawled from Amazon S3 and mapped to the corresponding UI assets.

| Amazon S3 Object | Amazon S3 Attribute | OvalEdge Attribute | OvalEdge Category | OvalEdge Type |
|---|---|---|---|---|
| Bucket | Bucket | Bucket | Bucket | Bucket |
| Folder | Folder | Folder | Folder | Folder |
| File | File | File | File | File |
| XLSX | File | File | File | XLSX |
| XLS | File | File | File | XLS |
| CSV | File | File | File | CSV |
| TXT | File | File | File | TXT |
| PARQUET | File | File | File | PARQUET |
| ORC | File | File | File | ORC |
| JSON | File | File | File | JSON |
| YAML | File | File | File | YAML |
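
The file rows of the mapping above can be read as a lookup from file extension to OvalEdge Type. The following is a minimal sketch of that lookup, not OvalEdge's implementation. The function name `ovaledge_type_for` is hypothetical, and the fallback to the generic `File` type for extensions not listed in the table is an assumption.

```python
import os

# File extensions listed in the mapping table, in lowercase.
KNOWN_FILE_TYPES = {"xlsx", "xls", "csv", "txt", "parquet", "orc", "json", "yaml"}


def ovaledge_type_for(file_name: str) -> str:
    """Return the OvalEdge Type for a file name, following the mapping table.

    Assumption: extensions that are not in the table map to the generic "File" type.
    """
    extension = os.path.splitext(file_name)[1].lstrip(".").lower()
    return extension.upper() if extension in KNOWN_FILE_TYPES else "File"


print(ovaledge_type_for("sales/2024/orders.csv"))  # CSV
print(ovaledge_type_for("events.PARQUET"))         # PARQUET
print(ovaledge_type_for("archive.zip"))            # File
```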

Set up a Connection 

Prerequisites

The following are the prerequisites to establish a connection:

Service Account User Permissions

Important: It is recommended to use a separate service account to establish the connection to the data source, configured with the following minimum set of permissions.

Note: 👨‍💻Who can provide these permissions? These permissions are typically granted by the Amazon S3 administrator, as users may not have the required access to assign them independently.

| Objects | Access Permission |
|---|---|
| Buckets | ListAllMyBuckets, GetBucketLocation, GetBucketTagging, GetEncryptionConfiguration |
| Folder | ListBucket, GetBucketLocation, GetEncryptionConfiguration |
| Files | ListBucket, GetBucketLocation, GetEncryptionConfiguration |
| Profile | GetObject |
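
As an illustration, the permissions above could be expressed as an IAM policy document for the service account. The sketch below only builds the JSON and does not call AWS. The bucket name `example-bucket`, the statement IDs, the function name `build_read_only_policy`, and the way actions are scoped to bucket and object resources are assumptions; an Amazon S3 administrator should adapt them to the actual environment.

```python
import json

# Actions from the permissions table, written with the "s3:" IAM prefix.
ACCOUNT_ACTIONS = ["s3:ListAllMyBuckets"]
BUCKET_ACTIONS = [
    "s3:ListBucket",
    "s3:GetBucketLocation",
    "s3:GetBucketTagging",
    "s3:GetEncryptionConfiguration",
]
OBJECT_ACTIONS = ["s3:GetObject"]  # needed for profiling


def build_read_only_policy(bucket_name: str) -> dict:
    """Build a read-only policy for one bucket (hypothetical helper)."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Listing all buckets is not tied to a single bucket resource.
            {"Sid": "ListBuckets", "Effect": "Allow",
             "Action": ACCOUNT_ACTIONS, "Resource": "*"},
            # Bucket-level actions used while crawling buckets, folders, and files.
            {"Sid": "CrawlBucket", "Effect": "Allow",
             "Action": BUCKET_ACTIONS, "Resource": bucket_arn},
            # Object-level read access used while profiling.
            {"Sid": "ProfileObjects", "Effect": "Allow",
             "Action": OBJECT_ACTIONS, "Resource": f"{bucket_arn}/*"},
        ],
    }


policy = build_read_only_policy("example-bucket")  # assumed bucket name
print(json.dumps(policy, indent=2))
```

If profiling is not required, the `ProfileObjects` statement can be left out, since `GetObject` is listed only for the Profile operation.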

Connection Configuration Steps

Important: Users are required to have the Connector Creator role in order to configure a new connection.

  • Log in to OvalEdge, go to Administration > Connectors, click + (New Connector), search for Amazon S3, and complete the required parameters.

Note: Fields marked with an asterisk (*) are mandatory for establishing a connection.

Field Name

Description

Connector Type

By default, "Amazon S3" is displayed as the selected connector type.

Connector Settings

Authentication*

The following two types of authentication are supported for Amazon S3:

  • IAM User Authentication
  • Role Based Authentication

Credential Manager*

Select the desired credential manager from the drop-down list. Relevant parameters will be displayed based on your selection.

Supported Credential Managers:

  • OE Credential Manager
  • AWS Secrets Manager
  • HashiCorp
  • Azure Key Vault

License Add Ons

Auto Lineage: Not Supported

Data Quality: Supported

Data Access: Supported

  • Select the checkbox for Data Quality Add-On to identify data quality issues using data quality rules and anomaly detection.
  • Select the checkbox for Data Access Add-On to enable the data access functionality.

Connector Name*

Enter a unique name for the Amazon S3 connection (Example: "AmazonS3db").

Connector Environment

Select the environment (Example: PROD, STG) configured for the connector.

Access key*

Enter the AWS Access Key ID used to authenticate your IAM user.

Note: This field is available only when Authentication is set to "IAM User Authentication".

Secret key*

Enter the AWS Secret Access Key associated with the Access Key ID.

Note: This field is available only when Authentication is set to "IAM User Authentication".

Cross-Account Role ARN

Enter the ARN (Amazon Resource Name) of the role used for cross-account access.

Note: This field is available only when Authentication is set to "Role Based Authentication".

Filter by tags

Enter one or more tags to narrow down and display only the items associated with those tags.

Region

Enter the region where your Amazon S3 files or resources are located.

Default Governance Roles

Default Governance Roles*

Select the appropriate users or teams for each governance role from the drop-down list. All users configured in the security settings are available for selection.

Admin Roles

Admin Roles*

Select one or more users from the drop-down list for Integration Admin and Security & Governance Admin. All users configured in the security settings are available for selection.

No of Archive Objects

No Of Archive Objects*

This shows the number of recent metadata changes to a dataset at the source. By default, it is off. To enable it, toggle the Archive button and specify the number of objects to archive.

Example: Setting it to 4 retrieves the last four changes, displayed in the 'Version' column of the 'Metadata Changes' module.

Bridge

Select Bridge*

If applicable, select the bridge from the drop-down list. The drop-down list displays all active bridges that have been configured. These bridges facilitate communication between data sources and the system without requiring changes to firewall rules.

  • After entering all connection details, the following actions can be performed:
    • Click Validate to verify the connection.
    • Click Save to store the connection for future use.
    • Click Save & Configure to apply additional settings before saving.
  • The saved connection will appear on the Connectors home page.
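
Before entering a value in the Cross-Account Role ARN field, it can help to check that the ARN is well formed and to prepare the `sts:AssumeRole` permission that Role Based Authentication relies on. The sketch below is illustrative only and does not contact AWS. The helper names `parse_role_arn` and `assume_role_statement`, the example ARN, and the regular expression (an approximation of the IAM role ARN format) are assumptions.

```python
import json
import re

# Approximate shape: arn:<partition>:iam::<12-digit account ID>:role/<path and name>
ROLE_ARN_PATTERN = re.compile(
    r"^arn:(aws|aws-cn|aws-us-gov):iam::(\d{12}):role/([\w+=,.@/-]+)$"
)


def parse_role_arn(role_arn: str) -> dict:
    """Split an IAM role ARN into its parts, or raise ValueError if malformed."""
    match = ROLE_ARN_PATTERN.match(role_arn.strip())
    if match is None:
        raise ValueError(f"Not a valid IAM role ARN: {role_arn!r}")
    partition, account_id, role = match.groups()
    return {"partition": partition, "account_id": account_id, "role": role}


def assume_role_statement(role_arn: str) -> dict:
    """Build a policy statement that allows assuming the given role."""
    parse_role_arn(role_arn)  # validate before building the statement
    return {"Effect": "Allow", "Action": "sts:AssumeRole", "Resource": role_arn}


example_arn = "arn:aws:iam::123456789012:role/ovaledge-crawler"  # hypothetical ARN
print(parse_role_arn(example_arn))
print(json.dumps(assume_role_statement(example_arn), indent=2))
```

The resulting statement would be attached to the identity that OvalEdge authenticates as, so that it is allowed to assume the cross-account role.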

    Manage Connector Operations

    Crawl/Profile

    Important: To perform crawl and profile operations, users must be assigned the Integration Admin role.

    The Crawl/Profile button allows users to select one or more schemas for crawling and profiling. 

    • Navigate to the Connectors page and click Crawl/Profile.
    • Select the schemas to be crawled.
    • The Crawl option is selected by default. To perform both operations, select the Crawl & Profile radio button.
    • Click Run to collect metadata from the connected source and load it into the Data Catalog.
    • After a successful crawl, the information appears in the Data Catalog > Databases/Files/File Columns tab.

    The Schedule checkbox allows automated crawling and profiling at defined intervals, from a minute to a year.

    • Click the Schedule checkbox to enable the Select Period drop-down.
    • Select a time interval for the operation from the drop-down menu.
    • Click Schedule to initiate metadata collection from the connected source.
    • The system will automatically execute the selected operation (Crawl or Crawl & Profile) at the scheduled time.

    Other Operations

    The Connectors page provides a centralized view of all configured connectors, along with their health status.

    Managing connectors includes:

    • Connectors Health: Displays the current status of each connector using a green icon for active connections and a red icon for inactive connections, helping to monitor the connectivity with data sources.
    • Viewing: Click the Eye icon next to the connector name to view connector details, including databases, tables, columns, and codes.

    Nine Dots Menu Options:

    To view, edit, validate, build lineage, configure, or delete connectors, click on the Nine Dots menu.

    • Edit Connector: Update and revalidate the data source.
    • Validate Connector: Check the connection's integrity.
    • Settings: Modify connector settings.
      • Crawler: Configure data extraction.
      • Access Instructions: Add notes on how data can be accessed.
      • Business Glossary Settings: Manage term associations at the connector level.
      • Anomaly Detection Settings: Configure anomaly detection preferences at the connector level.
      • Others: Configure notification recipients for metadata changes.
    • Delete Connector: Remove a connector with confirmation.

    Connectivity Troubleshooting

    If incorrect parameters are entered, error messages may appear. Ensure all inputs are accurate to resolve these issues. If issues persist, contact the assigned support team.

    | S.No. | Error Message(s) | Error Description/Resolution |
    |---|---|---|
    | 1 | Error while validating connection: Please provide valid credentials: The AWS Access Key Id you provided does not exist in our records. (Service: Amazon S3; Status Code: 403; Error Code: InvalidAccessKeyId; Request ID: 73GVA0Y9H15Q5K7G; S3 Extended Request ID: jmNMT5vyMU9kEiT68EgfY6IYRwTdvzSh+51qL/6IzxpguBCYe7e1JOJYLpbHOl1t2mqyKlmArTw=; Proxy: null) | Error Description: Invalid access key. Resolution: Provide a valid access key. |
    | 2 | Error while validating connection: Please provide valid credentials: The request signature we calculated does not match the signature you provided. Check your key and signing method. If you start to see this issue after you upgrade the SDK to 1.12.460 or later, it could be because the bucket provided contains '/'. (Service: Amazon S3; Status Code: 403; Error Code: SignatureDoesNotMatch; Request ID: NWGSQ9BDSZ2A3H5H; S3 Extended Request ID: 319yH7h/x76swRiPpjxjs8KB/6dLrdGHrrAJs9rD2/HgQWudiMCQJMzj1ItUQAJ1zEsVm/YsCbU=; Proxy: null) | Error Description: Invalid secret key. Resolution: Provide a valid secret key. |
    | 3 | Error while validating connection: Exception while fetching AWSCredentialsProvider : User: arn:aws:iam::479930578883:user/connector_testing is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::479930578883:role/airflow_MWAA (Service: AWSSecurityTokenService; Status Code: 403; Error Code: AccessDenied; Request ID: 6bd3e40e-6e9c-43e9-8f51-e631727b6afe; Proxy: null) | Error Description: The AssumeRole permission is missing for cross-account role authentication. Resolution: Create a policy with the AssumeRole permission and assign it to the respective authentication role. |
    | 4 | Error while validating connection: Incorrect Account ID! | Error Description: Invalid account ID. Resolution: Provide a valid account ID. |
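
The AWS error messages above carry an `Error Code` field that identifies the cause. As a rough aid for triage, the sketch below extracts that code from a validation message and returns the matching resolution from the table. It is not part of OvalEdge; the function name `suggest_resolution` and the fallback text are assumptions, and only the four errors listed above are covered.

```python
import re

# Resolutions taken from the troubleshooting table, keyed by AWS error code.
RESOLUTIONS = {
    "InvalidAccessKeyId": "Provide a valid access key.",
    "SignatureDoesNotMatch": "Provide a valid secret key.",
}
ASSUME_ROLE_RESOLUTION = (
    "Create a policy with the AssumeRole permission and assign it to the "
    "respective authentication role."
)
FALLBACK = "Unrecognized error; contact the assigned support team."

ERROR_CODE_PATTERN = re.compile(r"Error Code:\s*(\w+)")


def suggest_resolution(error_message: str) -> str:
    """Map a connection validation error message to a suggested resolution."""
    match = ERROR_CODE_PATTERN.search(error_message)
    code = match.group(1) if match else None
    # AccessDenied is generic, so only treat it as the AssumeRole case
    # when the message also mentions AssumeRole.
    if code == "AccessDenied" and "AssumeRole" in error_message:
        return ASSUME_ROLE_RESOLUTION
    if code in RESOLUTIONS:
        return RESOLUTIONS[code]
    if "Incorrect Account ID" in error_message:
        return "Provide a valid account ID."
    return FALLBACK


message = (
    "Error while validating connection: Please provide valid credentials: "
    "The AWS Access Key Id you provided does not exist in our records. "
    "(Service: Amazon S3; Status Code: 403; Error Code: InvalidAccessKeyId)"
)
print(suggest_resolution(message))
```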

       


      Copyright © 2025, OvalEdge LLC, Peachtree Corners GA USA