AWS Glue ETL Connector
An out-of-the-box connector is available for AWS Glue ETL entities. It supports crawling Jobs, Workflows, Triggers, and Crawlers, and building lineage for these entities.
Connectivity Summary
OvalEdge connects to AWS Glue ETL through the AWS Glue SDK, which is included in the platform.
The Glue SDK used by the connector is given below:
Driver / API | Version | Details |
---|---|---|
AWS Glue SDK | 1.12.232 | https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-glue/1.12.232 Note: Latest version is 1.12.244. |
Technical Specifications
The connector capabilities are shown below:
The AWS Glue entities (Jobs, Workflows, Crawlers, and Triggers) are created as datasets. For a job, the connector extracts the job's script and builds lineage from it. For crawlers, triggers, and workflows, it extracts information about the entities involved and builds the associations accordingly.
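To make the job step concrete, the sketch below shows one way to locate a job's script from a Glue `GetJob` response, whose `Job.Command.ScriptLocation` field holds the S3 path of the script. It is written in Python for illustration and is not OvalEdge's implementation; the helper name `script_location` and the sample response are hypothetical.

```python
from urllib.parse import urlparse


def script_location(get_job_response):
    """Return (bucket, key) for the script of a Glue job.

    Expects the dict shape returned by the Glue GetJob API, where
    Job.Command.ScriptLocation is an S3 URI such as s3://bucket/path/script.py.
    Returns None when the response carries no script location.
    """
    location = (
        get_job_response.get("Job", {}).get("Command", {}).get("ScriptLocation")
    )
    if not location:
        return None
    parsed = urlparse(location)
    if parsed.scheme != "s3":
        raise ValueError(f"unexpected script location: {location}")
    return parsed.netloc, parsed.path.lstrip("/")


# Hypothetical, trimmed-down GetJob response used only for illustration.
sample = {
    "Job": {
        "Name": "orders_etl",
        "Command": {
            "Name": "glueetl",
            "ScriptLocation": "s3://example-bucket/scripts/orders_etl.py",
        },
    }
}
print(script_location(sample))  # ('example-bucket', 'scripts/orders_etl.py')
```

Once the script is located, parsing it for sources and targets is a separate step that this sketch does not attempt.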
Crawling
Feature | Supported Objects | Remarks |
---|---|---|
Crawling | Jobs | |
Crawling | Workflows | |
Crawling | Crawlers | |
Crawling | Triggers | |
Lineage Building
Lineage entities | Details |
---|---|
Jobs | In Progress |
Workflows | Supported |
Crawlers | Supported |
Triggers | Supported |
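For triggers, the associations can be read from the trigger definition itself. The Python sketch below walks the dict shape returned by the Glue `GetTrigger` API (`WorkflowName`, `Predicate.Conditions`, `Actions`) and emits simple edges. The edge format and the function name `trigger_edges` are assumptions made for illustration, not the connector's internal model.

```python
def trigger_edges(trigger):
    """Build (source, relation, target) edges for one Glue trigger.

    `trigger` is the value of the "Trigger" key in a GetTrigger response.
    Entities are written as "<kind>:<name>" strings.
    """
    name = f"trigger:{trigger['Name']}"
    edges = []

    workflow = trigger.get("WorkflowName")
    if workflow:
        edges.append((f"workflow:{workflow}", "contains", name))

    # Conditions name the jobs or crawlers the trigger waits on.
    for cond in trigger.get("Predicate", {}).get("Conditions", []):
        if "JobName" in cond:
            edges.append((f"job:{cond['JobName']}", "fires", name))
        if "CrawlerName" in cond:
            edges.append((f"crawler:{cond['CrawlerName']}", "fires", name))

    # Actions name the jobs or crawlers the trigger starts.
    for action in trigger.get("Actions", []):
        if "JobName" in action:
            edges.append((name, "starts", f"job:{action['JobName']}"))
        if "CrawlerName" in action:
            edges.append((name, "starts", f"crawler:{action['CrawlerName']}"))

    return edges


# Hypothetical trigger used only for illustration.
sample_trigger = {
    "Name": "after_crawl",
    "WorkflowName": "nightly",
    "Type": "CONDITIONAL",
    "Predicate": {
        "Conditions": [{"CrawlerName": "raw_orders", "CrawlState": "SUCCEEDED"}]
    },
    "Actions": [{"JobName": "orders_etl"}],
}
for edge in trigger_edges(sample_trigger):
    print(edge)
```

Here the sample trigger links the workflow `nightly`, the crawler `raw_orders`, and the job `orders_etl` through a single trigger node.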
Pre-requisites:
To use the connector, the following need to be available:
- Connection details as specified in the following section should be available.
- An admin/service account for crawling. The minimum privileges required are:
Operation | Access Permission |
---|---|
Crawl Jobs | LIST, GET permission on Jobs |
Crawl Workflows | LIST, GET permission on Workflows |
Crawl Crawlers | LIST, GET permission on Crawlers |
Crawl Triggers | LIST, GET permission on Triggers |
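One way to express these minimum privileges is an IAM policy limited to read-only Glue actions. The Python sketch below builds such a policy document. The mapping from each operation to specific `glue:List*` and `glue:Get*` actions is an assumption based on the table above, so confirm it against your own requirements and narrow `Resource` where possible.

```python
import json

# Assumed mapping from the operations in the table to Glue IAM actions.
MIN_ACTIONS = {
    "Crawl Jobs": ["glue:ListJobs", "glue:GetJob", "glue:GetJobs"],
    "Crawl Workflows": ["glue:ListWorkflows", "glue:GetWorkflow"],
    "Crawl Crawlers": ["glue:ListCrawlers", "glue:GetCrawler", "glue:GetCrawlers"],
    "Crawl Triggers": ["glue:ListTriggers", "glue:GetTrigger", "glue:GetTriggers"],
}


def build_policy(operations=MIN_ACTIONS, resource="*"):
    """Return an IAM policy document (as a dict) allowing the listed actions."""
    actions = sorted({a for acts in operations.values() for a in acts})
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "GlueCrawlReadOnly",
                "Effect": "Allow",
                "Action": actions,
                "Resource": resource,
            }
        ],
    }


policy = build_policy()
print(json.dumps(policy, indent=2))
```

The printed JSON can be attached to the service account's user or role; no write or delete actions are included.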
Connection Details
The following connection settings should be added for connecting to AWS Glue ETL:
Property | Details |
---|---|
Database Type | AWS Glue ETL |
Connection Name | Enter a name for the AWS Glue ETL connection. The name is a reference that identifies this connection in OvalEdge. Example: AWS Glue ETL Connection. |
Authentication | Select the authentication type: Role-based Authentication or Basic Authentication. |
Access key | AWS access key |
Secret key | AWS secret key |
Region | AWS Region of the Glue service |
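Before entering these values in OvalEdge, it can help to confirm that the access key, secret key, and region can actually list Glue entities. The sketch below uses Python with boto3 rather than the Java SDK the connector ships with, so it is an independent check and not the connector's code. `list_names`, `make_client`, and `summarize` are hypothetical helper names; boto3 is imported lazily so the listing logic can be exercised with a stub client.

```python
def list_names(client, operation, result_key):
    """Collect all names from a paginated Glue List* call.

    `operation` is a client method name such as "list_jobs"; `result_key`
    is the response field holding the names, e.g. "JobNames".
    """
    names, token = [], None
    while True:
        kwargs = {"NextToken": token} if token else {}
        page = getattr(client, operation)(**kwargs)
        names.extend(page.get(result_key, []))
        token = page.get("NextToken")
        if not token:
            return names


# Glue List* operations and the response field each one returns.
LISTINGS = {
    "Jobs": ("list_jobs", "JobNames"),
    "Workflows": ("list_workflows", "Workflows"),
    "Crawlers": ("list_crawlers", "CrawlerNames"),
    "Triggers": ("list_triggers", "TriggerNames"),
}


def make_client(access_key, secret_key, region):
    """Create a Glue client from the connection settings (requires boto3)."""
    import boto3  # imported here so the rest of the sketch has no dependency

    return boto3.client(
        "glue",
        region_name=region,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def summarize(client):
    """Return {entity type: number of entities visible to the credentials}."""
    return {
        kind: len(list_names(client, op, key))
        for kind, (op, key) in LISTINGS.items()
    }
```

For example, `summarize(make_client(access_key, secret_key, "us-east-1"))` returns a count per entity type, where the region is a placeholder. An access-denied error on any call suggests a missing LIST permission.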
General Authentication Connection fields:
Role-Based Authentication Connection Fields:
Points to note:
The AWS Glue ETL connector does not support querying the Glue Data Catalog from OvalEdge.