AWS Glue ETL Connector

An out-of-the-box connector is available for AWS Glue ETL entities. It supports crawling Jobs, Workflows, Triggers, and Crawlers, and building lineage for these entities.
 

Connectivity Summary

The connector reaches AWS Glue ETL through the AWS Glue SDK for Java, which is included in the platform.

The SDK version used by the connector is listed below:

Driver / API | Version  | Details
AWS Glue SDK | 1.12.232 | https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-glue/1.12.232

Note: The latest version is 1.12.244.

Technical Specifications

The connector capabilities are shown below.

AWS Glue entities are created as datasets of type Job, Workflow, Crawler, or Trigger. For jobs, the connector extracts the job script and builds lineage from it. For crawlers, triggers, and workflows, it extracts information about the entities involved and builds the associations between them.
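
The crawl described above can be sketched roughly as follows. This is only an illustration, not the connector's implementation: it assumes boto3 (the AWS SDK for Python) instead of the Java SDK bundled with the platform, and the helper names `paged` and `crawl_glue_entities` are hypothetical. The Glue client is passed in as a parameter so the logic can be exercised without AWS access.

```python
from typing import Any, Dict, Iterator, List


def paged(call, result_key: str, **kwargs) -> Iterator[Any]:
    """Yield items from a Glue API call, following NextToken until exhausted."""
    token = None
    while True:
        params = dict(kwargs)
        if token:
            params["NextToken"] = token
        response = call(**params)
        yield from response.get(result_key, [])
        token = response.get("NextToken")
        if not token:
            break


def crawl_glue_entities(glue) -> Dict[str, List[str]]:
    """Collect the names of Jobs, Workflows, Crawlers, and Triggers.

    `glue` is assumed to behave like boto3.client("glue").
    """
    return {
        "Jobs": [job["Name"] for job in paged(glue.get_jobs, "Jobs")],
        # list_workflows returns plain workflow names rather than objects.
        "Workflows": list(paged(glue.list_workflows, "Workflows")),
        "Crawlers": [c["Name"] for c in paged(glue.get_crawlers, "Crawlers")],
        "Triggers": [t["Name"] for t in paged(glue.get_triggers, "Triggers")],
    }
```

With boto3 installed and credentials configured, a call such as `crawl_glue_entities(boto3.client("glue", region_name="us-east-1"))` would return the entity names visible to that account in that Region.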

Crawling

Feature  | Supported Objects | Remarks
Crawling | Jobs              |
         | Workflows         |
         | Crawlers          |
         | Triggers          |

Lineage Building

Lineage entities | Details
Jobs             | In Progress
Workflows        | Supported
Crawlers         | Supported
Triggers         | Supported
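
To make the idea of building associations concrete, here is a minimal sketch that derives association edges from a single trigger definition. The field names (`WorkflowName`, `Predicate.Conditions`, `Actions`, `JobName`, `CrawlerName`) follow the shape of the Glue API's trigger object. The edge tuple format, the relation labels, and the function name `trigger_associations` are hypothetical and are not taken from OvalEdge.

```python
from typing import Dict, List, Tuple

# (source, relation, target); a hypothetical edge shape for illustration.
Edge = Tuple[str, str, str]


def trigger_associations(trigger: Dict) -> List[Edge]:
    """Derive association edges from one Glue trigger definition."""
    name = f"trigger:{trigger['Name']}"
    edges: List[Edge] = []

    # A trigger may belong to a workflow.
    if trigger.get("WorkflowName"):
        edges.append((f"workflow:{trigger['WorkflowName']}", "contains", name))

    # Conditional triggers watch upstream jobs or crawlers.
    for cond in trigger.get("Predicate", {}).get("Conditions", []):
        if cond.get("JobName"):
            edges.append((f"job:{cond['JobName']}", "fires", name))
        if cond.get("CrawlerName"):
            edges.append((f"crawler:{cond['CrawlerName']}", "fires", name))

    # Each action starts a downstream job or crawler.
    for action in trigger.get("Actions", []):
        if action.get("JobName"):
            edges.append((name, "starts", f"job:{action['JobName']}"))
        if action.get("CrawlerName"):
            edges.append((name, "starts", f"crawler:{action['CrawlerName']}"))

    return edges
```

Applying this to every crawled trigger and merging the resulting edges gives a graph that links workflows, triggers, jobs, and crawlers.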

Pre-requisites:

To use the connector, the following must be available:

  • The connection details specified in the following section.
  • An admin or service account for crawling, with at least the privileges listed below.

Operation       | Access Permission
Crawl Jobs      | LIST, GET permission on Jobs
Crawl Workflows | LIST, GET permission on workflows
Crawl Crawlers  | LIST, GET permission on crawlers
Crawl Triggers  | LIST, GET permission on triggers
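
One way to express these privileges is as an IAM policy document. The sketch below generates such a policy. The mapping from "LIST, GET" to specific `glue:` action names is an assumption made for illustration (the table above does not name individual actions), and the statement ID and the function name `build_crawl_policy` are hypothetical. Review the result against your own security requirements before attaching it to an account.

```python
import json
from typing import Dict, List

# Assumed mapping of the LIST/GET permissions to Glue IAM actions.
GLUE_READ_ACTIONS: Dict[str, List[str]] = {
    "Jobs": ["glue:ListJobs", "glue:GetJob", "glue:GetJobs"],
    "Workflows": ["glue:ListWorkflows", "glue:GetWorkflow"],
    "Crawlers": ["glue:ListCrawlers", "glue:GetCrawler", "glue:GetCrawlers"],
    "Triggers": ["glue:ListTriggers", "glue:GetTrigger", "glue:GetTriggers"],
}


def build_crawl_policy(resource: str = "*") -> Dict:
    """Build a read-only IAM policy covering the four crawled entity types."""
    actions = sorted(a for group in GLUE_READ_ACTIONS.values() for a in group)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "GlueCrawlReadOnly",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": actions,
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_crawl_policy(), indent=2))
```

The policy grants read-only actions; nothing in it allows creating, modifying, or running Glue entities.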

Connection Details

The following connection settings should be added for connecting to AWS Glue ETL:

Property        | Details
Database Type   | AWS Glue ETL
Connection Name | Enter a name for the AWS Glue ETL connection. The name is a reference that identifies this connection in OvalEdge. Example: AWS Glue ETL Connection.
Authentication  | Select the authentication type: Role-Based Authentication or Basic Authentication.
Access key      | AWS access key
Secret key      | AWS secret key
Region          | AWS Region of the Glue service
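
The sketch below shows how these settings might translate into Glue client parameters. It assumes boto3 rather than the platform's Java SDK. It also assumes that Basic Authentication means a static access key and secret key, and that Role-Based Authentication means no static keys, so that the SDK falls back to its default credential chain (for example, an IAM role attached to the host). The function names and the `"basic"` / `"role"` labels are hypothetical.

```python
from typing import Dict, Optional


def glue_client_kwargs(
    auth_type: str,
    region: str,
    access_key: Optional[str] = None,
    secret_key: Optional[str] = None,
) -> Dict[str, str]:
    """Translate connection settings into boto3 client keyword arguments."""
    kwargs = {"region_name": region}
    if auth_type == "basic":
        if not access_key or not secret_key:
            raise ValueError("Basic authentication needs an access key and a secret key")
        kwargs["aws_access_key_id"] = access_key
        kwargs["aws_secret_access_key"] = secret_key
    elif auth_type != "role":
        raise ValueError(f"Unknown authentication type: {auth_type!r}")
    # For "role", no keys are set; boto3 resolves credentials itself.
    return kwargs


def make_glue_client(**settings):
    """Create a Glue client; boto3 is imported lazily so it stays optional."""
    import boto3

    return boto3.client("glue", **glue_client_kwargs(**settings))
```

For example, `make_glue_client(auth_type="role", region="us-east-1")` would rely on whatever role credentials are available in the environment where the code runs.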

General Authentication Connection fields: 

[Screenshot: AWS ETL Glue-1]

Role-Based Authentication Connection Fields:

[Screenshot: Role Based Glue-1]

Points to note:

The AWS Glue ETL connector does not support querying the Glue Data Catalog from OvalEdge.