Talend

Connectivity Summary

An out-of-the-box connector is available for Talend. It supports crawling datasets (that is, Talend jobs) and lineage building.

Connectivity to Talend is through workflow files created in Talend Open Studio, which is included in the platform. Talend Open Studio is a Windows application.

Windows Machine Requirements

Machine Requirements    Minimum: 64-bit                 High Performance: 64-bit
OS Requirements         Microsoft Windows 8 (64-bit)    Microsoft Windows 8 (64-bit)
Chip                    Quad Core (single chip)         Quad Core (single chip)
Processor               2.5 GHz or faster               2.5 GHz or faster
RAM                     8 GB                            16 GB
Disk Size               500 GB - 1 TB                   500 GB - 1 TB

The connector currently supports the following versions of Talend:

Edition: Talend Version

Version: Default version supported

Connector Capabilities

The connector capabilities are shown below:

Crawling

Supported objects for Crawling are:

Supported Objects Remarks

Talend_Schema is kept as a static schema.

Workflow files are read from the specified path.

The files are provided as datasets together with their source code.

 

See the article Crawling Data for more details on crawling.

Lineage Building

Lineage Entities                         Details
Table to Table Lineage                   Supported
Table to File Lineage                    Supported
File to Table Lineage                    Supported
Column Lineage (File Column Lineage)     Supported

Querying

Operation                 Details
Select                    Supported
Insert                    Not supported by default
Update                    Not supported by default
Delete                    Not supported by default
Joins within database     Supported
Joins outside database    Not supported
Aggregations              Supported
Group By                  Supported

Pre-requisites

To use the connector, the following must be available:

  • The connection details specified in the following section.
  • A service account for crawling. The minimum privileges required are:

Operation                                                           Access Permission
Connection validation                                               Permission for the specified path
Crawl files (.item extension only) with source code as datasets     Permission for the specified folder
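
The permission requirements above can be checked before the connection is configured. The following is a minimal sketch and not part of OvalEdge: the function name `find_item_files` is hypothetical. It assumes the check is run under the service account on the machine that holds the workflow files. It confirms that the directory is readable and collects the files with the `.item` extension, which are the only files crawled.

```python
import os
from pathlib import Path


def find_item_files(root):
    """Return the sorted paths of all .item files under root.

    Raises if root is not a directory or is not readable by the
    current account, which approximates the connection validation check.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")
    if not os.access(root_path, os.R_OK | os.X_OK):
        raise PermissionError(f"{root} is not readable by this account")
    # Only files ending in .item are crawled as datasets.
    return sorted(p for p in root_path.rglob("*.item") if p.is_file())
```

Calling `find_item_files` with the directory intended for the File Path setting returns the candidate workflow files. An empty list suggests that the path or the permissions need attention.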

Connection Details

The following connection settings should be added to connect to Talend:


  • Database Type: Talend
  • License Type: Standard or Auto Lineage
  • Connection Name: Enter a name for the Talend connection. This is a reference name used to identify the Talend connection in OvalEdge.
    Example: Talend_dev
  • Crawl From: Select the source to crawl from:
      • Path (Default)
      • GITHUB (Partially implemented)
  • File Path: Specify the local directory path (on the machine where the OvalEdge application is deployed) where the workflow files are located.
  • MetaDB Connection: Connection details of the MetaDB (the database used for storing the static data for the Talend jobs)
  • MetaDB Username: Provide a valid username for the MetaDB
  • MetaDB Password: Provide a valid password for the MetaDB
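
As an illustration of how these settings fit together, the sketch below models them as a Python data class with a basic consistency check. It is not an OvalEdge API or file format: the class name `TalendConnectionSettings`, its field names, and the `problems` method are all hypothetical, and treating the MetaDB connection as a single string is an assumption.

```python
from dataclasses import dataclass

# Allowed values taken from the settings list above.
LICENSE_TYPES = {"Standard", "Auto Lineage"}
CRAWL_SOURCES = {"Path", "GITHUB"}


@dataclass
class TalendConnectionSettings:
    """Hypothetical container for the connection settings."""
    connection_name: str
    file_path: str
    metadb_connection: str
    metadb_username: str
    metadb_password: str
    license_type: str = "Standard"
    crawl_from: str = "Path"  # Path is the default source
    database_type: str = "Talend"

    def problems(self):
        """Return a list of readable problems; empty if none were found."""
        issues = []
        if self.database_type != "Talend":
            issues.append("Database Type must be Talend")
        if self.license_type not in LICENSE_TYPES:
            issues.append("License Type must be Standard or Auto Lineage")
        if self.crawl_from not in CRAWL_SOURCES:
            issues.append("Crawl From must be Path or GITHUB")
        required = ("connection_name", "file_path", "metadb_connection",
                    "metadb_username", "metadb_password")
        for name in required:
            if not getattr(self, name).strip():
                issues.append(f"{name} is empty")
        return issues
```

A settings object built with `connection_name="Talend_dev"` and non-empty values for the remaining fields yields an empty list from `problems()`.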

Points to note

  1. The Talend connector is not fully functional.
  2. Validating the lineage: Talend Open Studio must be installed on the local machine to open the Talend jobs, and the lineage must be validated manually by following the flow of each job.
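
Manual validation can be supported by listing the components and links of a job outside Talend Open Studio. The sketch below is an aid under stated assumptions, not part of the connector. It assumes that a job `.item` file is XML in which components appear as `node` elements with a `componentName` attribute and a child `elementParameter` named `UNIQUE_NAME`, and in which links appear as `connection` elements with `source` and `target` attributes. If the files at hand use a different layout, the element and attribute names must be adjusted. The function name `summarize_job` is hypothetical.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip any XML namespace prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def summarize_job(xml_text):
    """Return (components, links) found in a job .item document.

    components maps each unique component name to its component type;
    links is a list of (source, target) unique-name pairs.
    """
    root = ET.fromstring(xml_text)
    components, links = {}, []
    for elem in root.iter():
        tag = _local(elem.tag)
        if tag == "node":
            # The unique name is stored in a child parameter element.
            unique = next(
                (p.get("value") for p in elem
                 if _local(p.tag) == "elementParameter"
                 and p.get("name") == "UNIQUE_NAME"),
                None,
            )
            if unique:
                components[unique] = elem.get("componentName")
        elif tag == "connection":
            links.append((elem.get("source"), elem.get("target")))
    return components, links
```

The text of a job file can be passed to `summarize_job`, and the resulting source and target pairs can then be compared with the lineage shown in OvalEdge.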