SFTP

Connectivity Summary

An out-of-the-box connector is available for SFTP servers. It supports crawling file folders and files and profiling sample data.

JSch lets you connect to an SSH server and use port forwarding and file transfer.

Connectivity to SFTP is through the JSch library, which is included in the platform.

The drivers used by the connector are given below:

Driver / API: JSch

Version: 0.1.5

Details: https://mvnrepository.com/artifact/com.jcraft/jsch

Note: The latest version is 0.1.55.

Technical Specifications

The connector capabilities are shown below:

Crawling

Feature | Supported Objects | Remarks
Crawling | Buckets | During crawling, root file folders / files are cataloged by default.

Profiling

Please see Profiling Data for more details on profiling.

Feature | Support | Remarks
File Profiling | Row count, Columns count, View sample data | Supported File Types: CSV, XLS, XLSX, JSON, AVRO, PARQUET, ORC
Sample Profiling | Supported |
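
To make the file profiling metrics concrete, here is a minimal sketch of how a row count, a column count, and a data sample could be derived for a CSV file. It is an illustration only and not the connector's implementation: the function name `profile_csv`, the `sample_size` parameter, and the choice to treat the first line as a header that is excluded from the row count are all assumptions.

```python
import csv
import io


def profile_csv(text, sample_size=5):
    """Hypothetical helper: compute basic profile metrics for CSV content.

    Assumes the first line is a header row and does not count it as data.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader, [])  # empty list if the file has no lines
    row_count = 0
    sample = []
    for row in reader:
        row_count += 1
        if len(sample) < sample_size:
            sample.append(row)  # keep only the first few rows as a sample
    return {
        "row_count": row_count,
        "column_count": len(header),
        "sample": sample,
    }


profile = profile_csv("id,name\n1,alice\n2,bob\n", sample_size=1)
print(profile)
```

Reading the file as a stream keeps memory use bounded by the sample size, which matters when remote files are large.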

 

By default, the service account provided for the connector will be used for any user operations. If the service account has write privileges, then Insert / Update / Delete operations can be executed.

Pre-requisites

To use the connector, the following need to be available:

  • Connection details, as specified in the following section.
  • An admin / service account for crawling and profiling.
  • The minimum privileges required are:
    • Connection validate
    • Crawl File Folders
    • Catalog files / folders
    • Profile files / folders

Connection Details

The following connection settings should be added for connecting to an SFTP server:

  • Database Type: SFTP
  • Connection Name: Enter a connection name for the SFTP server. The name that you specify is a reference name to easily identify your SFTP server connection in OvalEdge.
    Example: sftp Server Connection
  • Server: The host where the SFTP server is running
  • Username: The login name of the user account
  • Password: The password of the specified user
  • Port number: 22 by default; it can be changed
  • Path: The directory on the SFTP server to crawl. If no path is specified, the root directory is crawled.
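
The settings above can be pictured as a small record with two defaults: port 22 and the root directory as the crawl path. The sketch below is only an illustration of those defaults. The class name `SftpConnectionSettings`, the method `crawl_root`, the host `sftp.example.com`, and the account values are hypothetical and are not part of OvalEdge.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SftpConnectionSettings:
    """Hypothetical container mirroring the connection fields listed above."""

    connection_name: str
    server: str
    username: str
    password: str
    port: int = 22   # default SFTP port; can be overridden
    path: str = ""   # empty means "crawl the root directory"

    def crawl_root(self):
        """Return the directory a crawl would start from."""
        cleaned = self.path.strip()
        return cleaned if cleaned else "/"


# Placeholder values for illustration only.
settings = SftpConnectionSettings(
    connection_name="sftp Server Connection",
    server="sftp.example.com",
    username="svc_crawler",
    password="change-me",
)
print(settings.port, settings.crawl_root())
```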

Once connectivity is established, additional configurations for Crawling and Profiling can be specified.

Property | Details

Crawler configurations
Crawler Options | File Folders / Buckets are enabled by default
Crawler Rules | Include and exclude regex apply to File Folders and Buckets only, not to files

Profiler Settings
Profile Options | No profile options exist
Profile Rules | No profile rules exist
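
As an illustration of the crawler rules, the following sketch applies an include pattern and an exclude pattern to folder names only, leaving files untouched. The function name `filter_folders` and the sample folder names are hypothetical, and the matching semantics (a partial match with `re.search`, with exclude taking precedence over include) are assumptions; the connector's actual regex handling may differ.

```python
import re


def filter_folders(folders, include=None, exclude=None):
    """Hypothetical helper: keep folder names that pass include/exclude rules.

    A folder is kept when it matches the include pattern (if given)
    and does not match the exclude pattern (if given).
    """
    include_re = re.compile(include) if include else None
    exclude_re = re.compile(exclude) if exclude else None
    kept = []
    for name in folders:
        if include_re and not include_re.search(name):
            continue  # not selected by the include rule
        if exclude_re and exclude_re.search(name):
            continue  # explicitly excluded
        kept.append(name)
    return kept


folders = ["sales_2023", "sales_tmp", "hr"]
selected = filter_folders(folders, include=r"^sales", exclude=r"tmp$")
print(selected)
```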

Points to note

  1. Supported File Types: CSV, XLS, XLSX, JSON, AVRO, PARQUET, ORC
  2. Details of the files / folders to which the user has access can be viewed in the File Manager.
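
The supported file types in point 1 can be checked before a file is sent for profiling. The sketch below assumes that the file type is identified by the file name extension, which this document does not state; the function name `is_profilable` and the sample paths are hypothetical.

```python
from pathlib import PurePosixPath

# File types listed as supported in this document.
SUPPORTED_TYPES = {"csv", "xls", "xlsx", "json", "avro", "parquet", "orc"}


def is_profilable(remote_path):
    """Hypothetical check: True if the extension is a supported file type."""
    suffix = PurePosixPath(remote_path).suffix  # e.g. ".csv", or "" if none
    return suffix.lstrip(".").lower() in SUPPORTED_TYPES


for candidate in ["/data/orders.CSV", "/data/readme.txt", "/data/events.parquet"]:
    print(candidate, is_profilable(candidate))
```

Only the last extension is considered, so a compressed file such as `orders.csv.gz` would not pass this check.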