Connectivity Summary
An out-of-the-box connector is available for SFTP servers. It supports crawling file folders and files and profiling sample data.
Connectivity to SFTP is through the JSch library, which is included in the platform.
JSch lets you connect to an SSH server and use port forwarding and file transfer.
The drivers used by the connector are given below:
Driver / API: JSch
Version: 0.1.5
Details: https://mvnrepository.com/artifact/com.jcraft/jsch
Note: The latest version is 0.1.55.
Technical Specifications
The connector capabilities are shown below:
Crawling
Feature | Supported Objects | Remarks |
Crawling | Buckets | While crawling, root file folders and files are cataloged by default. |
Profiling
Please see Profiling Data for more details on profiling.
Feature | Support | Remarks |
File Profiling | Row count, Columns count, View sample data | Supported File Types: CSV, XLS, XLSX, JSON, AVRO, PARQUET, ORC |
Sample Profiling | Supported | |
By default, the service account provided for the connector will be used for any user operations. If the service account has write privileges, then Insert / Update / Delete operations can be executed.
Pre-requisites
To use the connector, the following must be available:
- Connection details, as specified in the following section.
- An admin / service account for crawling and profiling.
- The minimum privileges required are:
  - Connection validate
  - Crawl File Folders
  - Catalog files / folders
  - Profile files / folders
Connection Details
The following connection settings should be added for connecting to an SFTP server:
- Database Type: SFTP
- Connection Name: Select a connection name for the SFTP server. The name that you specify is a reference name to easily identify your SFTP server connection in OvalEdge.
Example: sftp Server Connection
- Server: The host where the SFTP server is hosted
- Username: The login name of the user account
- Password: The password of the specified user
- Port number: 22 (default); it can be changed
- Path: The SFTP directory to crawl. If no path is given, the root directory is crawled.
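As a rough illustration of how these settings fit together, the sketch below models them in Java and applies the defaults described above (port 22, root directory when no path is given). The class name `SftpConnectionSettings`, its fields, the host `sftp.example.com`, the account name, and the use of "/" for the root directory are assumptions made for illustration; they are not part of OvalEdge or JSch.

```java
import java.util.Objects;

/** Hypothetical holder for the connection settings listed above (not an OvalEdge class). */
final class SftpConnectionSettings {
    static final int DEFAULT_PORT = 22;   // default SFTP port, as stated in the settings list
    static final String ROOT_PATH = "/";  // assumption: the root directory is written as "/"

    final String server;
    final String username;
    final String password;
    final int port;
    final String path;

    SftpConnectionSettings(String server, String username, String password,
                           Integer port, String path) {
        this.server = requireNonBlank(server, "server");
        this.username = requireNonBlank(username, "username");
        this.password = Objects.requireNonNull(password, "password");
        // Fall back to port 22 when no port is supplied.
        this.port = (port == null) ? DEFAULT_PORT : port;
        if (this.port < 1 || this.port > 65535) {
            throw new IllegalArgumentException("port out of range: " + this.port);
        }
        // Fall back to the root directory when no path is supplied.
        this.path = (path == null || path.trim().isEmpty()) ? ROOT_PATH : path.trim();
    }

    private static String requireNonBlank(String value, String name) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException(name + " must not be blank");
        }
        return value.trim();
    }

    public static void main(String[] args) {
        // Placeholder values; port and path are omitted to show the defaults.
        SftpConnectionSettings s = new SftpConnectionSettings(
                "sftp.example.com", "svc_crawler", "secret", null, null);
        System.out.println(s.server + ":" + s.port + " path=" + s.path);
    }
}
```

Validating the settings before attempting a connection keeps mistakes such as a blank server name or an out-of-range port from surfacing later as less specific connection errors.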
Once connectivity is established, additional configurations for Crawling and Profiling can be specified.
Property | Details |
Crawler configurations | |
Crawler Options | File Folders / Buckets are enabled by default |
Crawler Rules | Include and exclude regex apply to File Folders and Buckets only, not to files |
Profiler Settings | |
Profile Options | No profile options exist |
Profile Rules | No profile rules exist |
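The crawler rules can be pictured with a short Java sketch in which include and exclude patterns are checked against folder names only, while files pass through unfiltered. The class `CrawlerRuleFilter` is hypothetical, and two behaviors are assumptions rather than documented product behavior: a pattern must match the whole name, and an exclude match takes precedence over an include match.

```java
import java.util.regex.Pattern;

/** Hypothetical illustration of include/exclude crawler rules (not an OvalEdge class). */
final class CrawlerRuleFilter {
    private final Pattern include; // null means "include every folder"
    private final Pattern exclude; // null means "exclude nothing"

    CrawlerRuleFilter(String includeRegex, String excludeRegex) {
        this.include = (includeRegex == null) ? null : Pattern.compile(includeRegex);
        this.exclude = (excludeRegex == null) ? null : Pattern.compile(excludeRegex);
    }

    /** Returns true when the entry should be crawled. */
    boolean accepts(String name, boolean isFolder) {
        if (!isFolder) {
            return true; // rules apply to folders only, never to files
        }
        // Assumption: an exclude match wins over an include match.
        if (exclude != null && exclude.matcher(name).matches()) {
            return false;
        }
        return include == null || include.matcher(name).matches();
    }

    public static void main(String[] args) {
        CrawlerRuleFilter filter = new CrawlerRuleFilter("sales_.*", ".*_tmp");
        System.out.println(filter.accepts("sales_2023", true));  // folder matching include
        System.out.println(filter.accepts("sales_tmp", true));   // folder matching exclude
        System.out.println(filter.accepts("report.csv", false)); // file, not filtered
    }
}
```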
Points to note
- Supported File Types: CSV, XLS, XLSX, JSON, AVRO, PARQUET, ORC
- Details of the files and folders that the user has access to can be viewed in the File Manager.
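A quick way to check whether a file falls into the supported list is to compare its extension, as in the Java sketch below. Detecting the type from the file extension is an assumption made for illustration, and `SupportedFileTypes` is a hypothetical helper, not a platform class.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper that checks a file name against the supported types listed above. */
final class SupportedFileTypes {
    private static final Set<String> EXTENSIONS = new HashSet<>(
            Arrays.asList("csv", "xls", "xlsx", "json", "avro", "parquet", "orc"));

    /** Assumption: the file type is inferred from the extension, case-insensitively. */
    static boolean isSupported(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false; // no extension present
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return EXTENSIONS.contains(ext);
    }

    public static void main(String[] args) {
        System.out.println(isSupported("orders.CSV"));
        System.out.println(isSupported("notes.txt"));
    }
}
```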