Google Cloud Storage

Connectivity Summary

An out-of-the-box connector is available for Google Cloud Storage (GCS). It supports crawling buckets and file folders, and profiling sample data.

Connectivity to GCS is provided by the Google Cloud Storage client library, which is included in the platform.

The drivers used by the connector are given below:

Driver / API: Google Cloud Storage

Version: 1.113.16

Details: https://mvnrepository.com/artifact/com.google.cloud/google-cloud-storage

Note: Latest version is 1.117.1

Technical Specifications

The connector capabilities are shown below:

Crawling

Supported Objects	Remarks
Buckets	While crawling, Buckets / File Folders are cataloged by default, and Buckets are crawled together with their tags.

Profiling

Please see Profiling Data for more details on profiling.

Feature	Support
File Profiling	Row count, Column count, View sample data
Sample Profiling	Supported
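
The file profiling metrics listed above (row count, column count, sample data) can be illustrated with a minimal sketch. This is not the connector's implementation: the function name `profile_csv`, the `sample_size` parameter, and the assumption that the first row is a header are all hypothetical and used only for illustration.

```python
import csv
import io


def profile_csv(text, sample_size=5):
    """Return row count, column count and a small sample for CSV text.

    Assumption: the first row is a header and is not counted as data.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader, [])   # first row is treated as the header
    rows = list(reader)         # remaining rows are data rows
    return {
        "row_count": len(rows),
        "column_count": len(header),
        "sample": rows[:sample_size],
    }


csv_text = "id,name\n1,alpha\n2,beta\n3,gamma\n"
profile = profile_csv(csv_text, sample_size=2)
print(profile)
```

The sample is simply the first few data rows, which is one plausible way to produce "View sample data" without reading the whole file into a report.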

By default, the service account provided for the connector will be used for any user operations. If the service account has write privileges, then Insert / Update / Delete operations can be executed.

Pre-requisites

To use the connector, the following need to be available:

- Connection details, as specified in the following section.
- An admin / service account.

The minimum privileges required for Crawling and Profiling are:
- Connection validate
- Crawl Buckets
- Catalog Files / Folders
- Profile Files / Folders

Connection Details

The following connection settings should be added for connecting to a GCS Server:

Database Type: GCS
Connection Name: Enter a connection name for the GCS connection. The name that you specify is a reference name to easily identify your GCS connection in OvalEdge. Example: GCS Server Connection
ProjectId: The Google Cloud project ID. Most Google Cloud libraries require a ProjectId.
Example: ovaledgeserver
File Path: Provide the path to the JSON file that contains the service account credentials.
Filter By Tags: Buckets are filtered by their labels. Only buckets that carry the labels provided here are crawled.
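
As a rough illustration of the Filter By Tags setting, the sketch below selects buckets by label and returns every bucket when no labels are given (the 'No Tag' case described under Points to note). The function name `select_buckets`, the bucket names, and the matching rule (a bucket qualifies if it carries at least one requested key/value label) are assumptions made for this example, not documented connector behavior.

```python
def select_buckets(bucket_labels, wanted):
    """Pick bucket names whose labels match the requested ones.

    bucket_labels: dict of bucket name -> dict of label key -> value
    wanted: dict of label key -> value; empty means "no tag filter"
    Assumption: one matching key/value pair is enough to select a bucket.
    """
    if not wanted:
        return sorted(bucket_labels)  # no filter: every bucket is crawled
    return sorted(
        name
        for name, labels in bucket_labels.items()
        if any(labels.get(key) == value for key, value in wanted.items())
    )


# Hypothetical buckets and labels, for illustration only
bucket_labels = {
    "sales-raw": {"team": "sales", "env": "prod"},
    "hr-archive": {"team": "hr"},
    "scratch": {},
}
selected = select_buckets(bucket_labels, {"team": "sales"})
everything = select_buckets(bucket_labels, {})
print(selected)
print(everything)
```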

Validation succeeds when the ProjectId entered matches the project ID in the JSON file.
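
The project ID check can be sketched as follows. Service account key files store the project in a `project_id` field; the function name `project_id_matches` and the trimmed sample key are hypothetical, and a real key file contains further fields (such as the private key) that are omitted here.

```python
import json


def project_id_matches(entered_project_id, key_file_text):
    """Compare the entered ProjectId with the one in the key file JSON."""
    key = json.loads(key_file_text)
    return key.get("project_id") == entered_project_id


# Trimmed, made-up key content for illustration; not a usable credential
sample_key = json.dumps({
    "type": "service_account",
    "project_id": "ovaledgeserver",
})

print(project_id_matches("ovaledgeserver", sample_key))
print(project_id_matches("another-project", sample_key))
```

In practice the text would be read from the location given in File Path rather than built in memory.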

Once connectivity is established, additional configurations for Crawling and Profiling can be specified.

Property	Details
Crawler configurations
Crawler Options	File Folders / Buckets are enabled by default
Crawler Rules	Include and exclude regex apply to File Folders and Buckets only, not to files.
Profile Settings
Profile Options	No profile options are available
Profile Rules	No profile rules exist
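
To show how include and exclude regex rules might act on bucket or folder names, here is a minimal sketch. The function name `apply_crawler_rules`, the sample names, and the matching semantics (a pattern matches anywhere in the name, and an exclude match overrides an include match) are assumptions; the connector's actual rule evaluation may differ.

```python
import re


def apply_crawler_rules(names, include=None, exclude=None):
    """Keep names that match `include` (if set) and do not match `exclude`."""
    include_re = re.compile(include) if include else None
    exclude_re = re.compile(exclude) if exclude else None
    kept = []
    for name in names:
        if include_re and not include_re.search(name):
            continue  # an include rule is set and this name does not match it
        if exclude_re and exclude_re.search(name):
            continue  # exclude rules win over include rules in this sketch
        kept.append(name)
    return kept


# Hypothetical bucket / folder names
names = ["sales-2023", "sales-tmp", "hr-2023"]
kept = apply_crawler_rules(names, include=r"^sales-", exclude=r"-tmp$")
print(kept)
```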

Points to note

Whenever the application is deployed, verify that the service account credentials file is present at the configured system file path.
When 'No Tag' is selected while creating the connection, all File Folders / Buckets are crawled, along with all the tags associated with them.
Supported File Types: CSV, XLS, XLSX, JSON, AVRO, PARQUET, ORC
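
A simple way to pre-check whether an object falls within the supported file types is sketched below. It assumes the type is determined by the file extension, compared case-insensitively; the function name `is_supported` and the object names are hypothetical, and the connector may detect types differently.

```python
from pathlib import PurePosixPath

# Taken from the supported file types listed above
SUPPORTED_TYPES = {"csv", "xls", "xlsx", "json", "avro", "parquet", "orc"}


def is_supported(object_name):
    """Return True if the object's extension is a supported file type."""
    suffix = PurePosixPath(object_name).suffix  # e.g. ".csv", or "" if none
    return suffix[1:].lower() in SUPPORTED_TYPES


print(is_supported("reports/2023/sales.CSV"))
print(is_supported("reports/2023/notes.txt"))
```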