Data Warehouse

Snowflake Connector

The Snowflake platform offers flexible storage, processing, and analysis of data. It is a secured cloud-based data warehouse connector built on AWS or Azure cloud infrastructure. 

OvalEdge provides an interface that can connect to Snowflake and perform all standard operations. The snowflake data source is integrated via the JDBC driver that supports crawling database objects, profiling sample data, query execution, building lineage, remote policy, and access permission.

Snowflake Editions

There are multiple editions of Snowflake, and users can choose the version that fulfills the organization's requirements. Below is the list of Snowflake’s editions supported by OvalEdge: 

  • Standard
  • Enterprise
  • Business Critical
  • Virtual Private Snowflake (VPS)

Connectors Capabilities

The following is the list of objects and data types supported by the Snowflake connector.

Functionality Description
Crawling Tables, Views, Columns, Functions, Procedures, Synonyms, Triggers, Roles, Users, Permissions, Triggers, Usage statistics, Tasks, Pipes
Profiling Table Profiling: Row count, Column count, and View sample data
View Profiling: Row count,   Column count, View sample data
Column Profiling: Min, Max, Null count, Distinct count, Top values
Full Profiling
Lineage Building Table Lineage, Column Lineage
Lineage Sources: Stored procedures, functions, triggers, views, SQL queries
Query Execution Select, Joins within the database, Aggregations, Group By, Order By

Prerequisites

The following are the prerequisites required for establishing a connection between the connector and the OvalEdge application. 

  1. Driver Details
  2. service Account Permission
  3. Configure environment variables (Optional)

Driver Details

The drivers used by the connector are given below:

Driver 

Version

Details

Snowflake JDBC driver

3.14.5

https://repo1.maven.org/maven2/net/snowflake/snowflake-jdbc/3.14.5/

Service Account Permissions

Operations

Minimum Permissions

Validate

USAGE on the database with valid credentials

Crawling

(Schema, Tables, Views/Materialized Views, Columns)

SELECT on information_schema.schemata,  information_schema.tables,  information_schema.columns

Access to run shows primary keys in {db}.{schema}

Crawl Column Relationships

Access to run SHOW IMPORTED KEYS

Crawl Stored Procedures

SELECT on information_schema.procedures

Crawl Functions

SELECT on information_schema.functions

Crawl Stages

Access to run SHOW STAGES IN SCHEMA {schema}

Crawl Pipes

SELECT on information_schema.STAGES, Access to run SHOW PIPES IN {db}

Crawl Tasks, Streams

Access to run SHOW TASKS IN {db}, Access to run SHOW STREAMS IN {db}

Crawl FileFormats

Access to run SHOW FILE FORMATS IN {schema}

Profiling and Querysheet

USAGE on Schema, SELECT on table

Lineage

USAGE on Schema, SELECT on SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY


Note: In order to provide the above permissions, use the below queries if required.

User Account Credentials 

  • Create a snowflake account to establish connection to OvalEdge

    • CREATE USER OE_User password='<<password>>' default_role=OE_Role default_warehouse='<<warehouse>>' display_name='OE User';

  • Create a Role OE_Role to assign required permissions 

    • CREATE OR REPLACE ROLE OE_role;

    • GRANT OPERATE, USAGE ON WAREHOUSE "<<warehouse>>" TO ROLE OE_role;

  • Grant Role OE_Role to user OE_User

    • GRANT ROLE oe_role TO USER oe_user;

  • Grant permission on INFORMATION_SCHEMA to OE_Role to be able to get the meta data.

    • GRANT USAGE ON SCHEMA information_schema TO OE_Role;

  • Grant Permission to get KEYS data 

    • GRANT USAGE ON COMMAND "SHOW IMPORTED KEYS" TO OE_role;

  • Grant read access to database, schemas to preview data and query

    • GRANT USAGE ON DATABASE "<<db-name>>" TO ROLE OE_Role;

    • GRANT USAGE ON ALL SCHEMAS IN DATABASE "<<db-name>>" TO ROLE OE_role; 

    • GRANT SELECT ON ALL TABLES IN DATABASE "<<db-name>>" TO ROLE OE_Role;

    • GRANT SELECT ON ALL VIEWS IN DATABASE "<<db-name>>" TO ROLE OE_Role;

    • GRANT SELECT ON ALL MATERIALIZED VIEWS IN DATABASE "<db-name>>" TO ROLE OE_Role;

  • Grant read access to future schemas

    • GRANT USAGE ON FUTURE SCHEMAS IN DATABASE "<<db-name>>" TO ROLE OE_Role;

Establish Environment Variables (Optional)

This section describes the settings or instructions that you should be aware of prior to establishing a connection. If your environments have been configured, skip this step.

Configure Environment Names

The Environment Names allow you to select the environment configured for the specific connector from the dropdown list in the Add Connector pop-up window.
You might want to consider crawling the same schema in both stage and production environments for consistency. The typical environments for crawling are PROD, STG, or Temporary, and may also include QA or other environments. Additionally, crawling a temporary environment can be useful for schema comparisons, which can later be deleted, especially during application upgrade assistance. 

Steps to Configure the Environment

  1. Navigate to Administration > System Settings.
  2. Select the Connector tab.
  3. Find the Key name “connector.environment”.
  4. Enter the desired environment values (PROD, STG) in the value column. 
  5. Click ✔ to save. 

Establish a connection

To establish a Connection with PostgreSQL, fill in the required fields with the relevant information in the Manage Connector pop-up window:

  1. Log into the OvalEdge application.
  2. In the left menu, click on the Administration module name and click on the Connectors sub-module name. The Connectors Information page is displayed.
  3. Click on +icon New Connector. The Add Connectors pop-up window is displayed.
  4. Select the connection type as Snowflake. The Add Connector with Snowflake is displayed.

    Fields

    Details

    Connector Type

    The selected connection type ‘Snowflake’ is displayed by default. 

    If required, the drop-down list allows you to change the connector type and based on the selection of the connection type, the fields associated with the selected connection type are displayed.

    Authentication

    Snowflake data security supports two forms of authentication mechanisms. 

    (i) Username and Password: The simplest form of authentication to validate a connection using a username and password.
    (ii) RSA Key-Pair: The key-pair authentication is an alternative to basic authentication. It encrypts private keys to connect to Snowflake.  

    Example: alter user YOURUSERNAME set rsa_public_key='pass PEM file Keystring'.

    Credential Manager

    Select the option from the drop-down list, where you want to save your credentials.

    Database: The connection is configured with the basic Username and Password of the service account in real-time when OvalEdge establishes a connection to the Snowflake database. 

    HashiCorp: The credentials are stored in the HashiCorp database server and fetched from HashiCorp to OvalEdge.  

    AWS Secrets Manager: The credentials are stored in the AWS Secrets Manager database server and fetched from the AWS Secrets Manager to OvalEdge.

    Azure Key Vault: The credentials are stored in the Azure Key Vault database server and fetched from the Azure Key Vault to OvalEdge. Click here to know more.

    For more information on Credential Manager, refer to Credential Manager

    License Add-Ons

    All the connectors will have a Base Connector License by default that allows you to crawl and profile to obtain the metadata and statistical information from a data source. 

    OvalEdge supports various License Add-Ons based on the connector’s functionality requirements.

    • Select the Auto Lineage Add-On license that enables the automatic construction of the Lineage of data objects for a connector with the Lineage feature. 
    • Select the Data Quality Add-On license to identify, report, and resolve the data quality issues for a connector whose data supports data quality, using DQ Rules/functions, Anomaly detection, Reports, and more.
    • Select the Data Access Add-On license that will enforce connector access via OvalEdge with Remote Data Access Management (RDAM) feature enabled.
    Credential Manager Connection Id Specify the credential manager connection id generated for HashiCorp or AWS Secret Manager after building a connection in OvalEdge Connectors.
    Connector Name*

    Enter a Connection name for Snowflake. You can specify a connection name to identify the Snowflake connection in OvalEdge.

     Example: Snowflake_Connection_DB1

    Connector Environment

    The environment dropdown menu allows you to select the environment configured for the connector from the dropdown list. For example, PROD, or STG.

    The purpose of the environment field is to help you understand that the new connector is established in an environment available at the  Production, STG, and QA.

    Note: The steps to set up environment variables in explained in the prerequisites section.

    Server*

    Enter the name of the Snowflake database instance server URL, which is accessible by the OvalEdge application. 

    Format: <account>.snowflakecomputing.com

    Port*

    Port number 443 is the default port used by Snowflake when connecting to a server over the internet.
    Database Provide the name of the database that is associated with the Snowflake connection, if necessary.
    Warehouse Provide the name of the warehouse that is associated with the snowflake connection, if necessary.
    Role

    Enter the name of the role that needs to be assigned with the Snowflake privilege role. 

    Example: ACCOUNT ADMIN, SYSADMIN

    Driver*

    A JDBC driver is a Java library file with the extension .jar that connects to a database. The driver details associated with the Snowflake database will be auto-populated by default.

    Example: net.snowflake.client.jdbc.SnowflakeDriver
    Source CodeView

    It provides the name of the view that consists of source code (queries) to build lineage.

    For more information, please refer to Section Additional Information.

    Username*

    Enter the Service Account to access the required database of Snowflake connector.

    Note: By default, the application Username and password will be displayed in place of the Username and Password. However, the user needs to clear the details and enter the service account credentials for the Snowflake server.

    Password* Enter the Service Account password to access the required database of the Snowflake connector.
    SSO Connection Id Snowflake supports Single Sign On (SSO), which is a secured authentication process.  Specify the Connection Id of the identity provider’s connection (Azure, Okta, AVM, etc.)
    SSO Application Id Specify the Application Id available from the identity provider’s connection (Azure, Okta, AVM, etc.)
    SSO Application Role Id Specify the Application Role Id available from the crawled roles of the identity provider’s connection (Azure, Okta, AVM, etc.)
    Connection String  Set the Connection string toggle button to automatically get the details from the credentials provided. Alternatively, the user can manually enter the string.

    Format: jdbc: snowflake://{server}:443/?db={sid}&warehouse={warehouse}&role={IAMRole}

    Plug-in Server Enter the server name if it is running as a plugin.
    Plug-in Port Enter the Port Number if it is running as a plugin.
    Default Governance Roles*

    The admin will select a specific user or a  team from the governance roles (Steward, Custodian, Owner) that get assigned to the data asset. 

    The dropdown list displays all the configurable roles (single user or a team) as per the configurations made in the OvalEdge Security>Governance Roles.

    Admin Roles

    Select the required admin roles for this connector.

    • To add Integration Admin Roles, search for or select one or more roles from the Integration Admin options, and then click on the Apply button. 
      The responsibility of the Integration Admin includes configuring crawling and profiling settings for the connector, as well as deleting connectors, schemas, or data objects.
    • To add Security and Governance Admin roles, search for or select one or more roles from the list, and then click on the Apply button. 
      The security and Governance Admin is responsible for:
      • Configure role permissions for the connector and its associated data objects.
      • Add admins to set permissions for roles on the connector and its associated data objects.
      • Update governance roles.
      • Create custom fields.
      • Develop Service Request templates for the connector.
      • Create Approval workflows for the templates.
    No of Archive Objects*

    It is the number of last modifications made in the metadata data of a dataset at Remote/source. By default, the number of archive objects is set to disable mode. Click on the Archive toggle button and enter the number of objects wish to archive.

    For example, if a user updates the count as 4, the connection is crawled. It will provide the last 4 changes that occurred in the remote/source of the connector. Users can observe these changes in the ‘version’ column of the ‘Metadata Changes’ module.

    Select Bridge

    With the OvalEdge Bridge component, any cloud-hosted server can connect with any on-premise or public cloud data sources without modifying firewall rules. A bridge provides real-time control that makes it easy to manage data movement between any source and destination. 

    For more information, refer to Bridge Overview

    Note: * (asterisk) indicates the mandatory field required to establish a connection. Once all the parameters are entered, you can validate the details and save the connection that will be displayed on the Connector Home page.

    Note: You can either save the connection details first, or you can validate the connection first and then save it.

    Connection Validation Errors:

    S.No.

    Error Message(s)

    Description

    1 Failed to establish a connection. Please check the credentials. Invalid credentials are provided or the user or role does not have access.
    2

    Errors while downloading the File.

    403: Access denied [Provide appropriate access to user or role using in connection]

    404: No such key [The object does not exist in the remote.]

    Note: If you have any issues creating a connection, please contact your assigned OvalEdge Customer Success Management (CSM) team.

    Connector Settings 

    Once the connection is established successfully, various settings are provided to fetch and analyze the information from the data source. 

    Connection Settings Description
    Crawler Crawler settings are configured to connect to a data source and collect and catalog all the data elements in the form of metadata.
    Profiler Profiling is the process of gathering statistics and informative summaries about the connected data source(s). Statistics can help assess the data source's quality before using it in an analysis. Profiling is always optional; crawling can be run without profiling.
    Data Access

    The Remote Access tab lists the data objects and the meta and data permissions on these objects that a user is assigned access to in a remote application.

    (i) Crawler Options
    (ii) Data Access Management

    (iii) Data Access Authorization

    (iv) Connector Policies

    Query Policies

    The Query Policies in the Crawler setting provide the right access to the Query sheet functions (Join, Union, SUM, or aggregate functions). You can specify the desired roles and permission to deny the usage of the query sheet function. A role that has been denied policy permission will not have access to those functions in the Query Sheet.

    Example: If the user selects the Role as “OE_HRADMIN,” Query Type as “JOIN,” and the Access Type as “DENY,” then the users associated with the OE_HRADMIN privileges are restricted from using the JOIN function in the Query Sheet page.
    Access Instruction Access Instruction allows the data owner to instruct others on using the objects in the application. 
    Business Glossary Settings The Business Glossary setting provides flexibility and control over how they view and manage term association within the context of a business glossary at the connector level. 
    Notification

    The Send Metadata Changes Notifications option is used to set the change notification about the metadata changes of the data objects.

    • You can use the toggle button to set the Default Governance Roles (Steward, Owner Custodian, etc.) 
    • From the drop-down list, you can select the role and team to receive the notification of metadata changes.

    Note: For more information, refer to the Connector Settings.

    The Crawling of Schema(s)

    You can use the Crawl/Profile option, which allows you to select the specific schemas that need to be crawled, profiled, or unprofiled. For any scheduled crawlers and profilers, the defined run date and time are displayed to set.

    • Navigate to Administrator | Connectors, select the radio button for the particular schema, and click on Crawl/Profile
    • It allows you to select the specific schemas that need to be crawled, profiled, unprofiled, or scheduled.  
    • Click on the Run that gathers all metadata from the connected source into OvalEdge Data Catalog. 

    Note: For more information on Scheduling, refer to Scheduling Connector

    Additional Information

    Snowflake Permissions To Read View/Procedure/Function Definitions

    Snowflake Procedure definitions/Function definitions can only be read by the OWNERS of the Procedures i.e., EXECUTE AS OWNERS and SECURE VIEW definitions can only be viewed by OWNERS. But OvalEdge requires only READ permissions for all the operations. The following steps are used to grant READ permissions on the Procedures table in the Acount_Usage schema.

    This helps Ovaledge to fetch Views(secure views, materialized views), Procedures, and Functions that are EXECUTED BY OWNERS.

    Step 1: Create a View with the following select statement in any one of the schemas Using ADMIN LEVEL ROLE

    View Code :  

    CREATE OR REPLACE
    VIEW PROC_VIEW_DEF AS (
    SELECT PROCEDURE_CATALOG, PROCEDURE_SCHEMA,PROCEDURE_NAME, ARGUMENT_SIGNATURE AS P_SIGN , DATA_TYPE AS P_DT,PROCEDURE_LANGUAGE, PROCEDURE_DEFINITION,'PROCEDURE' AS TYPE , DELETED FROM "SNOWFLAKE"."ACCOUNT_USAGE"."PROCEDURES" WHERE DELETED IS NULL

    UNION

    SELECT FUNCTION_CATALOG,FUNCTION_SCHEMA,FUNCTION_NAME,ARGUMENT_SIGNATURE AS F_SIGN, DATA_TYPE AS F_DT,FUNCTION_LANGUAGE,FUNCTION_DEFINITION,'FUNCTION' AS TYPE,DELETED

    FROM "SNOWFLAKE"."ACCOUNT_USAGE"."FUNCTIONS" WHERE DELETED IS NULL)

    UNION 

     (SELECT TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,'','','',VIEW_DEFINITION,'VIEW' as TYPE, TO_TIMESTAMP_NTZ('2023-01-01 00:00:00') FROM "SNOWFLAKE"."ACCOUNT_USAGE"."VIEWS"

    WHERE DELETED IS NULL);

     

    Step 2: Grant SELECT On the View created to the Sevice account Role (OvalEdge Account)

    GRANT PERMISSION SQL: 

    GRANT SELECT ON VIEW “<VIEW NAME>” TO ROLE “<ROLE NAME>”

    Step 3: Enter the View name to the Source Code View field on the connection details screen, as shown in the following screenshot.


    Sourcecode_img

    Step 4: Provide the view name in the format(DATABASENAME.SCHEMANAME.VIEWNAME) as mentioned in the below screenshot 

    FAQs

    1. How much does the driver cost?
      The JDBC Driver for SNOWFLAKE is available at no additional charge.
    2. Can I use the driver to access Snowflake from a Linux computer?
      Yes! You can use the driver to access Snowflake from Linux, Unix, and other non-Windows platforms. 

    Copyright © 2023, OvalEdge LLC, Peachtree Corners GA USA