DBT Core

DBT Core (data build tool) is a development environment that enables data analysts and data engineers to transform data by writing SELECT statements. In the OvalEdge application, the DBT Core connector allows users to crawl objects such as Projects, Jobs, and Runs, and helps build lineage.

Technical Specifications

Crawling

| Feature | Supported Objects | Remarks |
|---|---|---|
| Crawling | DBT Projects | Shows all the projects from DBT for the specified account ID |
| Crawling | DBT RUNS | Runs from DBT |

Lineage Building

| Feature | Supported Objects | Remarks |
|---|---|---|
| Lineage | DBT RUN | Supported |
| Lineage | DBT JOB | Not Supported |

Connection Details

Prerequisites

To use the DBT Core Connector, the details specified in the following section should be available.

The minimum privileges required are:

| Operation | Access Permission |
|---|---|
| Connection Validation | Read Only Access |
| Crawling | Read Only Access |
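When the DBT Core repository is an S3 bucket, read-only access for the IAM user could look roughly like the policy built below. This is only a sketch: this page does not list the exact permissions OvalEdge needs, so the choice of `s3:ListBucket` and `s3:GetObject` is an assumption about what "Read Only Access" means here, and the helper name `build_read_only_policy` and the bucket name are hypothetical.

```python
import json


def build_read_only_policy(bucket_name: str) -> dict:
    """Return an IAM policy document granting read-only access to one bucket.

    Assumption: listing the bucket and reading its objects is sufficient for
    connection validation and crawling. Confirm the exact actions with your
    AWS administrator before using this.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Sid": "ListDbtBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {
                # Reading applies to the objects inside the bucket.
                "Sid": "ReadDbtObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "example-dbt-bucket" is a placeholder, not a real bucket.
    print(json.dumps(build_read_only_policy("example-dbt-bucket"), indent=2))
```

The printed JSON can be reviewed and attached to the IAM user whose Access Key and Secret Key are entered in the connection form.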

To establish a connection with DBT Core, follow these steps:

  1. In the OvalEdge application, click on the "Administration" module from the left panel menu.

  2. Select "Connectors", which will take users to the Connectors page.

  3. Click the "+" (New Connector) button at the top-right corner of the page. This opens the "Add Connector" pop-up window.

  4. To find the DBT Core connector, use the search bar and click the widget for the DBT Core connector. This opens the "Manage Connection" pop-up window.

  5. Proceed to fill in the required fields with the relevant information to configure the DBT Core connection.

Note: The asterisk (*) denotes mandatory fields required for establishing a connection. 

The following are the field attributes required for the connection of DBT Core.

Property

Mandatory/ Optional

Details

Connection Type

Mandatory

It allows users to select the connector from the drop-down menu. By default, 'DBT Core' is displayed as the selected connection type.

License Type

Mandatory

OvalEdge offers a default Base Connector License for all connectors, allowing users to crawl and profile data sources to obtain metadata and statistical information. Additionally, OvalEdge provides various License Add-Ons to cater to different connector functionality requirements:

  • Auto Lineage Add-On: To enable the automatic construction of data object Lineage for a connector, choose the Auto Lineage Add-On license, which supports the Lineage feature.

  • Data Quality Add-On: To identify, report, and resolve data quality issues for a connector that requires data quality support, opt for the Data Quality Add-On license. This license provides features such as Data Quality Rules/functions, Anomaly detection, Reports, and more.

  • Data Access Add-On: For connector access management via OvalEdge with the Remote Data Access Management (RDAM) feature enabled, select the Data Access Add-On license.

Connector Environment 

Optional

Select the desired environment for the connector from the drop-down list. The environment field indicates the environment in which the connector is established, such as PROD or STG.

Note: The steps to set up environment variables are explained in the prerequisite section.

Connector Name

Optional

Provide a connection name for the DBT Core in OvalEdge. This name will serve as a reference to identify the DBT Core. 

Example: DBT Core Connection

DBT Core Repository Type

Mandatory

It allows users to select the repository type where the DBT Core projects are stored, such as S3 or NFS, from the drop-down menu.

Authentication

Optional

  • IAM User Authentication verifies the identity of individual users attempting to access a system or resource.

  • Role-based authentication is not supported in DBT Core for S3.

S3 Bucket name

Mandatory

Enter the name of the S3 bucket where DBT resides.

Access Key 

Mandatory

Enter the Access Key of the IAM user.

Secret Key

Mandatory

Enter the Secret Key of the IAM user.

Enter Account Id

Mandatory

Enter the Account ID associated with the DBT Core.

Filter by tags

Optional 

Enter the tags associated with the Bucket/Object, if any.

Region

Optional

Enter the AWS Region of the S3 bucket.

SSO Connection Id

Optional

Connection Id of the identity provider’s connection [Azure, Okta, AVM … etc]

SSO Application Id

Optional

Application Id crawled from the identity provider’s connection [Azure, Okta, AVM … etc]

RDAM Policy Folder Path

 

Bucket/Folder path in the S3 to write the policies.

SSO Role Prefix

Optional

Role name from the crawled roles of the identity provider’s connection [Azure, Okta, AVM … etc] 

Default Governance Roles

Mandatory

Steward*: Select the Steward from the drop-down options.

Custodian*: Select the Custodian from the drop-down options.

Owner*: Select the Owner from the drop-down options.

Admin Roles

Mandatory

Selecting the appropriate admin roles ensures that the users associated with those roles have the permissions and responsibilities needed to manage the connector effectively while maintaining security and governance. Assign the admin roles for this connector as follows:

  • Integration Admins: To add Integration Admin Roles, search for or select one or more roles from the drop-down options. Once selected, click on the Apply button. 

Integration Admins are responsible for configuring crawling and profiling settings for the connector, as well as deleting connectors, schemas, or data objects.

  • Security & Governance Admins: To add Security and Governance Admin roles, search for or select one or more roles from the drop-down options. Once selected, click on the Apply button.

Security and Governance Admins have the following responsibilities:

  • Configuring role permissions for the connector and its associated data objects.

  • Adding Security and Governance Admins to set permissions for roles on the connector and its associated data objects.

  • Updating governance roles.

  • Creating custom attributes.

  • Creating Service Request templates for the connector.

No. of Archive Objects 

Mandatory

This number represents the most recent modifications made to the metadata of a remote/source (for report connection purposes) and can be set during the crawling connection setup.

Select Bridge

Optional

The OvalEdge Bridge component enables seamless connectivity between cloud-hosted servers and on-premise or public cloud data sources without requiring modifications to firewall rules. It offers real-time control, simplifying the management of data movement between different sources and destinations. 

For more information, refer to Bridge Overview

Path

Mandatory

If the ‘NFS’ repository type is selected, the path to the DBT projects is mandatory.

Note: 

  • When the DBT Core Repository Type is 'NFS', entering the NFS path is mandatory.

  • Folder structure: The parent folder (dbt_artifactory) should contain only DBT project folders.

  • For S3, the bucket name should not contain any slashes (/ or \). Enter only the bucket name.

  • The manifest.json file must be present in the target folder.
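The notes above can be checked before the connection is validated. The sketch below is not part of OvalEdge; the function names are hypothetical, and it assumes that each DBT project keeps its artifacts in dbt's default `target` sub-folder, where dbt writes the file as lowercase `manifest.json`.

```python
from pathlib import Path


def is_valid_bucket_name(bucket_name: str) -> bool:
    """The S3 Bucket name field should hold only the bucket name, no slashes."""
    return bool(bucket_name) and "/" not in bucket_name and "\\" not in bucket_name


def find_layout_problems(parent_folder: str) -> list[str]:
    """List problems in an NFS parent folder that should hold only DBT folders.

    Assumption: every DBT project folder contains target/manifest.json.
    """
    parent = Path(parent_folder)
    if not parent.is_dir():
        return [f"{parent} is not a directory"]

    problems = []
    for entry in sorted(parent.iterdir()):
        if not entry.is_dir():
            # The parent folder should contain nothing but project folders.
            problems.append(f"{entry.name}: not a folder")
        elif not (entry / "target" / "manifest.json").is_file():
            problems.append(f"{entry.name}: target/manifest.json is missing")
    return problems


if __name__ == "__main__":
    # "/mnt/nfs/dbt_artifactory" is a placeholder path.
    for problem in find_layout_problems("/mnt/nfs/dbt_artifactory"):
        print(problem)
```

An empty result from `find_layout_problems` means the folder matches the layout described in the notes; any listed entry should be fixed or moved out of the parent folder before crawling.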

After filling in all the DBT Core connection details, select the appropriate button: 

    1. Validate: Click on the Validate button to verify the connection details. This ensures that the provided information is accurate and enables successful connection establishment.

    2. Save: Click on the Save button to store the connection details. Once saved, the connection will be added to the Connectors home page for easy access.

Error Validation Details

The following are expected errors that can be encountered while establishing the connection.

| S.No. | Error Message | Description |
|---|---|---|
| 1 | Failed to establish a connection, Please check the credentials | Provide valid credentials or ensure proper access. |
| 2 | Error occurred while validating DBT Core connection. | 403: Access denied. Provide appropriate access to the user or role used in the connection. 404: No such key. The object does not exist in the remote source. |

Connector Settings 

Configure the DBT Core Connector settings for crawling by following these steps:

  1. Select the DBT Core Connection Name from the Crawler Information page.

  2. Click the nine-dots button and choose "Settings".

  3. A "Connection Settings" pop-up window will appear for further configuration.

Configure the Lineage settings as follows:

  1. Select one or more servers to build lineage.

  2. If more than one server is selected, lineage is attempted with the first server type. If that fails, the second server type is used, and so on.

  3. Configure multiple servers and prioritize connections when selecting tables for lineage building.

  4. Use the "Source Server Type" and "Connections Priority" sections to control the lineage process precisely.
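The priority-based fallback described in the lineage settings can be pictured with the short sketch below. It is an illustration only, not OvalEdge's implementation: this page does not define what counts as a failure, so the sketch treats a lookup that returns `None` as a failed attempt, and the server type names and catalog contents are hypothetical.

```python
from typing import Callable, Optional


def resolve_with_fallback(
    table_name: str,
    server_types: list[str],
    lookup: Callable[[str, str], Optional[str]],
) -> Optional[str]:
    """Try each server type in priority order and return the first match."""
    for server_type in server_types:
        match = lookup(server_type, table_name)
        if match is not None:
            return match
    # No configured server type could resolve the table.
    return None


if __name__ == "__main__":
    # Hypothetical catalog: the tables each server type can resolve.
    catalog = {
        "snowflake": {"orders": "snowflake.sales.orders"},
        "postgres": {"customers": "postgres.public.customers"},
    }

    def lookup(server_type: str, table_name: str) -> Optional[str]:
        return catalog.get(server_type, {}).get(table_name)

    # "customers" is not found under the first server type, so the second is used.
    print(resolve_with_fallback("customers", ["snowflake", "postgres"], lookup))
    # prints "postgres.public.customers"
```

The order of `server_types` plays the role of the priority list: placing a server type earlier means it is tried first for every table.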

FAQs

Why does DBT Core not have column-level lineage?

  • The DBT API does not support column-level lineage.

  • As a result, OvalEdge cannot parse or display column-level lineage for source tables.
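To illustrate the granularity involved: in dbt's manifest.json, each entry under `nodes` lists the nodes it depends on in `depends_on.nodes`, which relates whole models and sources rather than individual columns. The sketch below extracts those table-level edges from a parsed manifest. Whether OvalEdge reads exactly these fields is an assumption, and the sample manifest is a heavily trimmed, hypothetical example.

```python
def table_level_edges(manifest: dict) -> list[tuple[str, str]]:
    """Return (upstream, downstream) node-id pairs from a parsed manifest.json."""
    edges = []
    for node_id, node in manifest.get("nodes", {}).items():
        # depends_on.nodes holds ids of whole models/sources, not columns.
        for upstream_id in node.get("depends_on", {}).get("nodes", []):
            edges.append((upstream_id, node_id))
    return edges


if __name__ == "__main__":
    # Hypothetical, trimmed manifest; a real file carries many more fields.
    sample_manifest = {
        "nodes": {
            "model.shop.stg_orders": {
                "depends_on": {"nodes": ["source.shop.raw.orders"]}
            },
            "model.shop.orders": {
                "depends_on": {"nodes": ["model.shop.stg_orders"]}
            },
        }
    }
    for upstream, downstream in table_level_edges(sample_manifest):
        print(f"{upstream} -> {downstream}")
```

Each edge connects one table-like object to another, which is the level at which lineage can be built from these entries.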


Copyright © 2023, OvalEdge LLC, Peachtree Corners GA USA