
Dataset Lineage Versioning

Dataset Lineage Versioning tracks the history, changes, and versions of data as it moves and transforms through the stages of a data pipeline. In data management and analytics, it supports data integrity, reproducibility, and compliance.

Dataset Lineage

Lineage tracking is the ability to trace the origins, transformations, and movement of data across the data pipeline. It provides a map of data flow from source to destination.


  • Version Control
    • Maintains multiple versions of datasets to record changes over time.
  • Historical Data
    • Enables access to previous versions of the dataset, allowing for comparison and rollback if necessary.
  • Whenever a dataset is modified in the source system, a new lineage version is built in the OvalEdge platform. The newest version is shown first by default.
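
The behavior described above (a new version per change, newest shown first, older versions available for comparison) can be sketched in a few lines of Python. This is only an illustration of the concept, not the OvalEdge implementation: the class names, the representation of lineage as (source, target) pairs, and the dataset and table names are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class LineageVersion:
    """One snapshot of a dataset's lineage (hypothetical structure)."""
    number: int
    created_at: datetime
    edges: frozenset  # (source, target) pairs describing data flow


@dataclass
class LineageHistory:
    """Keeps every lineage version recorded for a single dataset."""
    dataset: str
    versions: list = field(default_factory=list)

    def record(self, edges):
        """Store a new version, e.g. after the dataset changes in the source."""
        version = LineageVersion(
            number=len(self.versions) + 1,
            created_at=datetime.now(timezone.utc),
            edges=frozenset(edges),
        )
        self.versions.append(version)
        return version

    def list_versions(self):
        """Return versions newest first, matching the default display order."""
        return list(reversed(self.versions))

    def get(self, number):
        """Fetch an earlier version for comparison or rollback."""
        for version in self.versions:
            if version.number == number:
                return version
        raise KeyError(f"no lineage version {number} for {self.dataset}")


def diff_edges(old, new):
    """Compare two versions: which data-flow edges were added or removed."""
    return new.edges - old.edges, old.edges - new.edges


# Example with made-up dataset and table names.
history = LineageHistory("sales_summary")
history.record([("orders", "sales_summary")])
history.record([("orders", "sales_summary"), ("customers", "sales_summary")])
added, removed = diff_edges(history.get(1), history.get(2))
```

Here `history.list_versions()` returns version 2 before version 1, and `added` holds the single new edge from `customers`.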

Activate Lineage Versioning

Activating lineage versioning stores the lineage history of datasets under the configured connections. Currently, RDBMS connections are supported.

  • Go to Administration > System Settings > Lineage, search for the key 'Versioning.for.lineage.connection', and enter the Connector ID as its value. The default value is empty. To cover several connections, enter multiple Connector IDs separated by commas.
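
As a small aid for preparing the value, the following Python sketch builds and reads a comma-separated list of Connector IDs. It is not part of OvalEdge: the function names are hypothetical, the IDs shown are made up, and the whitespace trimming and de-duplication are assumptions about what makes a tidy value rather than documented platform behavior.

```python
def parse_connector_ids(value: str) -> list[str]:
    """Split a comma-separated setting value into unique, trimmed IDs."""
    ids: list[str] = []
    for part in value.split(","):
        connector_id = part.strip()
        # Skip empty entries (e.g. from a trailing comma) and duplicates.
        if connector_id and connector_id not in ids:
            ids.append(connector_id)
    return ids


def format_connector_ids(ids) -> str:
    """Build the value to paste into 'Versioning.for.lineage.connection'."""
    joined = ",".join(str(connector_id) for connector_id in ids)
    return ",".join(parse_connector_ids(joined))


# Hypothetical Connector IDs, used only for illustration.
setting_value = format_connector_ids([1001, " 1002", 1001])
```

With these inputs, `setting_value` is `1001,1002`. Parsing an empty string yields an empty list, which corresponds to the empty default.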

Dataset Lineage Version History

Once the setting is configured, the dataset summary page displays a new option, ‘Version History’, beside the code name. Clicking Version History lists the lineage versions available for the dataset.

Lineage Version

Selecting a version sets the lineage to the chosen version, and all tabs update with information based on that version's lineage.


Copyright © 2024, OvalEdge LLC, Peachtree Corners, GA USA