The Data Catalog Files page lists all the cataloged folders and files from every connected file system. A root folder can contain multiple subfolders and files. You can select a cataloged folder or file and build metadata around it: associate tags and terms, author business descriptions, add source and target lineage, and add references to multiple data objects. The list shows each file with its unique File Location, and you can select a file or a folder to view its contents. OvalEdge supports file formats such as comma-separated values (.csv), Microsoft Excel (.xlsx), JSON, deeply nested JSON, pipe-delimited values, AVRO, and Parquet.
File | Description |
---|---|
Type | Data Catalog Files consist of Folders and Files. A folder may contain one or more folders or files. Filtering by type (FILE or FOLDER) narrows the results accordingly. |
File System | Displays the connection name of the File. |
File Name | Displays the name of the File. |
File Location | Displays the location of the File. |
Access Cart | The Add to Access Cart icon adds the file object to the access cart. |
Tags | Displays the Tags assigned to the data object. The Tags field is editable; hover over a specific tag field to see an edit icon, then click the icon to edit and assign tags to the object. |
Term | Displays the Terms assigned to the data object. The Terms field is editable; hover over a specific term field to see an edit icon, then click the icon to edit, assign, or remove terms. |
Remote Tags | Displays the Remote Tags added to the File. |
Business Description | Displays the business description added to the data object. It is editable; click the edit icon to edit the description. |
Technical Description | Displays the technical description added to the data object. It is editable; click the edit icon to edit the description. |
Metadata | Displays the metadata information added. |
Created Date | Displays the date on which the File was created. |
Last Modified Date | Displays the date and time at which the File was last modified. |
Popularity | Displays the number of times users have interacted with this data object by viewing, endorsing, commenting, adding tags, or querying it. |
Steward | Displays the name of the steward. |
Custodian | Displays the name of the custodian. |
Owner | Displays the name of the owner. |
Certification | You can certify an object with the Certify, Caution, Violation, Inactive, or None options. |
Folders
When you select a folder, the OvalEdge data catalog shows the folder statistics; when you select a file, it shows the file statistics. The folder detail page has a layout similar to that of the database objects in the data catalog module. The detail page is divided into the Summary, Sub Files, Data, Cataloged Files, Lineage, and References tabs.
The tabs displayed depend on the type of object selected:
- For a folder, the Cataloged Files tab is displayed.
- For a data file (for example, .csv), the Data tab is displayed.
- For files that contain subfiles, such as XLS and XLSX, the Sub Files tab is displayed.
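The selection rules above can be read as a small lookup. The sketch below is purely illustrative: the function name `type_specific_tab` and its arguments are hypothetical and are not part of any OvalEdge API. It encodes only the three cases stated in this section and returns `None` for anything else.

```python
from pathlib import PurePosixPath
from typing import Optional


def type_specific_tab(name: str, is_folder: bool) -> Optional[str]:
    """Return the type-specific tab described in this section (hypothetical helper)."""
    if is_folder:
        return "Cataloged Files"
    extension = PurePosixPath(name).suffix.lower()
    if extension == ".csv":
        return "Data"
    if extension in {".xls", ".xlsx"}:
        return "Sub Files"
    # This section does not state a rule for other file types.
    return None


print(type_specific_tab("exports", is_folder=True))
print(type_specific_tab("exports/orders.csv", is_folder=False))
print(type_specific_tab("exports/budget.xlsx", is_folder=False))
```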
Folder Summary
The Folder Summary provides the metadata and statistical details of a specific folder. The metadata includes descriptive information about the data, such as the Folder Title, Business Description, Technical Description, Tags and Terms added, Permissions, when and how the data object was created, and the creator. The statistics cover parameters related to the File or Folder, including Row count, Column count, Service request count, Access, Quality Index, Popularity, Profiled, Profile Date, Importance, and Size; modification details such as Last Catalog date, Last populated date, Last modified date, and Certified date; and other information such as Profile Status, Type, and Dashboard.
Cataloged Files
The Cataloged Files tab lists all the files and folders cataloged under the selected folder from the connection. Cataloging happens at both the folder level and the sub-folder level.
Cataloged Files | Description |
---|---|
Position | Shows the position of the file in the folder. |
Type | Indicates whether the entry is a File or a Folder. |
File Name | Name of the file. |
Description | Short description of the file. |
Extension | Extension of the file, such as CSV. |
Status | Sample |
Row count | Displays the total number of rows. |
Dashboard | DQR score with dimension. |
Lineage
The Lineage tab shows the source and target lineage of the selected folder. Data lineage is a visual representation of where the data originates, the path it takes, how it reaches the target, and the transformations it undergoes during its lifecycle. The lineage of a file shows the upstream and downstream objects associated with a single File object, so you can track, understand, record, and visualize data transformations along the path from source to destination.
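To make "upstream" and "downstream" concrete, here is a minimal sketch that treats lineage as a directed graph of (source, target) pairs and walks it in both directions from one file. It is an illustration of the concept only, not how OvalEdge stores or computes lineage; the function names and the object names in `flows` are made up for the example.

```python
from collections import deque


def reachable(edges, start):
    """Breadth-first walk over an adjacency mapping; the start node is excluded."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in edges.get(node, ()):
            if neighbor not in seen and neighbor != start:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def lineage(flows, obj):
    """Return (upstream, downstream) object sets for obj from (source, target) pairs."""
    forward, backward = {}, {}
    for source, target in flows:
        forward.setdefault(source, []).append(target)
        backward.setdefault(target, []).append(source)
    return reachable(backward, obj), reachable(forward, obj)


# Hypothetical data flows around one cataloged file.
flows = [
    ("crm.customers", "exports/customers.csv"),
    ("exports/customers.csv", "warehouse.dim_customer"),
    ("warehouse.dim_customer", "reports/churn_dashboard"),
]
upstream, downstream = lineage(flows, "exports/customers.csv")
print(sorted(upstream))    # objects the file is derived from
print(sorted(downstream))  # objects that depend on the file
```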
References
The References tab lists all references made to the File from other data objects in the application (database tables or the File itself). The benefit of having solid reference data is that you can confidently drill into subsets of your data to gain business insights.
Files
Data Catalog Files also lets you look for individual files. Use the Type filter to narrow the list to a file type, then click a particular file.
When you select a file, its detail page shows the details of that file and consists of the following tabs:
File Tabs | Description |
---|---|
File Summary | File descriptions, file statistics, and the Tags and Terms associated with the file. |
Data | Displays the data of the file. |
Sub Files | For file types that contain subfiles, such as XLS and XLSX, displays the subfile details. |
Lineage | The source and target lineage of the selected file. |
References | List of all objects that reference this file. |
Column Details | Column statistics, Column Lineage, and Column References. |
File Summary
File Summary | Description |
---|---|
Business Description | A business description provides a clear understanding of the data objects (Tables/Files/Reports) and their function. It is descriptive information about the data object and its fields that is helpful for business users. By default, the description box is empty, and the user can update it accordingly. Click the edit icon to edit the business description. Note: Only users with Meta Write (Read-Write) access can edit Business Descriptions. |
Technical Description | Displays the technical parameters or comments defined at the data source. It is editable; click the edit icon to modify the existing technical description. |
Terms | Displays the terms associated with the file. The Terms field is editable; click the edit icon to edit, assign, or remove a term. |
Tags | Displays the tags associated with the file. The Tags field is editable; click the edit icon to edit and assign tags. |
Last Populated Date | Displays the date and time at which the data object was last modified. |
Service request count | Displays the number of service requests raised on the data catalog files to Request Content Change, Request Access, or Report Data Quality. |
Access | The Instruction button shows the instructions added to the Access Instructions field through Administration > Crawler > Settings > Access Instructions. Crawler administration will benefit from the valuable information it provides. |
Quality Index | The Data Quality Index gives an overall idea of the quality of the data object based on the service tickets raised and resolved. Multiple unresolved issues indicate poor data quality. |
Popularity | The Popularity score displays the number of times users have interacted with this data object by viewing, endorsing, commenting, adding tags, or querying it. The total view count is displayed to show how popular the data asset is relative to other assets in the application. |
Profiled | The Profiled field on the object summary page displays “Yes” or “No”. |
Profiled Date | Displays the latest date and time at which the data object was profiled to compute statistics on new data. If this attribute is empty, the object has not been profiled. |
Importance | The Importance score shows how vital a File object is across the database, based on the lineage (downstream objects) associated with the File. |
Catalog Date | The date on which the file information was saved in the OvalEdge application. |
Profile Status | Displays the profiling state of the data object. |
Type | Displays whether the object is a folder or a file. |
Profiling options for folders and files
- Use the “Profile All Files” option to profile the files inside a folder.
- Use the “profiling a folder assuming same content” option to profile a folder at one specific folder level only. For example, specifying the 3rd level profiles only the folders at the 3rd level and skips the 1st and 2nd levels.
Note: A job is submitted when a File or Folder is profiled. You can see the resulting statistics at the bottom of the summary page of the file or folder.
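The level-based option can be pictured as a depth filter over the folder tree. The sketch below is only an illustration of that idea, not OvalEdge's implementation. It assumes that folder paths are given relative to the selected folder and that level 1 means its direct subfolders; the helper name `folders_at_level` and the sample paths are hypothetical.

```python
from pathlib import PurePosixPath


def folders_at_level(folder_paths, level):
    """Keep only the folders that sit exactly `level` levels below the selected folder."""
    return [path for path in folder_paths if len(PurePosixPath(path).parts) == level]


# Hypothetical folder tree, relative to the selected folder.
folders = [
    "2023",
    "2023/q1",
    "2023/q1/sales",
    "2023/q1/returns",
    "2023/q2",
    "2023/q2/sales",
]

# Level 3 selects only the third-level folders; levels 1 and 2 are skipped.
selected = folders_at_level(folders, 3)
print(selected)
```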
File Data
The Data tab is displayed depending on the file type selected. After the file is crawled, the tab displays the file's top 100 records (rows) in a grid.
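For a sense of what a top-100 preview involves, the following sketch reads only the first 100 data rows of a CSV stream and stops there, without loading the rest of the file. It is a generic illustration rather than OvalEdge code; the function name `preview_rows` is hypothetical, and treating the first line as a header that does not count toward the 100 rows is an assumption.

```python
import csv
import io
from itertools import islice


def preview_rows(text_stream, limit=100):
    """Return (header, rows) with at most `limit` data rows from a CSV text stream."""
    reader = csv.reader(text_stream)
    header = next(reader, None)         # assumption: the first line is a header row
    rows = list(islice(reader, limit))  # stops reading once `limit` rows are collected
    return header, rows


# Hypothetical 250-row file held in memory for the example.
sample = io.StringIO("id,name\n" + "".join(f"{i},item{i}\n" for i in range(250)))
header, rows = preview_rows(sample)
print(header, len(rows))
```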
Lineage
The lineage graph displays the data movement and data flow from the source to the destination of file objects.
References
The References tab lists all data objects (such as database tables and the File itself) that reference the File.
Column Details
By default, the Column Details tab is inactive. When you profile a file or folder, the Column Details tab becomes active and provides the Column Summary, Column Lineage, and Column References tabs for the profiled file or folder.
For more information, see the Data Catalog File Columns document.
User Actions
The following User Actions can be performed on Files, Folders, and File Columns using the Nine Dots option. For more information, see the User Action document.
Nine Dots options | Description |
---|---|
Add Tag | Adds Tags to the selected File(s). |
Remove Tag | Removes the Tags applied to the selected File(s). |
Add Term | Adds Terms to the selected File(s). |
Remove Term | Removes the Terms applied to the selected File(s). |
Add to Default Project / Add to My Access Cart | Adds the selected File(s) to the default Project set in the Project module. The Add to My Access Cart option is displayed if the Default Project is set to the My Access Cart Project. |
Service desk | Used to report a data quality issue or request a content change on files and file columns. |
Remove from Default Project | Removes the selected File(s) from the default Project. |
Update Governance Roles | Changes or updates the governance roles (Owner, Custodian, Steward, and other roles) for the selected File(s). |
Change Certification Type | Changes the certification type for the selected File(s). Data certification is a stamp of approval to ensure the data is consistent, timely, and correct. It lets you filter reports based on their certification status. |
Add Files to Impact Analysis | Shows the impact of changes made to the File(s) on other data objects upstream or downstream. The impacted objects can be viewed in Advanced Tools > Impact Analysis. |
Quick Tips | Provides a few insights about the File(s). |
Copyright © 2019, OvalEdge LLC, Peachtree Corners GA USA