Data Quality Functions are advanced analytical functions designed to compute and validate numeric and alphanumeric data. OvalEdge provides a rich library of 57 predefined Data Quality Functions that facilitate a wide range of computations for data quality management. These functions can be applied to various data objects, including Tables (TB), Table Columns (TC), Files (FL), File Columns (FC), or SQL queries (SQL).
Let's explore some of the key aspects of OvalEdge's Data Quality Functions with sample examples to enhance understanding:
- Supported Data Objects: OvalEdge's Data Quality Functions can be executed on the following data objects:
- Tables (TB): This includes structured data stored in database tables.
- Table Columns (TC): These are individual columns within database tables.
- Files (FL): These are files uploaded or stored in the OvalEdge application.
- File Columns (FC): These represent specific columns within files.
- SQL Queries (SQL): These are custom SQL Queries executed on supported database connectors.
- Supported Connectors: OvalEdge supports various connectors to interact with different data sources. The connector support depends on the data object:
- For Table and Table Column related functions, OvalEdge supports connectors such as MySQL, SQL Server, Oracle, PostgreSQL, Snowflake, and more. These connectors enable seamless interaction with the respective databases to perform data quality operations.
- For File and File Column related functions, OvalEdge supports connectors like S3 (Amazon Simple Storage Service), NFS (Network File System), and Azure Data Lake Storage. These connectors enable reading and processing files stored in these storage systems.
- Supported File Formats: OvalEdge supports multiple file formats for File and File Column related functions. Some supported file formats include:
- .csv (Comma-Separated Values): This format is commonly used for tabular data storage, where data values are separated by commas.
- .json (JavaScript Object Notation): This format is used for structured data representation in a human-readable format using key-value pairs.
- .parquet: This columnar storage file format is optimized for efficient data processing and analytics.
Example:
Consider a scenario where you have a table named "SalesData" with columns such as "ProductID", "Quantity", and "Price". You want to calculate the total revenue generated by each product. OvalEdge provides a Data Quality Function called "Sum(TC)" that can be applied to the "Price" column of the "SalesData" table.
- Function: Sum (TC)
- Data Object: Table Column (TC)
- Connector: MySQL
- Table: SalesData
- Column: Price
- Input: Success Range: 100 to 200.
- Output: The output of the function can be one of the following:
- Success: If the computed value falls within the specified range, the rule is considered successful.
- Failure: If the computed value is outside the specified range, the rule is considered a failure.
- Undetermined: If the function encounters any issues or constraints that prevent it from executing properly, the result is undetermined. For example, if the selected table column is empty and does not contain any data.
- Not Executed: This status indicates that the rule has not been executed.
- Execution Failed: This status indicates that the execution of the rule has failed due to an error or unexpected behavior, such as encountering issues with connecting to the data source.
By executing the "Sum (TC)" function on the "Price" column of the "SalesData" table, OvalEdge will compute the sum of all the prices. This will give you the total revenue generated.
Similarly, you can leverage other Data Quality Functions provided by OvalEdge to perform a wide range of computations and validations on your data, ensuring data quality and accuracy.
OvalEdge provides a comprehensive list of 57 predefined Data Quality Functions categorized based on data object type and default dimension. These functions encompass a wide range of data quality operations, enabling users to perform calculations and validations effectively.
Object |
Default Dimension |
Data Quality Function |
Tables |
Completeness |
Null Density Percent (TB) |
Uniqueness |
Density Percent (TB) |
|
Validity |
Total Row Count (TB) |
|
Table Columns |
Completeness |
Null Density Percent (TC) |
Empty Or Null Count Percent (TC) |
||
Not Null And Not Empty Percent (TC) |
||
Uniqueness |
Density Percent (TC) |
|
Validity |
Validate Email Percent (TC) |
|
Validate Regex Percent (TC) |
||
Validate Pattern Percent (TC) |
||
Outlier Rule (TC) |
||
Is Unique Column (TC) |
||
Empty Count Range (TC) |
||
Not Empty Count Range (TC) |
||
Data Length Range (TC) |
||
Validate SSN Format (TC) |
||
Validate Credit Card Format (TC) |
||
Ends With Symbols (TC) |
||
Average (TC) |
||
Distinct (TC) |
||
Min (TC) |
||
Max (TC) |
||
Null Count Range (TC) |
||
Std Deviation (TC) |
||
Sum (TC) |
||
Top Values (TC) |
||
Files |
Validity |
File Name Contains (FL) |
File Format Validation (FL) |
||
File Created Date (FL) |
||
File Size Validation (FL) |
||
File Row Validation (FL) |
||
File Columns |
Validity |
All Column Values Matches Given Fixed Value (FC) |
First Letter Of Word Upper Case And Rest Lower Case (FC) |
||
This Column Value Should Be Unique (FC) |
||
Word Validation (FC) |
||
Number Validation (FC) |
||
Date Validation (FC) |
||
Numeric Value Between Min And Max Values (FC) |
||
Value Validation (FC) |
||
Allow Only Specified Special Character (FC) |
||
Has Length Of The Value With In Range (FC) |
||
Columns Unique Percent Value (FC) |
||
Outliers To Aggregations (FC) |
||
Has No Null Value (FC) |
||
Has No Digits or Special Chars (FC) |
||
Has Digits and Special Chars (FC) |
||
No Leading or Trailing Spaces (FC) |
||
Has Substring in String (FC) |
||
Belongs To One Of Languages (FC) |
||
Word Count With In Range (FC) |
||
Column Value Same As Other Column Value (FC) |
||
Regex Match (FC) |
||
Column Value Matches One Of Given Values (FC) |
||
Query |
Uniqueness |
Sql Exact Value (SQL) |
Validity |
Custom Sql Exact Value (SP) |
|
Sql Value Contains (SQL) |
||
Sql Value Range (SQL) |
Data Quality Functions for Tables
Null Density Percent (TB)
Name |
Null Density Percent (TB) |
||||||||||||||||||||||||
Description |
The Null Density Percent(TB) function is a data quality function used to calculate the percentage of null values in a table. It determines the count of null values across all table columns and expresses it as a percentage of the total values count. The purpose of this function is to assess the presence of null values within a table and evaluate whether the null density falls within a specified percentage range. By defining a desired percentage range, you can establish the rule's success criteria. |
||||||||||||||||||||||||
Formula |
NULL DENSITY % = (COUNT OF NULL VALUES IN ALL COLUMNS / (COUNT ALL ROWS X COUNT OF COLUMNS)) * 100 |
||||||||||||||||||||||||
Supported Object Type |
The Null Density Percent(TB) function operates on tables. It allows you to evaluate the null density percentage in a table. |
||||||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
||||||||||||||||||||||||
Input |
This input allows you to define the desired range within which the null density percentage should fall. It requires two values, the lower and upper ranges, expressed as percentages between 0 and 100. |
||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. |
||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||
Let's consider an example to illustrate the usage of the Null Density Percent(TB) function. We have a table named "LOCATION" as shown below:
Suppose we set the input success range to be 0 to 50. In this case, the NULL values count is 8, and the total count is 21. To determine the null density percentage, we divide the count of NULL values (8) by the total count (21) and multiply by 100 Null Density Percent is calculated as (8 / 21) * 100 = 38%. Since the computed value of 38% falls within the specified range of 0 to 50, the rule is considered Successful. |
Density Percent (TB)
Name |
Density Percent (TB) |
|||||||||||||||||||||||||||
Description |
The Density Percent(TB) function is a data quality function used to determine the percentage of distinct values in a table. It calculates the sum of distinct values count in each column of a table and returns the value as a percentage of the total values count and expresses it as a percentage. The purpose of this function is to assess the presence of distinct values within a table and evaluate whether the density falls within a specified percentage range. By setting a desired percentage range, you can define the success criteria for the rule. |
|||||||||||||||||||||||||||
Formula |
DENSITY % = ((SUM OF COUNT OF DISTINCT VALUES IN EACH COLUMN) / (COUNT OF ALL ROWS X COUNT OF ALL COLUMNS)) * 100 |
|||||||||||||||||||||||||||
Supported Object Type |
The Density Percent(TB) function operates on tables. It allows you to evaluate the density percentage specifically across the columns within a table. |
|||||||||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
|||||||||||||||||||||||||||
Input |
This input allows you to define the desired range within which the density percentage should fall. It requires two values, the lower and upper bounds, expressed as percentages between 0 and 100. |
|||||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. |
|||||||||||||||||||||||||||
Example |
||||||||||||||||||||||||||||
Let's consider an example to illustrate the usage of the Density Percent(TB) function. We have a table named "LOCATION" shown below:
Suppose we set the input success range to be 0 to 100. In this case, the Sum of the Count of Distinct values in each column comes to 5 + 7 + 6 = 18 To determine the density percentage, we divide the count of all rows and columns values (8 X 3) by the total count of distinct values (18) and multiply by 100: Density Percent is calculated as (18 / (8x3)) * 100 = 75%. Since the computed value of 75% falls within the specified range of 0 to 100, the rule is considered Successful. |
Total Row Count (TB)
Name |
Total Row Count (TB) |
|||||||||||||||||||||||||||||||||
Description |
The Total Row Count(TB) function determines the total number of rows in a table. It provides a straightforward way to obtain the count of rows present in the specified table. The rule's success is determined by checking if the computed value falls within the specified input range. By defining a range of acceptable values, you can establish criteria for successful rule execution. |
|||||||||||||||||||||||||||||||||
Formula |
There is no specific formula involved in this function. The Total Row Count(TB) function simply counts the number of rows in the table. |
|||||||||||||||||||||||||||||||||
Supported Object Type |
The Total Row Count(TB) function operates specifically on tables. It allows you to determine the total number of rows in a table. |
|||||||||||||||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
|||||||||||||||||||||||||||||||||
Input |
This input allows you to define the desired range within which the total row count value should fall. It requires two values, the lower and upper bounds, expressed as percentages between 0 and 100. |
|||||||||||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. |
|||||||||||||||||||||||||||||||||
Example |
||||||||||||||||||||||||||||||||||
To illustrate the usage of the Total Row Count (TB) function, let's consider a table named "EMPLOYEE" that contains 21 rows. Table: EMPLOYEE
Suppose we set the input success range to be 10 to 30. In this case, the computed row count is 10, which falls within the specified range. Therefore, the rule is Successful. |
Data Quality Functions for Table Columns
Null Density Percent (TC)
Name |
Null Density Percent (TC) |
||||||||||||||||||||||||
Description |
The Null Density Percent(TC) function is a data quality function used to determine the percentage of null values in a specific column of a table. It calculates the count of null values in the specified column and expresses it as a percentage of the total values in that column. The purpose of this function is to assess the presence of null values within a specific column. It also evaluates whether the null density falls within a specified percentage range. By defining a desired percentage range, you can establish the rule's success criteria. |
||||||||||||||||||||||||
Formula |
NULL DENSITY (%) IN THE TABLE COLUMN = (NUMBER OF NULL VALUES IN THE TABLE COLUMN / TOTAL NUMBER OF VALUES IN THE TABLE COLUMN) * 100 |
||||||||||||||||||||||||
Supported Object Type |
The Null Density Percent(TC) function operates specifically on table columns. It allows you to evaluate the null density percentage in a specific column of a table. |
||||||||||||||||||||||||
Supported Data Types |
The Null Density Percent(TC) function supports a wide range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, and timestamp. |
||||||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
||||||||||||||||||||||||
Input |
This input allows you to define the desired range within which the null density percentage should fall. It requires two values, the lower and upper bounds, expressed as percentages between 0 and 100. |
||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. |
||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||
Let's consider a table column "COUNTRY" with the following values:
Suppose we set the input success range to be 0 to 50. In this case, there are 3 NULL values out of 7 total values in the "COUNTRY" column. Hence, the Null Density Percent is calculated as (3 * 100) / 7 = 42%. Since the computed null density percentage of 42% falls within the specified range, the rule is considered Successful. |
Empty Or Null Count Percent (TC)
Name |
Empty Or Null Count Percent (TC) |
||||||||||||||||||||||||
Description |
The Empty Or Null Count Percent(TC) function is a data quality function used to calculate the percentage of empty or null values in a table column. It determines the proportion of empty or null values among the total values in the column and expresses it as a percentage. The purpose of this function is to assess the presence of empty or null values within a column and evaluate whether the count falls within a specified percentage range. By setting a desired range, you can define the success criteria for the rule. |
||||||||||||||||||||||||
Formula |
EMPTY OR NULL COUNT PERCENT (%) = (COUNT OF EMPTY OR NULL VALUES IN THE COLUMN / TOTAL COUNT OF VALUES IN THE COLUMN) * 100 |
||||||||||||||||||||||||
Supported Object Type |
The Empty Or Null Count Percent(TC) function operates on table columns. It allows you to evaluate the percentage of empty or null values in a specific column. |
||||||||||||||||||||||||
Supported Data Types |
This function supports the following data types within the table column: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, timestamp. |
||||||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
||||||||||||||||||||||||
Input |
This input allows you to define the desired range within which the empty or null count percentage should fall. It requires two values, the lower and upper bounds, expressed as percentages between 0 and 100. |
||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||
Let's consider an example to illustrate the usage of the Empty Or Null Count Percent(TC) function. We have a table column named "COUNTRY" with the following values:
In this example, the column has 2 empty values and 2 NULL values out of 7 values. To calculate the empty or null count percent, we divide the count of empty or null values (4) by the total count of values (7) and multiply by 100 Empty Or Null Count Percent is calculated as (4 / 7) * 100 = 57%. Suppose we set the input success range for this rule to 0 to 20. In this case, the computed empty or null count percent is 57%. Since the computed value of 57% is not within the specified range, the rule is considered a Failure. |
Not Null And Not Empty Percent (TC)
Name |
Not Null And Not Empty Percent (TC) |
||||||||||||||||||||||||
Description |
The Not Null And Not Empty Percent(TC) function is a data quality function used to calculate the percentage of not empty and non-null values in a table column. It determines the proportion of values that are not empty or null out of the total values in the column and expresses it as a percentage. The purpose of this function is to assess the presence of not empty and non-null values within a column and evaluate whether the count falls within a specified percentage range. By setting a desired range, you can define the success criteria for the rule. |
||||||||||||||||||||||||
Formula |
NOT NULL AND NOT EMPTY PERCENT (%) = (COUNT OF NOT EMPTY AND NON-NULL VALUES IN THE COLUMN / TOTAL COUNT OF VALUES IN THE COLUMN) * 100 |
||||||||||||||||||||||||
Supported Object Type |
The Not Null And Not Empty Percent(TC) function operates on table columns. It allows you to evaluate the percentage of values that are not empty and non-null in a specific column. |
||||||||||||||||||||||||
Supported Data Types |
This function supports the following data types within the table column: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string. |
||||||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
||||||||||||||||||||||||
Input |
This input allows you to define the desired range within which the not null and not empty count percentage should fall. It requires two values, the lower and upper bounds, expressed as percentages between 0 and 100. |
||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||
Let's consider an example to illustrate the usage of the Not Null And Not Empty Percent(TC) function. We have a table column named "COUNTRY" with the following values:
In this example, the column has 2 empty values and 2 NULL values out of a total of 7 values. To calculate the not null and not empty count percent, we divide the count of not empty and non-null values (3) by the total count of values (7) and multiply by 100 Not Null And Not Empty Percent is calculated as (3 / 7) * 100 = 43%. Suppose we set the input success range for this rule to be 80 to 100. In this case, the computed not null and not empty count percent is 43%. Since the computed value of 43% is not within the specified range, the rule is considered a Failure. |
Density Percent (TC)
Name |
Density Percent (TC) |
|||||||||||||||||||||||||||||||||
Description |
The Density Percent(TC) function is a data quality function used to determine the extent of unique values (or conversely, repeated values) in a table column and represents it as a percentage. It measures the density or uniqueness of data within a column. The rule is considered successful when the computed percentage falls within the specified input range. This allows you to define the acceptable range for the uniqueness of values in the column. |
|||||||||||||||||||||||||||||||||
Formula |
DENSITY PERCENT (%) = (COUNT OF DISTINCT VALUES IN THE COLUMN / TOTAL COUNT OF VALUES IN THE COLUMN) * 100 |
|||||||||||||||||||||||||||||||||
Supported Object Type |
The Density Percent(TC) function operates on table columns. It allows you to assess the uniqueness of values within a specific column. |
|||||||||||||||||||||||||||||||||
Supported Data Types |
This function supports the following data types within the table column: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, timestamp. |
|||||||||||||||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
|||||||||||||||||||||||||||||||||
Input |
This input allows you to define the desired range within which the density percentage should fall. It requires two values, the lower and upper bounds, expressed as percentages between 0 and 100. |
|||||||||||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
|||||||||||||||||||||||||||||||||
Example |
||||||||||||||||||||||||||||||||||
Let's consider an example to illustrate the usage of the Density Percent(TC) function. We have a table column named "DEPTNO" with the following values:
In this example, the column has 7 distinct values out of a total of 10 values. To calculate the density percentage, we divide the count of distinct values (7) by the total count of values (10) and multiply by 100 Density Percent is calculated as (7 / 10) * 100 = 70%. Suppose we set the input success range for this rule to be 50 to 100. In this case, the computed density percentage is 70%. Since the computed value of 70 falls within the specified range, the rule is considered Successful. |
Validate Email Percent (TC)
Name |
Validate Email Percent (TC) |
|||||||||||||||||||||||||||||||||
Description |
The Validate Email Percent (TC) function is used to evaluate the percentage of valid email values within a specified column. It checks whether the email values conform to the standard pattern of "username@companydomain.com" or "username@companydomain.in". |
|||||||||||||||||||||||||||||||||
Formula |
Validate Email Percent = (Passed Records Count / Total Records Count) * 100 |
|||||||||||||||||||||||||||||||||
Supported Object Type |
The Validate Email Percent(TC) function operates on table columns, allowing you to assess the validity of email values within a specific column. |
|||||||||||||||||||||||||||||||||
Supported Data Types |
The Validate Email Percent (TC) function supports a wide range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string. |
|||||||||||||||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
|||||||||||||||||||||||||||||||||
Input |
Percentage (%) Range: Enter the desired range between 0 and 100 to define the success criteria. |
|||||||||||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
|||||||||||||||||||||||||||||||||
Example |
||||||||||||||||||||||||||||||||||
Consider a table column named "EMAIL" with the following values:
Suppose we set the input success range to be 80 to 100. Let's consider a table column "Email" that contains 10 email addresses. Out of these, 9 email addresses pass the validation test, while 1 email addresses fail the validation. Validate Email Percent = (Passed Records Count / Total Records Count) * 100 = (9 / 100) * 100 = 90%. In this case, the computed value of 90% falls within the specified range, indicating a successful email validation. |
Validate Regex Percent (TC)
Name |
Validate Regex Percent (TC) |
||||||||||||||||||||||||||||||||||||||||||||
Description |
The Validate Regex Percent(TC) function is used to validate if the values within a table column match a specified regular expression (regex) pattern, such as ^[A-Za-z0-9]*$. It calculates the percentage of matched values in relation to the total number of values in the column. The purpose of this function is to determine the success rate of regex validation by comparing the computed percentage with a specified input range. |
||||||||||||||||||||||||||||||||||||||||||||
Formula |
VALIDATE REGEX PERCENT = (PASSED OBJECTS COUNT / TOTAL OBJECTS COUNT) * 100 |
||||||||||||||||||||||||||||||||||||||||||||
Supported Object Type |
The Validate Regex Percent(TC) function operates on table columns, allowing you to validate the values within a specific column against a regex pattern. |
||||||||||||||||||||||||||||||||||||||||||||
Supported Data Types |
This function supports the following data types within the table column: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, timestamp. |
||||||||||||||||||||||||||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
||||||||||||||||||||||||||||||||||||||||||||
Input |
Percentage (%) Range: Enter the range between 0 and 100 to define the acceptable percentage range. Pattern: Enter the specific pattern (e.g.,^[A-Za-z0-9]*$) that the values should adhere to for successful validation. |
||||||||||||||||||||||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||||||||||||||||||||||
Let's consider a table column "NOTES" containing a set of values on which the Validate Regex Percent rule is applied.
The regular expression pattern ^[A-Za-z0-9]*$ can be broken down as follows:
Putting it all together, the pattern ^[A-Za-z0-9]*$ will match any string that consists of zero or more uppercase letters, lowercase letters, or digits. It will not match strings containing other characters or special symbols. Example: "Hello" or “HELLO” or “Hello123”This input string consists of only uppercase letters, lowercase letters, and digits. It matches the pattern because it contains zero or more characters that fall within the range specified by the character class [A-Za-z0-9]. Example: "889976545"This input string consists of only digits. It matches the pattern because it contains zero or more characters that fall within the range specified by the character class [A-Za-z0-9]. Example: "Hello OvalEdge!" or “Hello 234!” or “HELLO-234&”This input string contains a space, dash, and an exclamation mark, which are not included in the character class [A-Za-z0-9]. It does not match the pattern because it fails to meet the requirement of having only uppercase letters, lowercase letters, and digits. Example: ""This input string is an empty string. It matches the pattern because it contains zero characters, which satisfies the condition of zero or more characters specified by the * quantifier. The Passed Objects Count: 7 (values that match the specified regex pattern) The Failed Objects Count: 3 (values that do not match the specified regex pattern) Validate Regex Percent = (Passed Objects Count / Total Objects Count) * 100 = (7 / 10) * 100 = 70% Suppose we set the input success range to be 50 to 100. In this case, the computed value of 70% falls within the specified range, indicating Successful validation. |
Validate Pattern Percent (TC)
Name |
Validate Pattern Percent (TC) |
||||||||||||||||||||||||||||||||||||||||||||
Description |
The Validate Pattern Percent(TC) function is a data validation function used to assess the matching of values in a table column to a specified pattern. It calculates the percentage of values that match the given pattern and determines whether it falls within the desired range. The rule is considered successful when the computed percentage of matching values is within the specified input range. |
||||||||||||||||||||||||||||||||||||||||||||
Formula |
VALIDATE PATTERN PERCENT = (PASSED OBJECTS COUNT / TOTAL OBJECTS COUNT) * 100 |
||||||||||||||||||||||||||||||||||||||||||||
Supported Object Type |
Table Columns: The function can be applied to columns within a table for data validation. |
||||||||||||||||||||||||||||||||||||||||||||
Supported Data Types |
The Validate Pattern Percent(TC) function supports a wide range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, and integer. |
||||||||||||||||||||||||||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
||||||||||||||||||||||||||||||||||||||||||||
Input |
Percentage (%) Range: Enter the range between 0 and 100 to define the acceptable percentage range. Pattern: Enter the specific pattern (e.g., U, L, D) that the values should adhere to for successful validation. |
||||||||||||||||||||||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||||||||||||||||||||||
Consider a table column named "NOTES" with the following values:
The pattern value "ULD" is not a regular expression pattern itself but a string consisting of three characters: 'U', 'L', and 'D'. Each character represents a specific character class within the pattern. Let's break it down:
When combined, the pattern value "ULD" can be interpreted as a pattern that expects a string consisting of an uppercase letter, followed by a lowercase letter, and ending with a digit. Example: "Hello" or “HELLO” or “Hello123” or "889976545"This input string consists of only uppercase letters, lowercase letters, and digits. It matches the pattern because it contains the characters that fall within the range specified by the character class “ULD”. Example: "Hello OvalEdge!" or “Hello 234!” or “HELLO-234&”This input string contains a space, dash, and an exclamation mark, which are not included in the character class “ULD”. It does not match the pattern because it fails to meet the requirement of having only uppercase letters, lowercase letters, and digits. The Passed Objects Count: 7 (values that match the specified regex pattern) The Failed Objects Count: 3 (values that do not match the specified regex pattern) Validate Pattern Percent = (Passed Objects Count / Total Objects Count) * 100 = (7 / 10) * 100 = 70% Suppose we set the input success range to be 50 to 100. In this case, the computed value of 70% falls within the specified range, indicating Successful validation. |
Outlier Rule (TC)
Name |
Outlier Rule (TC) |
||||||||||||||||||||||||
Description |
The Outlier Rule(TC) function is a data validation function used to identify extreme values within a table column. It aims to determine whether all values in the column fall within a specified input range. The rule is considered successful when all values in the table column are within the specified range. If any value falls outside the range, the rule is considered failed. |
||||||||||||||||||||||||
Formula |
There is no specific statistical formula for the Outlier Rule(TC) function, as it primarily involves checking whether each value in the column falls within the specified range. |
||||||||||||||||||||||||
Supported Object Type |
The function can be applied to columns within a table for data validation. |
||||||||||||||||||||||||
Supported Data Types |
The Outlier Rule(TC) function supports a range of numeric data types that includes: int, bigint, number, long, decimal, double, float, smallint, and integer. |
||||||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
||||||||||||||||||||||||
Input |
This input allows you to define the desired range within which the extreme values should fall. You need to specify the lower and upper ranges as numeric values. |
||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||
Consider a table column named "DAILY_SALES" with the following values:
Let's assume that the outlier rule was applied to the "DAILY_SALES" column. The input success range specified is 500 to 10000. By comparing each value in the column to the specified range, we find that two values, 25 and 57, fall outside the range. Consequently, the Outlier Rule(TC) function fails for these values. The rule provides the following information in case of failure:
|
Is Unique Column (TC)
Name |
Is Unique Column(TC) |
||||||||||||||||||||
Description |
The Is Unique Column(TC) function is designed to determine whether all values in a table column are unique. Its purpose is to check if there are any duplicate values present in the column. |
||||||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||||||
Supported Object Type |
The Is Unique Column(TC) function can be applied to table columns, allowing you to determine the uniqueness of values within a specific column of a dataset. |
||||||||||||||||||||
Supported Data Types |
The function supports a wide range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, and timestamp. |
||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
||||||||||||||||||||
Input |
Not Applicable |
||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||
Example |
|||||||||||||||||||||
Consider a table column named "EMP_ID" with the following values:
In this example, the value 1002 is repeated, indicating that the column contains duplicates. Therefore, the Is Unique Column(TC) function fails for this column. The output for this example would be as follows:
|
Empty Count Range (TC)
Name |
Empty Count Range (TC) |
||||||||||||||||||||||||
Description |
The Empty Count Range(TC) function is designed to determine the number of empty values in a table column. Its purpose is to check if the count of empty values falls within a specified input range. |
||||||||||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||||||||||
Supported Object Type |
The Empty Count Range(TC) function supports table columns. It allows you to determine the count of empty values within a specific column of a dataset. |
||||||||||||||||||||||||
Supported Data Types |
The function supports a wide range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, and timestamp. |
||||||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
||||||||||||||||||||||||
Input |
This input allows you to define the desired range within which the count of empty values should fall. You need to specify the lower and upper bounds as numeric values. |
||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||
Consider a table column named "DAILY_SALES" with the following values:
Let's set the input success range for the "Empty Count Range(TC)" function to 0 to 3. By applying the function to the "DAILY_SALES" column, we find that the computed count of empty values is 3, which falls within the specified range. Therefore, the rule is considered Successful. The output for this example would be as follows:
|
Not Empty Count Range (TC)
Name |
Not Empty Count Range (TC) |
||||||||||||||||||||||||
Description |
The Not Empty Count Range(TC) function is designed to determine the number of non-empty values in a table column. Its purpose is to check if the count of non-empty values falls within a specified input range. |
||||||||||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||||||||||
Supported Object Type |
The Not Empty Count Range(TC) function can be applied to table columns, allowing you to determine the count of non-empty values within a specific column of a dataset. |
||||||||||||||||||||||||
Supported Data Types |
The function supports a wide range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, and timestamp. |
||||||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
||||||||||||||||||||||||
Input |
This input allows you to define the desired range within which the count of not empty values should fall. You need to specify the lower and upper bounds as numeric values. |
||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||
Consider a table column named "DAILY_SALES" with the following values:
Let's set the input success range for the "Not Empty Count Range (TC)" function to 10 to 10000. By applying the function to the "DAILY_SALES" column, we find that the computed count of non-empty values is 4, which is not within the specified range of 10 to 1000. Therefore, the Not Empty Count Range(TC) function fails for this column. The output for this example would be as follows:
|
Data Length Range (TC)
Name |
Data Length Range (TC) |
||||||||||||||||||||
Description |
The Data Length Range (TC) function is designed to determine the length of values in a table column. Its purpose is to check if the computed value falls within a specified input range. |
||||||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||||||
Supported Object Type |
The Data Length Range (TC) function supports table columns. It allows you to determine the length of values within a specific column of a dataset. |
||||||||||||||||||||
Supported Data Types |
The function supports a wide range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, and timestamp. |
||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
||||||||||||||||||||
Input |
This input allows you to define the desired range within which the length of values should fall. You need to specify the lower and upper bounds as numeric values. |
||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||
Example |
|||||||||||||||||||||
Consider a table column named "NOTES" with the following string values:
Let's set the input success range for the "Data Length Range (TC)" function to 10 to 50. By applying the function to the "NOTES" column, the function computes the length of each value in the column and compares it against the specified input success range. For each value in the "NOTES" column, we calculate its length:
Since all the values in the "NOTES" column have lengths within the specified range of 10 to 50 characters, the Data Length Range(TC) function is Successful for this column. |
Validate SSN Format (TC)
Name |
Validate SSN Format (TC) |
||||||||||||||||||||
Description |
The Validate SSN Format (TC) function is used to validate the format of Social Security Numbers (SSN) in a table column. It checks if all values in the column match the pattern "XXX-XX-XXXX", where X represents a digit, and converts the matched values to a percentage. |
||||||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||||||
Supported Object Type |
The Validate SSN Format (TC) function can be applied to table columns, specifically those containing SSN values. |
||||||||||||||||||||
Supported Data Types |
The function supports a wide range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, and string. |
||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
||||||||||||||||||||
Input |
This input allows you to define the desired range within which the percentage of SSN values should fall. You need to specify the lower and upper ranges as numeric values. |
||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||
Example |
|||||||||||||||||||||
Consider a table column named "SSN" with the following values:
Suppose we set the input success range to be 80 to 100. Let's apply the Validate SSN Format(TC) function to the "SSN" column. By validating the format of each SSN value, we observe the following:
The output for this example would be as follows:
In this example, one out of the four SSN values match the specified format, resulting in a success percentage of 25%. Percentage = (Passed Count / Total Count) * 100 = 1/4 * 100 = 25% Since the success percentage falls outside the specified range of 80 to 100, the rule is considered a Failure. |
Validate Credit Card Format (TC)
Name |
Validate Credit Card Format (TC) |
||||||||||||||||||||
Description |
The Validate Credit Card Format (TC) function is used to validate the format of values in a table column based on a specific pattern. It checks if the values match the pattern "XXXX XXXX XXXX XXXX" (e.g., four sets of four digits separated by spaces), and then calculates the percentage of values that match the format. |
||||||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||||||
Supported Object Type |
The Validate Credit Card Format (TC) function can be applied to table columns, allowing you to validate the format of credit card numbers within a dataset. |
||||||||||||||||||||
Supported Data Types |
The function supports a wide range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, and string. |
||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
||||||||||||||||||||
Input |
This input allows you to define the desired range within which the percentage of Credit Card values should fall. You need to specify the lower and upper ranges as numeric values. |
||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||
Example |
|||||||||||||||||||||
Consider a table column named "CREDITCARD" with the following values:
Suppose we set the input success range to be 80 to 100. Let's apply the Validate Credit Card Format (TC)function to the "CREDITCARD" column. By validating the format of each CREDIT CARD value, we observe the following:
The output for this example would be as follows:
In this example, two out of the four Credit Card values match the specified format, resulting in a success percentage of 50%. Percentage = (Passed Count / Total Count) * 100 = 2/4 * 100 = 50% Since the success percentage falls outside the specified range of 80 to 100, the rule is considered a Failure. |
Ends With Symbol (TC)
Name |
Ends With Symbol (TC) |
|||||||||||||||
Description |
The Ends With Symbol (TC) function is used to validate whether all the values in a table column end with a specified symbol. It checks if every value in the column satisfies this condition. |
|||||||||||||||
Formula |
The formula for this function is not applicable, as it does not involve any numerical calculations or percentages. |
|||||||||||||||
Supported Object Type |
The Ends With Symbol (TC) function can be applied to table columns, allowing you to validate the ending character of values within a specific column of a dataset. |
|||||||||||||||
Supported Data Types |
The function supports a wide range of data types that includes: int, decimal, varchar, double, float, String, and smallint. |
|||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
|||||||||||||||
Input |
The symbol entered in the "Success Value" field, represents the desired ending character (eg. @) |
|||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
|||||||||||||||
Example |
||||||||||||||||
Consider a table column named "EMAIL" with the following values:
Suppose we set the input success range to be 80 to 100. Let's apply the Ends With Symbol(TC) function to the "EMAIL" column with the symbol "@" as the desired ending character. By checking the ending character of each value, we observe the following:
The output for this example would be as follows:
Since all the values in a table column end with a specified symbol, the rule is considered as Successful. |
Average (TC)
Name |
Average (TC) |
||||||||||||||||||||
Description |
The Average (TC) function is used to determine the average of all the values in a table column. It calculates the mean value by summing up all the values in the column and dividing it by the total count of values. |
||||||||||||||||||||
Formula |
Average = Sum of all values / Total count of values |
||||||||||||||||||||
Supported Object Type |
The Average (TC) function can be applied to table columns, allowing you to calculate the average of numeric values within a specific column of a dataset. |
||||||||||||||||||||
Supported Data Types |
The function supports a wide range of data types that includes: int, bigint, number, long, decimal, double, float, smallint, and integer. |
||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
||||||||||||||||||||
Input |
This input allows you to define the desired range within which the average value should fall. You need to specify the lower and upper ranges as numeric values. |
||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. |
||||||||||||||||||||
Example |
|||||||||||||||||||||
Consider a table column named "SCORE" with the following numeric values:
Suppose we set the input success range to be 50 to 100. Let's apply the Average (TC) function to the "SCORE" column. By calculating the average of all the values, we can determine if it falls within the specified range. In this example, the average is computed as follows: Average = (60 + 20 + 50 + 80) / 10 = 210 / 4 = 52 Since the computed average value of 52 falls within the specified success range of 50 to 100, the rule is considered Successful. The output for this example would be as follows:
|
Distinct (TC)
Name |
Distinct (TC) |
||||||||||||||||||||
Description |
The Distinct (TC) function can be applied to table columns, allowing you to count the number of unique values within a specific column of a dataset. |
||||||||||||||||||||
Formula |
The statistical formula for calculating the distinct count is simply counting the number of unique values in the column. |
||||||||||||||||||||
Supported Object Type |
The Distinct (TC) function can be applied to table columns, allowing you to count the number of unique values within a specific column of a dataset. |
||||||||||||||||||||
Supported Data Types |
The function supports a wide range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, and timestamp. |
||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
||||||||||||||||||||
Input |
This input allows you to define the desired range within which the count of distinct values should fall. You need to specify the lower and upper ranges as numeric values. |
||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. |
||||||||||||||||||||
Example |
|||||||||||||||||||||
Consider a table column named "SALARY" with the following numeric values:
Suppose we set the input success range to be 5 to 10. Let's apply the Distinct(TC) function to the "SALARY" column. By counting the distinct or unique values, we can determine if it falls within the specified range. In this example, there are 3 unique values in the "SALARY" column: Distinct Count = 3 Since the computed distinct count value of 3 falls within the specified success range of 5 to 10, the rule is considered Successful. The output for this example would be as follows:
|
Min (TC)
Name |
Min (TC) |
||||||||||||||||||||||||
Description |
The Min (TC) function is used to determine the minimum value of a table column. It calculates the smallest value present in the column. |
||||||||||||||||||||||||
Formula |
The statistical formula for calculating the minimum value is finding the smallest value in the column. |
||||||||||||||||||||||||
Supported Object Type |
The Min (TC) function can be applied to table columns, allowing you to find the minimum value within a specific column of a dataset. |
||||||||||||||||||||||||
Supported Data Types |
The function supports a wide range of data types that includes: int, bigint, number, long, decimal, double, float, smallint, and integer. |
||||||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
||||||||||||||||||||||||
Input |
This input allows you to define the desired range within which the minimum value should fall. You need to specify the lower and upper ranges as numeric values. |
||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. |
||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||
Consider a table column named "DAILY_SALES" with the following values:
Suppose we set the input success range to 500 to 10000. Let's apply the Min (TC) function to the "DAILY_SALES" column. By finding the minimum value, we can determine if it falls within the specified range. In this example, the minimum value in the "DAILY_SALES" column is 250. Minimum Value = 250 Since the computed minimum value of 250 is outside the specified success range of 500 to 10000, the rule is considered Successful. The output for this example would be as follows:
|
Max (TC)
Name |
Max (TC) |
||||||||||||||||||||||||
Description |
The Max (TC) function is used to determine the maximum value in a table column (TC). It calculates the largest value present in the column. |
||||||||||||||||||||||||
Formula |
The statistical formula for calculating the maximum value is finding the largest value in the column. |
||||||||||||||||||||||||
Supported Object Type |
The Max (TC) function can be applied to table columns, allowing you to find the maximum value within a specific column of a dataset. |
||||||||||||||||||||||||
Supported Data Types |
The function supports a wide range of data types that includes: int, bigint, number, long, decimal, double, float, smallint, and integer. |
||||||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
||||||||||||||||||||||||
Input |
This input allows you to define the desired range within which the maximum value should fall. You need to specify the lower and upper ranges as numeric values. |
||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. |
||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||
Consider a table column named "DAILY_SALES" with the following values:
Suppose we set the input success range to 500 to 10000. Let's apply the Max (TC) function to this table column. By finding the maximum value, we can determine if it falls within the specified range. In this example, the maximum value in the column is 5000. Maximum Value = 5000 Since the computed maximum value of 5000 falls within the specified success range of 500 to 10000, the rule is considered Successful. The output for this example would be as follows:
|
Null Count Range (TC)
Name |
Null Count Range (TC) |
||||||||||||||||||||
Description |
The Null Count Range (TC) function is used to determine the count of null values in a table column (TC). It calculates the number of null values present in the column. |
||||||||||||||||||||
Formula |
The statistical formula for calculating the null count is simply counting the number of null values in the column. |
||||||||||||||||||||
Supported Object Type |
The Null Count Range(TC) function can be applied to table columns, allowing you to find the count of null values within a specific column of a dataset. |
||||||||||||||||||||
Supported Data Types |
The function supports a wide range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, and timestamp. |
||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
||||||||||||||||||||
Input |
This input allows you to define the desired range within which the null count value should fall. You need to specify the lower and upper ranges as numeric values. |
||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||
Example |
|||||||||||||||||||||
Consider a table column named "EMPLOYEEID" with the following numeric values:
Suppose we set the input success range to 0 to 10. Let's apply the Null Count Range(TC) function to this table column. By counting the null values, we can determine if the count falls within the specified range. In this example, the null count in the column is 3. Null Count = 3 Since the computed null count of 3 falls within the specified success range of 0 to 10, the rule is considered Successful. The output for this example would be as follows:
|
Std Deviation (TC)
Name |
Std Deviation (TC) |
||||||||||||||||||||
Description |
The Std Deviation (TC) function is used to determine the standard deviation of the values in a table column (TC). The standard deviation measures the variation or spread of data around the mean. A low standard deviation indicates that the data points are closely clustered around the average, while a high standard deviation indicates a larger spread between the data points and the average. |
||||||||||||||||||||
Formula |
The formula for calculating the standard deviation is as follows: Standard Deviation = sqrt( sum( (x - mean)^2 ) / N ) Where:
|
||||||||||||||||||||
Supported Object Type |
The Std Deviation(TC) function can be applied to table columns, allowing you to calculate the standard deviation of a specific column of a dataset. |
||||||||||||||||||||
Supported Data Types |
The function supports a wide range of data types that includes: int, bigint, number, long, decimal, double, float, smallint, and integer. |
||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
||||||||||||||||||||
Input |
This input allows you to define the desired range within which the standard deviation value should fall. You need to specify the lower and upper ranges as numeric values. |
||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. |
||||||||||||||||||||
Example |
|||||||||||||||||||||
Consider a table column named "SCORE" with the following numeric values:
Suppose we set the input success range to 0 to 500. Let's apply the Std Deviation (TC) function to this table column. By calculating the standard deviation of the values, we can determine if it falls within the specified range. In this example, the average of the dataset is 450, and the standard deviation is approximately 260. Standard Deviation ≈ 260 Let's assume the input success range specified is 0 to 500. Since the computed standard deviation of 260 falls within the specified range, the rule is considered Successful. The output for this example would be as follows:
Calculation: To calculate the standard deviation, we follow these steps: Calculate the mean (average) of the values: Mean = (600 + 200 + 200 + 800) / 4 = 450 Subtract the mean from each value and square the result:
Calculate the sum of the squared differences: Sum = 22,500 + 62,500 + 62,500 + 122,500 = 270,000 Divide the sum by the total count of values (N): Standard Deviation = sqrt(Sum / N) Note: sqrt indicates Square Root. In this case, N = 4 (total count of values), so: Standard Deviation = sqrt(270,000 / 4) = sqrt(67,500) ≈ 259.807 Therefore, the standard deviation of the "SCORE" column is approximately 260. In this example, the calculated standard deviation of approximately 260 indicates a relatively large spread in score values, as the values deviate significantly from the mean of $450. |
Sum (TC)
Name |
Sum (TC) |
||||||||||||||||||||||||
Description |
The Sum (TC) function is a rule that calculates the sum of the values in a column. It is commonly used in data analysis and validation processes to assess the total numerical value within a dataset. |
||||||||||||||||||||||||
Formula |
The formula for the Sum(TC) function is as follows: Sum = Σ (values) where Σ represents the summation symbol, and "values" refers to the numeric values in the table column. |
||||||||||||||||||||||||
Supported Object Type |
The Sum (TC) function can be applied to table columns, allowing you to calculate the sum of values within a specific column of a dataset. |
||||||||||||||||||||||||
Supported Data Types |
The function supports a wide range of data types that includes: int, bigint, number, long, decimal, double, float, smallint, and integer. |
||||||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
||||||||||||||||||||||||
Input |
This input allows you to define the desired range within which the sum value should fall. You need to specify the lower and upper ranges as numeric values. |
||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. |
||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||
Consider a table column named "SCORE" with the following numeric values:
Suppose we set the input success range to 100 to 200. Let's consider a table column called "SCORE" with the following values: [20, 40, 30, 25, 35]. We will apply the Sum (TC) function to calculate the sum of these values. Sum = 20 + 40 + 30 + 25 + 35 = 150 As the input success range specified is 100 to 200, and the computed sum is 150, the rule will be considered Successful since the sum falls within the specified range. The output for this example would be as follows:
|
Top Values (TC)
Name |
Top Values (TC) |
||||||||||||||||||||||||
Description |
The Top Values (TC) function is used to determine the top values in a column. It evaluates whether all the specified values exist within the computed top 50 values of the column. This function helps in assessing if certain values are among the highest in the dataset and can be used for validation or comparison purposes. |
||||||||||||||||||||||||
Formula |
The Top Values (TC) function compares the specified values with the computed top 50 values of the column to check for their presence. |
||||||||||||||||||||||||
Supported Object Type |
The Top Values (TC) function can be applied to table columns, allowing you to check if specific values are among the top values in a column of a dataset. |
||||||||||||||||||||||||
Supported Data Types |
The function supports a wide range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, and timestamp. |
||||||||||||||||||||||||
Supported Connectors |
RDBMS: MySQL, SQL Server, Oracle, PostgreSQL, Snowflake and more. |
||||||||||||||||||||||||
Input |
This input allows you to define the desired range within which the top value should fall. You need to specify the lower and upper ranges as numeric values. |
||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. |
||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||
Consider a table column named "SCORE" with the following numeric values:
Suppose we set the input success range to top value-98, top value-45, top value-78 Consider a table column named "SCORE" with the following top values: [100, 89, 78, 94, 99]. Let's assume we want to check if the values [98, 45, 78] exist among the top values of the column. By comparing the specified values with the computed top 50 values, we observe the following: Specified values: [98, 45, 78] Computed top values: [100, 99, 94, 89, 78] In this example, the specified values 98 and 45 are not present among the top values of the column. Therefore, the Top Values (TC) rule fails. The output for this example would be as follows:
|
Data Quality Functions for Files
File Name Contains (FL)
Name |
File Name Contains (FL) |
Description |
The File Name Contains (FL) function is used to determine if the specified input values exist in the file name. It checks whether all the specified values are present within the file name, allowing you to validate the presence of specific strings or patterns. |
Formula |
The File Name Contains (FL) function searches for the presence of the specified input values within the file name. |
Supported Object Type |
The File Name Contains (FL) function is applicable to files. It allows you to check if specific values are present within the file name. |
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
Input |
The File Name Contains (FL) function requires the following inputs: Alphanumeric Value: Enter one or more alphanumeric strings separated by commas. These strings represent the values you want to check for in the file name. Eg. filenamecontains-profile, filenamecontains-sheet |
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. |
Example |
|
Suppose we set the input value as filenamecontains-profile, filenamecontains-sheet Let's apply the File Name Contains (FL) function and check if the file names contain the strings "profile" and "sheet". File name: "Emp_profile.csv" Input strings: "profile", "sheet" In this example, the input string "profile" is found within the file name. Therefore, the File Name Contains(FL) rule is Successful. The output for this example would be as follows:
|
File Format Validation (FL)
Name |
File Format Validation (FL) |
Description |
The File Format Validation(FL) function is used to validate the format of a file uploaded into the OvalEdge application. It ensures that the file format matches the specified format requirements. Currently, the supported file formats are "csv", "json" and “parquet”. |
Formula |
The File Format Validation (FL) function checks if the file format of the uploaded file matches the specified format. |
Supported Object Type |
The File Format Validation (FL) function is applicable to files uploaded into the OvalEdge application. It allows you to validate the format of the uploaded files. |
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
Input |
Enter the desired file format that the uploaded file should adhere to. Eg. .csv, .json |
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. |
Example |
|
Suppose we set the input value as .csv, .json Let's consider a scenario where a user uploads a file named "test_profile.csv" into the OvalEdge application. The desired file format for this case is ".csv". Uploaded file: "test_profile.csv" Desired file format: ".csv" By comparing the file format of the uploaded file with the specified format, we can determine if the rule is successful: In this example, the file format of the uploaded file "test_profile.csv" matches the specified format ".csv". Therefore, the File Format Validation (FL) rule is Successful. The output for this example would be as follows:
|
File Created Date (FL)
Name |
File Created Date (FL) |
Description |
The File Created Date (FL) function is used to validate if the creation date of a file falls within a specified date range. This function helps determine whether the file was profiled or cataloged within the given timeframe. |
Formula |
The File Created Date (FL) function compares the creation date of the file with the specified lower and upper date range to check if it falls within that range. |
Supported Object Type |
The File Created Date (FL) function can be applied to files, allowing you to validate the creation date of a specific file. |
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
Input |
The File Created Date (FL) function requires the following inputs: Parameter 1: Lower date range - This parameter specifies the start date and time of the range. Parameter 2: Upper date range - This parameter specifies the end date and time of the range. |
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. |
Example |
|
Consider a file with the following details:
Let's assume we apply the File Created Date(FL) function to this file and specify the date range as follows:
By comparing the creation date of the file with the specified date range, we observe that the creation date, 2022-11-25 02:00:00 pm, falls within the range. Therefore, the rule is considered Successful. The output for this example would be as follows:
|
File Size Validation (FL)
Name |
File Size Validation (FL) |
Description |
The File Size Validation (FL) function is used to validate the size of a selected file. This function helps determine whether the file size falls within a specified range. |
Formula |
The File Size Validation (FL) function compares the size of the file with the specified lower and upper range to check if it is within that range. |
Supported Object Type |
The File Size Validation (FL) function can be applied to files, allowing you to validate the size of a specific file. |
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
Input |
The File Size Validation (FL) function requires the following inputs: Enter the Numeric values in the range: This input specifies the lower and upper range for the file size in bytes. |
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. |
Example |
|
Consider a file with the following details:
Let's assume we apply the File Size Validation(FL) function to this file and specify the size range as follows: Enter the Numeric values in the range: 1000000 to 1100000 (1 megabytes (MB) to 1.1 megabytes (MB)) By comparing the size of the file with the specified range, we observe that the file size, 1.03 MB, falls within the range. Therefore, the rule is considered Successful. The output for this example would be as follows:
|
File Row Validation (FL)
Name |
File Row Validation (FL) |
|||||||||||||||||||||||||||||||||
Description |
The File Row Validation (FL) function is used to validate the number of rows in a selected file. This function helps determine whether the total row count falls within a specified range. |
|||||||||||||||||||||||||||||||||
Formula |
The File Row Validation (FL) function compares the total row count of the file with the specified lower and upper range to check if it is within that range. |
|||||||||||||||||||||||||||||||||
Supported Object Type |
The File Row Validation (FL) function can be applied to files, allowing you to validate the number of rows in a specific f |
|||||||||||||||||||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
|||||||||||||||||||||||||||||||||
Input |
The File Row Validation (FL) function requires the following inputs: Enter the Numeric values in the range: This input specifies the lower and upper range for the total row count. |
|||||||||||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. |
|||||||||||||||||||||||||||||||||
Example |
||||||||||||||||||||||||||||||||||
Consider a file with the following details: The file name is "EMPLOYEE.csv" and contains the data specified below.
Total Row Count: 10 Let's assume we apply the File Row Validation (FL) function to this file and specify the row count range as follows:
By comparing the total row count of the file with the specified range, we observe that the row count, 10, falls within the range. Therefore, the rule is considered Successful. The output for this example would be as follows:
|
Data Quality Functions for File Columns
All Column Values Matches Given Fixed Value (FC)
Name |
All Column Values Matches Given Fixed Value (FC) |
|||||||||||||||||||||||||||||||||
Description |
The All Column Values Matches Given Fixed Value (FC) function is used to check if all the values in a file column match a specified fixed value. It evaluates each value in the column and compares it to the provided fixed value to determine if the condition is met. |
|||||||||||||||||||||||||||||||||
Formula |
Not Applicable |
|||||||||||||||||||||||||||||||||
Supported Object Type |
The All Column Values Matches Given Fixed Value (FC) function can be applied to file columns, allowing you to check if the values in a specific column of a file match a fixed value. |
|||||||||||||||||||||||||||||||||
Supported Data Types |
The function supports a range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, and timestamp. |
|||||||||||||||||||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
|||||||||||||||||||||||||||||||||
Input |
The All Column Values Matches Given Fixed Value (FC) function requires the following inputs: Fixed Value: Specify the value that all the values in the file column should match. This value is provided as a parameter in the function configuration. |
|||||||||||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
|||||||||||||||||||||||||||||||||
Example |
||||||||||||||||||||||||||||||||||
Let consider a file column “Name” with the following values: File Name: "EMPLOYEE.csv"
Let's assume that we apply the All Column Values Matches Given Fixed Value (FC) function to the "Name" column and specify the fixed value as "Johnson". By comparing each value in the column to the specified fixed value, we observe the following:
In this example, the values “Robert”, “Amelia”, “Lisa”, “Johnston”, “David”, “Johnathan”, “Olivia”, “Williamjohn”, and “Harper” do not match the specified fixed value of "Johnson". Therefore, the All Column Values Matches Given Fixed Value (FC) rule fails for these values. The output for this example would be as follows:
|
First Letter Of Word Upper Case And Rest Lower Case (FC)
Name |
First Letter Of Word Upper Case And Rest Lower Case (FC) |
||||||||||||||||||
Description |
The First Letter Of Word Upper Case And Rest Lower Case (FC) function is used to check if every word in a file column has its first letter in uppercase and the rest of the letters in lowercase. It evaluates each value in the column and verifies if the condition is met. |
||||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||||
Supported Object Type |
The First Letter Of Word Upper Case And Rest Lower Case (FC) function can be applied to file columns. It allows you to check if the values in a specific column of a file have their first letter in uppercase and the rest of the letters in lowercase. |
||||||||||||||||||
Supported Data Types |
The function supports a range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, and string. |
||||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
||||||||||||||||||
Input |
The First Letter Of Word Upper Case And Rest Lower Case (FC) function does not require any specific inputs. It operates solely based on the values in the file column. |
||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||
Example |
|||||||||||||||||||
Let consider a file column “Name” with the following values: File Name: "EMPLOYEE.csv"
Let's apply the First Letter Of Word Upper Case And Rest Lower Case (FC) function to the "Name" column. By evaluating each value in the column, we observe the following:
In this example, the values "robertW" and "ameliaF" do not have their first letter in uppercase and the rest of the letters in lowercase. Therefore, the First Letter Of Word Upper Case And Rest Lower Case (FC) rule fails for these values. The output for this example would be as follows:
|
This Column Value Should Be Unique (FC)
Name |
This Column Value Should Be Unique (FC) |
||||||||||||||||||
Description |
The This Column Value Should Be Unique(FC) function is used to check if all the values in a file column are unique, meaning there are no repeated values. It evaluates each value in the column and verifies if the condition is met. |
||||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||||
Supported Object Type |
The This Column Value Should Be Unique (FC) function can be applied to file columns. It allows you to check if the values in a specific column of a file are unique, with no repeated values. |
||||||||||||||||||
Supported Data Types |
The function supports a range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, and timestamp. |
||||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
||||||||||||||||||
Input |
The This Column Value Should Be Unique (FC) function does not require any specific inputs. It operates solely based on the values in the file column. |
||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||
Example |
|||||||||||||||||||
Let consider a file column “Name” with the following values: File Name: "EMPLOYEE.csv"
Let's apply the This Column Value Should Be Unique (FC) function to the "Name" column. By evaluating each value in the column, we observe the following:
In this example, the value "Johnson" is repeated 2 times, violating the uniqueness condition. Therefore, the This Column Value Should Be Unique (FC) rule fails for this value. The output for this example would be as follows:
|
Word Validation (FC)
Name |
Word Validation (FC) |
||||||||||||||||||||
Description |
The Word Validation(FC) function is used to check if the total number of spaces in the values of a file column meets certain criteria. It evaluates each value in the column and verifies if the conditions are satisfied. The conditions include the maximum total number of spaces allowed and the maximum number of spaces between any two words (if specified). |
||||||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||||||
Supported Object Type |
The Word Validation (FC) function can be applied to file columns. It allows you to validate the values in a specific column of a file based on the conditions defined. |
||||||||||||||||||||
Supported Data Types |
The function supports a range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, and string. |
||||||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
||||||||||||||||||||
Input |
The Word Validation (FC) function requires the following inputs:
|
||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||
Example |
|||||||||||||||||||||
Consider a file column named "NOTES" with the following string values: File Name: "EMPLOYEE.csv"
Let's apply the Word Validation(FC) function to the "Text" column and set the parameters as follows: Parameter 1: Maximum total number of spaces allowed = 10 Parameter 2: Maximum number of spaces between any two words = 1 By evaluating each value in the column, we observe the following:
In this example,
The output for this example would be as follows:
|
Number Validation (FC)
Name |
Number Validation (FC) |
||||||||||||||||||||||||
Description |
The Number Validation (FC) function is used to validate the format of values in a file column, specifically for integer or floating-point formats (int/float). The function checks if the values in the column match the specified format based on the provided parameters. The parameters include the number format (int/float), the floating-point decimal limit, the integer number of digits limit, and whether the values in the column contain commas or not. |
||||||||||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||||||||||
Supported Object Type |
The Number Validation (FC) function can be applied to file columns. It allows you to validate the format of values in a specific column of a file based on the specified number format and other criteria. |
||||||||||||||||||||||||
Supported Data Types |
The function supports a range of data types that includes: int, bigint, number, long, decimal, double, float, smallint, and integer. |
||||||||||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
||||||||||||||||||||||||
Input |
The Number Validation (FC) function requires the following inputs:
|
||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||
Consider a file column named "SCORE" with the following numeric values: File Name: "EMPLOYEE.csv"
Let's apply the Number Validation (FC) function to the "SCORE" column with the following parameters: Parameter 1: Number format (int/float) = int Parameter 3: Integer number of digits limit = 2 By evaluating each value in the column, we observe the following:
In this example, all values in the "SCORE" column are in the integer format and have a maximum of two digits, which matches the specified criteria. Therefore, the Number Validation (FC) rule passes, indicating that all the values in the file column respect the condition. |
Date Validation (FC)
Name |
Date Validation (FC) |
||||||||||||||||||||||||
Description |
The Date Validation (FC) function is used to validate the format and range of dates in a file column. The function checks if the date format in the column values matches the specified format and if the year of each date falls within the specified range. The format of the dates and the range of years are provided as parameters. |
||||||||||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||||||||||
Supported Object Type |
The Date Validation (FC) function can be applied to file columns. It allows you to validate the format and range of dates in a specific column of a file. |
||||||||||||||||||||||||
Supported Data Types |
The function supports a range of data types that includes: date, datetime, and timestamp data types. |
||||||||||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
||||||||||||||||||||||||
Input |
The Date Validation (FC) function requires the following inputs:
|
||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||
Consider a file column named "EMP_JOINING_DATE" with the following numeric values: File Name: "EMPLOYEE.csv"
Let's apply the Date Validation (FC) function to the "EMP_JOINING_DATE" column with the following parameters:
By validating each date in the column, we observe the following:
In this example, all the dates in the "EMP_JOINING_DATE" column match the specified format of YYYY and fall within the range of 2011 to 2014. Therefore, the Date Validation (FC) rule is successful, indicating that all the values in the file column meet the format and range criteria. The output for this example would be as follows:
|
Numeric Value Between Min And Max Values (FC)
Name |
Numeric Value Between Min And Max Values (FC) |
||||||||||||||||||||||||
Description |
The Numeric Value Between Min And Max Values (FC) function is used to validate whether the values in a file column fall within a specified range. The function checks if each value in the column is greater than or equal to the specified minimum value and less than or equal to the specified maximum value. |
||||||||||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||||||||||
Supported Object Type |
The Numeric Value Between Min And Max Values (FC) function can be applied to file columns. It allows you to validate whether the values in a specific column of a file fall within a specified range. |
||||||||||||||||||||||||
Supported Data Types |
The function supports a range of data types that includes: integer (int, bigint, smallint, integer) and floating-point (number, long, decimal, double, float) data types. |
||||||||||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
||||||||||||||||||||||||
Input |
The Numeric Value Between Min And Max Values(FC) function requires the following inputs:
|
||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||
Consider a file column named "SCORE" with the following numeric values: File Name: "EMPLOYEE.csv"
Let's apply the Numeric Value Between Min And Max Values (FC) function to the "Values" column with the following parameters:
By validating each value in the column, we observe the following:
In this example, all the values in the "Values" column fall within the specified range of 1 to 9. Therefore, the Numeric Value Between Min And Max Values (FC) rule is successful, indicating that all the values in the file column meet the range criteria. The output for this example would be as follows:
|
Value Validation (FC)
Name |
Value Validation (FC) |
||||||||||||||||||||||||
Description |
The Value Validation (FC) function is used to validate whether all values in a file column are the same. The function checks if every value in the column matches a specific value. |
||||||||||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||||||||||
Supported Object Type |
The Value Validation (FC) function can be applied to file columns. It allows you to validate whether all values in a specific column of a file are the same. |
||||||||||||||||||||||||
Supported Data Types |
The function supports a range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, date, datetime, and timestamp. |
||||||||||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
||||||||||||||||||||||||
Input |
The Value Validation (FC) function does not require any specific inputs. It compares each value in the file column to check if they are identical. |
||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||
Consider a file column named "STATUS" with the following numeric values: File Name: "EMPLOYEE.csv"
Let's apply the Value Validation (FC) function to the "STATUS" column. Since all the values in the column are identical and match the value "Permanent", the function would result in a success. The output for this example would be:
|
Allow Only Specified Special Character (FC)
Name |
Allow Only Specified Special Character (FC) |
||||||||||||||||||||||||
Description |
The Allow Only Specified Special Character (FC) function is used to validate if the values in a file column contain only one type of special character specified in the Input Parameter. The function checks if the column values consist of only the specified special character. |
||||||||||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||||||||||
Supported Object Type |
The Allow Only Specified Special Character (FC) function can be applied to file columns. It allows you to validate if the values in a specific column of a file contain only one type of special character. |
||||||||||||||||||||||||
Supported Data Types |
The function supports a range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, date, datetime, and timestamp. |
||||||||||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
||||||||||||||||||||||||
Input |
The Allow Only Specified Special Character (FC) function requires the following input:
|
||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||
Consider a file column named "EMAIL" with the following numeric values: File Name: "EMPLOYEE.csv"
Let's apply the Allow Only Specified Special Character (FC) function to the "EMAIL" column with the special character "@" specified in "Parameter 1". Since all the values in the column contain only the "@" special character, the function would result in a success. The output for this example would be:
|
Has Length Of The Value Within Range (FC)
Name |
Has Length Of The Value Within Range (FC) |
||||||||||||||||||||||||
Description |
The Has Length Of The Value Within Range (FC) function is used to validate if the length of characters in the column values falls within the specified range. It checks if the number of characters in each value of the file column is within the specified minimum and maximum range. |
||||||||||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||||||||||
Supported Object Type |
The Has Length Of The Value Within Range (FC) function can be applied to file columns. It allows you to validate if the length of characters in a specific column of a file falls within a specified range. |
||||||||||||||||||||||||
Supported Data Types |
The function supports a range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, and timestamp. |
||||||||||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
||||||||||||||||||||||||
Input |
The Has Length Of The Value Within Range (FC) function requires the following input:
|
||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||
Consider a file column named "CITY" with the following numeric values: File Name: "EMPLOYEE.csv"
Let's apply the Has Length Of The Value Within Range (FC) function to the "CITY" column with a minimum length of 4 characters and a maximum length of 10 characters specified in "Parameter 1" and "Parameter 2" respectively. Since all the city names have a length between 4 and 15 characters, the function would result in success. The output for this example would be:
|
Columns Unique Percent Value (FC)
Name |
Columns Unique Percent Value (FC) |
||||||||||||||||||||||||
Description |
The Columns Unique Percent Value (FC) function is used to check if all the values in a file column are unique, meaning there are no repeated values. It determines the uniqueness percentage of the column values. |
||||||||||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||||||||||
Supported Object Type |
The Columns Unique Percent Value (FC) function can be applied to file columns. It allows you to validate if the values in a specific column of a file are unique. |
||||||||||||||||||||||||
Supported Data Types |
The function supports a range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, and timestamp. |
||||||||||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
||||||||||||||||||||||||
Input |
The Columns Unique Percent Value (FC) function does not require any specific input. |
||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||
Consider a file column named "DEPARTMENT" with the following numeric values: File Name: "EMPLOYEE.csv"
Let's apply the Columns Unique Percent Value (FC) function to the "DEPARTMENT" column. Since the value "IT" is repeated four times, the function would result in failure. The output for this example would be:
|
Outliers To Aggregations (FC)
Name |
Outliers To Aggregations (FC) |
||||||||||||||||||||||||
Description |
The Outliers To Aggregations (FC) function is used to evaluate whether the deviation of a column value from the average of all values in the column is less than a specified percentage. Its purpose is to identify outliers within a column based on their deviation from the average. |
||||||||||||||||||||||||
Formula |
The formula for computing the deviation percentage is as follows: DEV % = (ABS(COLUMN VALUE - AVERAGE) / AVERAGE) * 100 |
||||||||||||||||||||||||
Supported Object Type |
The Outliers To Aggregations (FC) function can be applied to file columns. It allows you to identify and evaluate outliers within a file column based on their deviation from the average. |
||||||||||||||||||||||||
Supported Data Types |
The function supports a range of data types that includes: int, bigint, number, long, decimal, double, float, smallint, integer. |
||||||||||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
||||||||||||||||||||||||
Input |
The Outliers To Aggregations (FC) function requires the following input:
|
||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||
Consider a file column named "SCORE" with the following numeric values: File Name: "EMPLOYEE.csv"
Let's apply the "Outliers To Aggregations(FC)" function. We have a file column with the following values: [5, 10, 15, 20, 25]. The average of these values is 15. Suppose we set the parameter values as follows:
We can calculate the deviation percentage for each value in the column:
In this case, the two deviation percentages are not within the specified percentage range of 50%. Therefore, the rule is considered Failure. The output for this example would be:
|
Has No Null Value (FC)
Name |
Has No Null Value (FC) |
||||||||||||||||||||||||
Description |
The Has No Null Value (FC) function is used to determine whether a file column contains any null values. Its purpose is to validate if there are no null values present in the column. |
||||||||||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||||||||||
Supported Object Type |
The Has No Null Value (FC) function can be applied to file columns. It allows you to check for the presence of null values within a file column. |
||||||||||||||||||||||||
Supported Data Types |
The function supports a range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, timestamp. |
||||||||||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
||||||||||||||||||||||||
Input |
The Has No Null Value (FC) function requires no specific input parameters. It examines the file column to determine if there are any null values present. |
||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||
Consider a file column named "SCORE" with the following numeric values: File Name: "EMPLOYEE.csv"
In this case, there is a null value present after the value 10 in the file column. Therefore, the "Has No Null Value (FC)" rule has failed. The output for this example would be:
|
Has No Digits or Special Chars (FC)
Name |
Has No Digits or Special Chars (FC) |
||||||||||||||||
Description |
The Has No Digits or Special Chars (FC) function is used to determine whether a file column contains any special characters or digits. Its purpose is to validate if the column contains only alphabetic characters and does not have any special characters or digits. |
||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||
Supported Object Type |
The Has No Digits or Special Chars (FC) function can be applied to file columns. It allows you to check for the presence of special characters or digits within a file column. |
||||||||||||||||
Supported Data Types |
The function supports a range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, timestamp. |
||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
||||||||||||||||
Input |
The Has No Digits or Special Chars (FC) function requires no specific input parameters. It examines the file column to determine if there are any special characters or digits present. |
||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||
Example |
|||||||||||||||||
Consider a file column named "ADDRESS" with the following numeric values: File Name: "EMPLOYEE.csv"
In this case, the value in the file column contains numbers and special characters. Therefore, the "Has No Digits or Special Chars (FC)" rule has failed. The output for this example would be:
|
Has Digits and Special Chars (FC)
Name |
Has Digits and Special Chars (FC) |
||||||||||||||||
Description |
The Has Digits and Special Chars (FC) function is used to determine whether all the values in a file column contain both digits and special characters. Its purpose is to validate if the column values consist of a combination of digits and special characters. |
||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||
Supported Object Type |
The Has Digits and Special Chars (FC) function can be applied to file columns. It allows you to check if all values in a file column contain both digits and special characters. |
||||||||||||||||
Supported Data Types |
The function supports a range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, timestamp. |
||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
||||||||||||||||
Input |
The Has Digits and Special Chars (FC) function requires no specific input parameters. It examines the file column values to determine if they consist of a combination of digits and special characters. |
||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||
Example |
|||||||||||||||||
Consider a file column named "ADDRESS" with the following numeric values: File Name: "EMPLOYEE.csv"
In this case, the values in the "ADDRESS" column contain both digits and special characters. Therefore, the "Has Digits and Special Chars (FC)" rule is considered successful. The output for this example would be:
|
No Leading or Trailing Spaces (FC)
Name |
No Leading or Trailing Spaces (FC) |
||||||||||||||||
Description |
The No Leading or Trailing Spaces (FC) function is used to determine whether there are any leading or trailing spaces in a file column. Its purpose is to check if the column values have any spaces at the beginning or end. |
||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||
Supported Object Type |
The No Leading or Trailing Spaces (FC) function can be applied to file columns. It allows you to validate if the column values have any leading or trailing spaces. |
||||||||||||||||
Supported Data Types |
The function supports a range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, timestamp. |
||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
||||||||||||||||
Input |
The No Leading or Trailing Spaces (FC) function requires no specific input parameters. It examines the file column values to determine if they contain any leading or trailing spaces. |
||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||
Example |
|||||||||||||||||
Consider a file column named "ADDRESS" with the following numeric values: File Name: "EMPLOYEE.csv"
In this case, the values in the "ADDRESS" column contain a leading space at the beginning. Therefore, the "No Leading or Trailing Spaces (FC)" rule is considered a failure. The output for this example would be:
|
Has Substring in String (FC)
Name |
Has Substring in String (FC) |
||||||||||||||||
Description |
The Has Substring in String (FC) function is used to determine whether a specified substring is present within the values of a file column. Its purpose is to check if the substring exists within the column values. |
||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||
Supported Object Type |
The Has Substring in String (FC) function can be applied to file columns. It allows you to validate if a specific substring is present within the column values. |
||||||||||||||||
Supported Data Types |
The function supports a range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, timestamp. |
||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
||||||||||||||||
Input |
The Has Substring in String (FC) function requires the following input:
|
||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||
Example |
|||||||||||||||||
Consider a file column named "EMAIL" with the following numeric values: File Name: "EMPLOYEE.csv"
Suppose we want to check if the substring ".com" is present in the column values. In this case, the substring ".com" is found in the value "johndoe@example.com". Therefore, the "Has Substring in String (FC)" rule is considered successful. The output for this example would be:
|
Belongs To One Of Languages (FC)
Name |
Belongs To One Of Languages (FC) |
||||||||||||||||
Description |
The Belongs To One Of Languages (FC) function is used to determine whether the values of a file column belong to a specified list of languages. Its purpose is to check if the column values are within the provided languages. |
||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||
Supported Object Type |
The Belongs To One Of Languages (FC) function can be applied to file columns. It allows you to validate if the values of a column are within the provided list of languages. |
||||||||||||||||
Supported Data Types |
The function supports a range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, timestamp. |
||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
||||||||||||||||
Input |
The Belongs To One Of Languages (FC) function requires the following input:
|
||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||
Example |
|||||||||||||||||
Consider a file column named "LANGUAGE" with the following numeric values: File Name: "EMPLOYEE.csv"
Suppose we specify "English" as the list of languages in Parameter 1. In this case, the column values contain two languages other than "English" (Spanish and French). Therefore, the "Belongs To One Of Languages (FC)" rule is considered a failure. The output for this example would be:
|
Word Count Within Range (FC)
Name |
Word Count Within Range (FC) |
||||||||||||||||
Description |
The Word Count Within Range (FC) function is used to check if the number of words in a file column falls within a specified range. Its purpose is to determine if the column values have a word count within the user-defined range. |
||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||
Supported Object Type |
The Word Count Within Range (FC) function can be applied to file columns. It allows you to validate if the number of words in a column falls within the specified range. |
||||||||||||||||
Supported Data Types |
The function supports a range of data types that includes: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string. |
||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
||||||||||||||||
Input |
The Word Count Within Range (FC) function requires the following input:
|
||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||
Example |
|||||||||||||||||
Consider a file column named "NEAREST_HOSPITAL" with the following numeric values: File Name: "EMPLOYEE.csv"
Suppose we set the minimum count (Parameter 1 = 3) and the maximum count (Parameter 2 = 4). In this case, the first two values have a word count within the specified range (3-4 words), but the third value "Nursing Home" has only two words. Therefore, the "Word Count Within Range (FC)" rule is considered a failure. The output for this example would be:
|
Column Value Same As Other Column Value (FC)
Name |
Column Value Same As Other Column Value (FC) |
||||||||||||||||||||
Description |
The Column Value Same As Other Column Value (FC) function is used to check if the values of a selected column match the values in another specified column. Its purpose is to determine if the selected column values are identical to the values in the specified column. |
||||||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||||||
Supported Object Type |
The Column Value Same As Other Column Value (FC) function can be applied to file columns. It allows you to compare the values of one column with the values in another specified column. |
||||||||||||||||||||
Supported Data Types |
The function supports a wide range of data types, including: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, timestamp. |
||||||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
||||||||||||||||||||
Input |
The Column Value Same As Other Column Value (FC) function requires the following input:
|
||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||
Example |
|||||||||||||||||||||
Consider a file columns named "Q1_SCORE" and "Q2_SCORE" with the following numeric values: File Name: "EMPLOYEE.csv"
Suppose we set "Q2_SCORE" as Parameter 1 to compare against the values of "Q1_SCORE". In this case, all the values in "Q1_SCORE" match the corresponding values in "Q2_SCORE" except for the last two values. Therefore, the "Column Value Same As Other Column Value (FC)" rule is considered a failure. The output for this example would be:
|
Regex Match (FC)
Name |
Regex Match (FC) |
||||||||||||||||||||||||||||||||||||||||||||
Description |
The Regex Match (FC) function is used to verify if all values in a column match a specified regular expression pattern. It checks whether the values in the column adhere to the provided regex format and determines the success or failure of the rule. |
||||||||||||||||||||||||||||||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||||||||||||||||||||||||||||||
Supported Object Type |
The Regex Match (FC) function can be applied to file columns. It allows you to validate the values in a column against a specified regex pattern. |
||||||||||||||||||||||||||||||||||||||||||||
Supported Data Types |
The function supports a variety of data types, including: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, timestamp. |
||||||||||||||||||||||||||||||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
||||||||||||||||||||||||||||||||||||||||||||
Input |
The Regex Match (FC) function requires the following input:
|
||||||||||||||||||||||||||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||||||||||||||||||||||||||
Example |
|||||||||||||||||||||||||||||||||||||||||||||
Let's consider a file column "NOTES" containing a set of values on which the Regex Match rule is applied.
The regular expression pattern ^[A-Za-z0-9]*$ can be broken down as follows:
Putting it all together, the pattern ^[A-Za-z0-9]*$ will match any string that consists of zero or more uppercase letters, lowercase letters, or digits. It will not match strings containing other characters or special symbols. Example: "Hello" or “HELLO” or “Hello123”This input string consists of only uppercase letters, lowercase letters, and digits. It matches the pattern because it contains zero or more characters that fall within the range specified by the character class [A-Za-z0-9]. Example: "889976545"This input string consists of only digits. It matches the pattern because it contains zero or more characters that fall within the range specified by the character class [A-Za-z0-9]. Example: "Hello OvalEdge!" or “Hello 234!” or “HELLO-234&”This input string contains a space, dash, and an exclamation mark, which are not included in the character class [A-Za-z0-9]. It does not match the pattern because it fails to meet the requirement of having only uppercase letters, lowercase letters, and digits. Example: ""This input string is an empty string. It matches the pattern because it contains zero characters, which satisfies the condition of zero or more characters specified by the * quantifier. The output for this example would be:
|
Column Value Matches One Of Given Values (FC)
Name |
Column Value Matches One Of Given Values (FC) |
||||||||||||||||||||
Description |
The Column Value Matches One Of Given Values (FC) function is used to determine if any value in the selected column matches with the value(s) provided in Parameter 1. It evaluates whether any of the values in the column match the specified value(s) and determines the success or failure of the rule. |
||||||||||||||||||||
Formula |
Not Applicable |
||||||||||||||||||||
Supported Object Type |
The Column Value Matches One Of Given Values(FC) function can be applied to file columns. It allows you to compare the values in a column against the provided value(s). |
||||||||||||||||||||
Supported Data Types |
The function supports a variety of data types, including: character, char, ntext, tinytext, text, mediumtext, longtext, nchar, varchar, nvarchar, varchar2, nvarchar2, string, int, bigint, number, long, decimal, double, float, smallint, integer, date, datetime, timestamp. |
||||||||||||||||||||
Supported Connectors |
Files: S3, NFS, and Azure Data Lake. |
||||||||||||||||||||
Input |
The Column Value Matches One Of Given Values(FC) function requires the following inputs:
|
||||||||||||||||||||
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
||||||||||||||||||||
Example |
|||||||||||||||||||||
Consider a file columns named "Q1_SCORE" and "Q2_SCORE" with the following numeric values: File Name: "EMPLOYEE.csv"
Suppose we set the input parameters as follows:
In this case, the value "Doe" from Parameter 1 matches with the value "Doe" in the column, and the value "Warner" from Parameter 1 matches with the value "Warner" in the column. Therefore, the "Column Value Matches One Of Given Values (FC)" rule is considered successful. The output for this example would be:
|
Data Quality Functions for Query
Sql Exact Value (SQL)
Name |
Sql Exact Value (SQL) |
Description |
The Sql Exact Value(SQL) function is used to determine the success or failure based on the return value of an associated SQL code object in the source system. If the SQL code object returns a value of 1 after execution, the rule is considered successful. Otherwise, it is considered a failure. |
Formula |
Not Applicable |
Supported Object Type |
The Sql Exact Value(SQL) function is applicable to queries. It allows you to execute SQL code and evaluate the return value. |
Supported Connectors |
RDBMS: MySQL, SQL Server, PostgreSQL, Snowflake and more. |
Input |
Not Applicable |
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
Example |
|
Consider an SQL code object that checks if a specific condition is met in a table. The SQL code is as follows: SQL Code: SELECT COUNT(*) FROM employees WHERE salary > 50000; The objective is to determine if the number of employees with a salary greater than 50,000 is 1 or 0. Based on this, the Sql Exact Value(SQL) rule will evaluate the success or failure of the rule. Let's assume that the SQL code object returns a value of 1. In this case, the Sql Exact Value(SQL) rule will be considered successful. On the other hand, if the SQL code object returns a value of 0, the rule will be considered a failure. The output of the Sql Exact Value(SQL) rule will provide the following information:
For example, let's say the rule was executed 10 times. Out of these, the SQL code object returned a value of 1 in 8 instances and a value of 0 in 2 instances. Output:
In this case, the rule would be considered a failure because the SQL code object returned a value of 0 in 2 instances, indicating that the condition (salary > 50000) was not met for those employees. |
Custom Sql Exact Value (SP)
Name |
Custom Sql Exact Value (SP) |
Description |
The Custom Sql Exact Value (SP) function determines the success or failure based on the return value of an associated code object (stored procedure) in the source system. If the code object returns a value of 1 after execution, the rule is considered successful. Otherwise, it is considered a failure. |
Formula |
Not Applicable |
Supported Object Type |
The Custom Sql Exact Value (SP) function is applicable to queries. It allows you to execute code objects (stored procedure) and evaluate the return value. |
Supported Connectors |
RDBMS: MySQL, SQL Server, PostgreSQL, Snowflake and more. |
Input |
The Custom Sql Exact Value (SP) function requires the following inputs:
Example: Input,Salary,140000.4500,DECIMAL;Input,Age,14,INT;OUTPUT,ReturnSalary,INT; |
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
Example |
|
Consider a stored procedure that performs certain calculations and returns a value of 1 or 0 based on specific conditions. The objective is to evaluate the success or failure of the rule based on the return value of the stored procedure. Let's assume the stored procedure is designed to return a value of 1 if certain conditions are met and 0 otherwise. In this case, if the stored procedure returns a value of 1, the Custom Sql Exact Value (SP) rule will be considered successful. Conversely, if the stored procedure returns a value of 0, the rule will be considered a failure. The output of the Custom Sql Exact Value (SP) rule will provide the following information:
For example, let's say the rule was executed 10 times. Out of these, the stored procedure returned a value of 1 in 8 instances and a value of 0 in 2 instances. Output:
In this case, the rule would be considered a failure because the stored procedure returned a value of 0 in 2 instances, indicating that the specified conditions were not met. |
Sql Value Contains (SQL)
Name |
Sql Value Contains (SQL) |
Description |
The Sql Value Contains (SQL) rule determines the success or failure based on the return value of an associated code object (query) in the source system. The rule is considered successful if the code object returns a value that exists in the specified list of values after execution. Otherwise, it is considered a failure. |
Formula |
Not Applicable |
Supported Object Type |
The Sql Value Contains (SQL) function is applicable to queries. It allows you to execute SQL code and evaluate the return value. |
Supported Connectors |
RDBMS: MySQL, SQL Server, PostgreSQL, Snowflake and more. |
Input |
The Sql Value Contains (SQL function requires the following inputs:
|
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
Example |
|
Consider a scenario where a code object (query) is executed and returns a specific value. Let's assume the code object returns a value of "B". We have a list of values [A, B, C, D, E]. In this case, the Sql Value Contains (SQL) rule will be considered successful because the returned value "B" exists in the list. Output:
|
Sql Value Range (SQL)
Name |
Sql Value Range (SQL) |
Description |
The Sql Value Range (SQL) rule determines the success or failure based on the return value of an associated SQL code object in the source system. The rule is considered successful if the code object returns a value that falls within the specified range of values after execution. Otherwise, it is considered a failure. |
Formula |
Not Applicable |
Supported Object Type |
The Sql Value Range (SQL) function is applicable to queries. It allows you to execute SQL code and evaluate the return value. |
Supported Connectors |
RDBMS: MySQL, SQL Server, PostgreSQL, Snowflake and more. |
Input |
The Sql Value Range (SQL function requires the following inputs:
|
Output |
The output of the function can be one of the following: Success or Failure or Undetermined or Not Executed or Execution Failed. On Failure, the rule shows the Passed Count, Failed Count, Total Count, and Failed Values. |
Example |
|
Let's consider the following SQL query as the associated code object: SQL Code: SELECT COUNT(*) FROM employees WHERE salary > 50; In this example, the SQL code object counts the number of employees whose salary is greater than 50. To apply the Sql Value Range (SQL) rule, we define the input values as follows:
Based on the specified range, the Sql Value Range (SQL) rule will evaluate the success or failure of the rule. Suppose the query returns a value of 75. In this case, the Sql Value Range (SQL) rule will be considered successful because the returned value falls within the specified range of 0 to 100. Output:
In this example, the rule passes as the returned value of 75 falls within the specified range. |
Copyright © 2023, OvalEdge LLC, Peachtree Corners GA USA