We have various Data Quality Rule types for the following:
- Tables
- Table Columns
- File Columns
Data Quality Rule types for Tables
Rule Type |
Parameter |
DENSITY PERCENT: Returns TRUE if the computed density percentage falls in the range else FALSE |
Param-1: Lower Range Param-2: Upper Range |
NULL DENSITY PERCENT: Returns TRUE if the computed null density percentage falls in the range else FALSE |
Param-1: Lower Range Param-2: Upper Range |
TOTAL ROW COUNT: Returns TRUE if the total row count in table falls in the range else FALSE |
Param-1: Lower Range Param-2: Upper Range |
Data Quality Rule types for Table Columns
Rule Type |
Parameter |
AVERAGE: Returns TRUE if the average computed falls in the range else FALSE |
Param-1: Lower Range Param-2: Upper Range |
DENSITY PERCENT: Returns TRUE if the density computed falls in the range else FALSE |
Param-1: Lower Range Param-2: Upper Range |
DISTINCT: Returns TRUE if the count of distinct values falls in the range; else FALSE |
Param-1: Lower Range Param-2: Upper Range |
MAX: Returns TRUE if the min value falls in the range; else FALSE |
Param-1: Lower Range Param-2: Upper Range |
MIN: Returns TRUE if the max value falls in the range; else FALSE |
Param-1: Lower Range Param-2: Upper Range |
NULL COUNT: Returns TRUE if the count of null values falls in the range; else FALSE |
Param-1: Lower Range Param-2: Upper Range |
NULL DENSITY PERCENT: Returns TRUE if the computed null density falls in the percentage range; else FALSE |
Param-1: Lower Range Param-2: Upper Range |
STD DEVIATION: Returns TRUE if the computed standard deviation falls in the range; else FALSE |
Param-1: Lower Range Param-2: Upper Range |
SUM: Returns TRUE if the sum of all cells computed falls in the range; else FALSE |
Param-1: Lower Range Param-2: Upper Range |
TOP VALUES: Returns TRUE if the top values of column matches with at least one value in param 1; else FALSE |
Param-1: Comma separated values that user is expecting to be in top values |
VALIDATE EMAIL: Returns TRUE if format of the elements in the column is “%_@_%._%”, else FALSE |
No Parameter |
VALIDATE PATTERN: Returns TRUE if the defined pattern (U,L,D or combination of any) matches with the column pattern; else FALSE |
Param-1: Pattern Format |
VALIDATE REGEX: Returns TRUE if the defined REGEX matches with the column values; else FALSE |
Param-1: REGEX |
Data Quality Rule types for File columns
Rule Type |
Parameter |
HAS_NOT_NULL:(Has no null value) returns true if there is no null value; else it returns false. |
No Parameter |
HAS_NO_DIGITS_OR_SPECIAL_CHARS:(Has No digits or special chars) returns true if there are no special characters or any digits else returns false. |
No Parameter. |
HAS_FIRST_AND_LAST_CHAR:(No leading or trailing spaces) returns true if there are no leading or trailing spaces, else it returns false. |
No Parameter |
HAS_SUBSTRING_IN_STRING:(Has substring in string) returns true if the substring value is present in the given string; else it returns false. |
Param1: it contains a substring value. |
HAS_LANGUAGE:(Belongs to one of the languages) returns true if the language of the column matches the provided language passed in parameter1; else returns false. |
Param1: it contains the language to be specified in the column data. |
HAS_WORD_RANGE:(Word count within range)rule returns true if the number of words present in the column data is in the range given by the user; otherwise returns false. |
Param-1: Minimum number of words Param-2: Maximum number of words |
HAS_EXACT_VALUE:(Column Value same as another Column Value) returns true if the value of the column given in parameter1 matches with the value given in the column value on which we are running the rule; else it returns false. |
Param1: “column name” to be matched. |
HAS_REGEX_MATCH:(Regex match) returns true if the data in the column matches with the regex given in the parameter1 else it returns false. |
Param1: Regex to be matched. |
HAS_COLUMN_VALUE_MATCH:(Column Value matches one of given values) returns true if the value of the given column matches with any one of the values of the parameter1 else it returns false. |
Param1: It contains the values to be matched with a given column value. Param2: Separator. |
HAS_COLUMN_MATCH_FIXED_VALUE :(All Column Values matches given fixed value) returns true if the value in the column matches with the fixed value given in the parameter1 else returns false. |
Param1: Value to be matched. |
HAS_FIRST_UPPER_REST_LOWER:(First Letter of Word Upper case and Rest Lower Case) returns true if the first letter of the word is in uppercase and the rest of the word is in lowercase; else it returns false. |
No Parameter |
HAS_UNIQUE_VALUE:(This Column Value Should Be Unique) returns true if the value in the column is unique i.e. (No duplicate value) else it returns false |
No Parameter |
WORD_VALIDATION:(Word Validation) returns true if the total number of spaces in the column data are less than or equal to the value given in the param1 else it returns false. If we want to limit the number of spaces between words, we need to pass the limit in param2. If the no spaces count between words exceeds the limit it returns false else it returns true. |
Param-1: Total no of spaces allowed. Param-2: Total no of spaces allowed between words. |
NUMBER_VALIDATION:(Number Validation) returns true if the data type in the column(int/float) value matches the parameter1 value which we are passing, otherwise it returns false. If column data type is float and we want to limit the decimal places then we pass decimal limits in parameter2.if decimals exceed the param2 limit then it returns false else true. if column data type is int and we want to limit the number of digits then we pass decimal limits in parameter3. If no of digits exceed the limit in param3 returns false, else true. If we do not want any “commas” in the column data we should pass: “N” else we should pass “T” in param4. If we pass “Y” and we have commas in column data returns true, else returns false. If we pass “N” in param4 and we have no commas in the column data, it returns true, else returns false. |
param1: int/float param2: floating point decimals limit param3: integer number of digits limit param4:” Y”/” N” (if user wants commas, then “Y” else “N”) |
DATE_VALIDATION: (Date Validation) returns true if the date format in values matches with the date format given in parameter1; else returns false. If we want to limit the years between the range, we need to pass the minimum year in param2 and maximum year in param3. If the year in column data is in between the minimum and maximum year specified by the user then it returns true, else returns false. (Exclusive values). |
param1- which contains the format. param-2 which contains the minimum year. param-3 which contains the maximum year. |
HAS_DIGITS_AND_SPECIAL_CHARS:(Has digits and special chars) returns true if the value in the column is alphanumeric and it should contain digits and special characters; else it returns false. There is no need for passing parameters. |
No Parameter |
HAS_CONSTANT_VALUE: (Value Validation) returns true if the column has the same value in all the rows i.e. (constant value) then it returns false. There is no need for passing parameters. |
No Parameter |
HAS_ONLY_ONE_SPECIAL_CHAR:(Allow only one special Character) returns true if the column values have only one type of special character which passes in parameter1 else returns false. We must pass the special character in param1 which we want to be in column data. |
Param1: special character allowed. |
HAS_LENGTH_RANGE:(Has length of the value within range) returns true if the character in the value is within the range; else returns false. We must pass the minimum value range in param1 and maximum value range in param2. |
Param1: minimum no of characters allowed. Param2: maximum no of characters allowed. |
HAS_NUMERIC_VALUE_RANGE :(Numeric Value between min and max) returns true if the number value is within the range, else returns false. We must pass the minimum value range in param1 and maximum value range in param2. |
Param1: minimum number allowed. Param2: maximum number allowed. |