AI for Data Classification

Introduction to Data Classification

Summary

Data classification is the process of consistently categorizing data against specific, predefined criteria so that the data can be protected efficiently and effectively. It also helps categorize the data objects and fields associated with each term under each domain.

Creating classifications under each domain and classifying terms and fields are covered in the Business Glossary module.

Purpose of the Data Classification feature:

This module helps OvalEdge users to perform the following tasks:

  • Create a data classification policy for each domain.
  • Create a Global Classification Domain policy. 
  • For each domain, filter data objects to display tables/columns/files/reports based on the data classification to secure data.
  • Identify unclassified data objects and add terms in bulk.
  • Identify unclassified data objects and add tags in bulk.
  • Recommend terms for each unclassified data column.
  • Add recommended terms to unclassified data columns with a single click.

OvalEdge also lets users create identifiers (such as Confidential, PII, and Sensitive) so that they can classify and organize their data objects with ease. Users can classify their data objects as Sensitive, Confidential, or PII by attaching the identifier to the term before publishing it.

  1. Sensitive means that the data is sensitive information about a person and should not be disclosed. For example, in a healthcare company, a person's health record is sensitive information, and the data should be masked.
  2. Confidential means that access to the data should always be restricted. Users should not be able to see that such columns exist.
    For example, in a healthcare pharmaceutical company, any column data related to the composition of a drug is classified as confidential.
  3. PII means that the data is Personally Identifiable Information.
    For example, name, email, and phone number.
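
The mapping from identifiers to handling can be sketched in a few lines of Python. This is only an illustration, not OvalEdge's API: the names `Identifier`, `Term`, and `handling` are hypothetical, and the rule that Confidential takes precedence over Sensitive is an assumption made for the sketch.

```python
from dataclasses import dataclass, field
from enum import Enum


class Identifier(Enum):
    # Hypothetical names mirroring the three identifiers described above.
    SENSITIVE = "sensitive"
    CONFIDENTIAL = "confidential"
    PII = "pii"


@dataclass
class Term:
    # A business glossary term with zero or more identifiers attached.
    name: str
    identifiers: set = field(default_factory=set)


def handling(term: Term) -> str:
    """Return the strictest handling implied by the term's identifiers."""
    # Assumption: Confidential outranks Sensitive when both are attached.
    if Identifier.CONFIDENTIAL in term.identifiers:
        return "restrict"  # users should not see that the column exists
    if Identifier.SENSITIVE in term.identifiers:
        return "mask"      # column is visible, values are hidden
    return "none"          # PII alone is not necessarily sensitive or confidential
```

For instance, a term carrying only the PII identifier yields "none" in this sketch, while a term carrying Sensitive yields "mask".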

More about PII

What is PII?

Personally Identifiable Information (PII) is information that identifies a distinct individual. Examples of sensitive PII include Social Security Numbers (SSNs), driving license numbers, email IDs, and revenue details. Responsible users should restrict or mask such column data so that other users cannot access it. However, not all PII is sensitive or confidential.

Importance of protecting PII:

PII that is either sensitive or confidential should be encrypted for data privacy and security. To keep the data out of the wrong hands, users need to either mask the data or restrict access to it.

How to Mask or Restrict data using Terms:

Any column data identified as PII can be protected by either masking or restricting it. Masking and restricting data are done in two ways:

  • PII tags can only be added at the column level. OE_ADMINs can enable column-level security before providing access to user roles through the Administration - Security module.

The table below lists the combinations that can be set when enabling column-level security:

  • When Enforce masking is set to YES, the data in the column is displayed as XXXX. The content is hidden, but the column is still visible.
  • When Enforce restriction is set to YES, the entire column is hidden from the table.

  • Term Approvers mask and restrict column data and columns by tagging business glossary terms to a specific column. A user must create a domain before creating a term to mask or restrict data.
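
The difference between the Enforce masking and Enforce restriction settings described above can be illustrated with a short Python sketch. It is not how OvalEdge implements column-level security: the function `apply_column_security`, the policy values "mask" and "restrict", and the sample row are all hypothetical, chosen only to show the visible effect of each setting.

```python
def apply_column_security(rows, policies):
    """Apply per-column policies to a list of row dictionaries.

    policies maps a column name to "mask" or "restrict" (hypothetical values);
    columns without a policy are returned unchanged.
    """
    secured = []
    for row in rows:
        out = {}
        for column, value in row.items():
            policy = policies.get(column)
            if policy == "restrict":
                continue              # the whole column disappears
            if policy == "mask":
                out[column] = "XXXX"  # column stays visible, content is hidden
            else:
                out[column] = value
        secured.append(out)
    return secured


# Made-up sample data for illustration only.
rows = [{"name": "Ada", "ssn": "123-45-6789", "formula": "C8H10N4O2"}]
policies = {"ssn": "mask", "formula": "restrict"}
print(apply_column_security(rows, policies))
# prints [{'name': 'Ada', 'ssn': 'XXXX'}]
```

The masked column keeps its place in the result with its values replaced, while the restricted column is dropped, so a user cannot tell that it exists.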