Create Sensitive Types and Categories

In Oracle Data Safe, you can create your own sensitive types and sensitive categories.

Create a Sensitive Type

When creating a sensitive type, you can provide one or more patterns (regular expressions) that should be used to discover sensitive columns. You can provide a column name pattern, column comment pattern, column data pattern, and a search type (AND/OR). Data Discovery performs case-insensitive pattern matching.

For a user-defined sensitive type, you can assign a default masking format, which is used to mask the columns discovered using this sensitive type. When creating a user-defined sensitive type, you must assign it to a compartment.

  1. Under Security Center, click Data Discovery.
  2. Under Related Resources in Security Center, click Sensitive Types.
  3. Click Create Sensitive Type / Category.
    The Create Sensitive Type / Category window is displayed.
  4. In the Name field, enter a name for your sensitive type.
  5. From the Compartment drop-down list, select the compartment in which you want to store the sensitive type.
  6. (Optional) In the Description box, enter an explanation of your sensitive type.
  7. From the Parent Sensitive Category drop-down list, select the sensitive category to which you want your sensitive type to belong.
    • If needed, click Change Compartment and select a different compartment.
    • You can choose a user-defined sensitive category as a parent category, but not a category used by predefined sensitive types.
  8. Leave the Sensitive Type tile selected.
  9. (Optional) To use a predefined sensitive type as a starting point, select a predefined sensitive type from the Create Like drop-down list. If you want to select a user-defined sensitive type and it is located in a different compartment, click Change Compartment, and then browse to and select the correct compartment. The compartment doesn't matter if you are selecting an Oracle predefined sensitive type.
  10. Configure one or more of the following patterns.
    • Column Name Pattern: Enter a regular expression that should be used to match column names.
    • Column Comment Pattern: Enter a regular expression that should be used to match column comments.
    • Column Data Pattern: Enter a regular expression that should be used to match column data.
  11. For Search Type, select Or or And.
    • The Or operator means that any of the patterns can match for a candidate sensitive column.
    • The And operator means that all of the patterns must match for a candidate sensitive column.
    • If the column doesn't include a comment, the column comment pattern matching is skipped. Similarly, if the column doesn't contain data, the data pattern matching is also skipped.
  12. (Optional) From the Default Masking Format drop-down list, select a masking format for Oracle Data Safe to use by default when masking sensitive columns discovered by your sensitive type.
  13. (Optional) Click Show Advanced Options, and define tags.
  14. Click Create Sensitive Type.
    The Sensitive Type Details page is displayed.
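
Before entering patterns in the console, it can help to confirm that each regular expression at least compiles. The following Python sketch is not part of Oracle Data Safe: the SensitiveTypeSpec class and its field names are hypothetical and only mirror the fields in the steps above. It also assumes that Python's re syntax is close enough to the syntax Data Safe accepts to serve as a first sanity check.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SensitiveTypeSpec:
    """Hypothetical local model of the Create Sensitive Type form (not an Oracle API)."""
    name: str
    compartment: str
    column_name_pattern: Optional[str] = None
    column_comment_pattern: Optional[str] = None
    column_data_pattern: Optional[str] = None
    search_type: str = "OR"  # "OR" or "AND", as in step 11

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means no problem was found."""
        problems = []
        patterns = {
            "name": self.column_name_pattern,
            "comment": self.column_comment_pattern,
            "data": self.column_data_pattern,
        }
        provided = {kind: p for kind, p in patterns.items() if p}
        if not provided:
            problems.append("configure at least one pattern")
        if self.search_type not in ("OR", "AND"):
            problems.append("search type must be OR or AND")
        for kind, pattern in provided.items():
            try:
                # IGNORECASE mirrors Data Discovery's case-insensitive matching.
                re.compile(pattern, re.IGNORECASE)
            except re.error as exc:
                problems.append(f"{kind} pattern does not compile: {exc}")
        return problems


spec = SensitiveTypeSpec(
    name="Employee SSN",
    compartment="my-compartment",  # placeholder value
    column_name_pattern=r"(^|[_-])SSN($|[_-])",
    search_type="AND",
)
print(spec.validate())
```

An empty list only means the patterns are syntactically valid in Python; it says nothing about how well they discover sensitive columns.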

Create a Sensitive Category

  1. Under Security Center, click Data Discovery.
  2. Under Related Resources in Security Center, click Sensitive Types.
  3. Click Create Sensitive Type / Category.
    The Create Sensitive Type / Category window is displayed.
  4. In the Name field, enter a name for your sensitive category.
  5. Select a compartment in which to store your sensitive category.
  6. (Optional) Enter a brief description of your sensitive category.
  7. (Optional) Select a parent sensitive category.
  8. Click the Sensitive Category tile.
    Notice that the button at the bottom changes from Create Sensitive Type to Create Sensitive Category.
  9. (Optional) Click Show Advanced Options, and define tags.
  10. Click Create Sensitive Category.

Tips for Creating Sensitive Types

The following topics help you to write patterns for sensitive types. For more information about regular expressions, see Regular Expressions.

Column Name Pattern

A column name pattern is a regular expression that is used to match column names during data discovery. For example, to search for columns containing Social Security numbers, you could define the following column name pattern:

(^|[_-])SSN($|[_-])|(SSN|SOC.*SEC.*).?(ID|NO|NUMBERS?|NUM|NBR|#)

The regular expression checks for specific keywords in column names. It matches column names, such as PATIENT_SSN, SSN#, SOCIAL_SECURITY_NUMBER, and EMPLOYEE_SOC_SEC_NO.
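
A pattern like this can be tried locally before it is saved. The sketch below uses Python's re module as a stand-in; it assumes an unanchored search (re.search) and uses re.IGNORECASE to mirror the case-insensitive matching that Data Discovery performs. The function name and the two non-matching column names are made up for illustration.

```python
import re

# Column name pattern from the text above.
SSN_NAME = re.compile(
    r"(^|[_-])SSN($|[_-])|(SSN|SOC.*SEC.*).?(ID|NO|NUMBERS?|NUM|NBR|#)",
    re.IGNORECASE,
)


def name_matches(column_name: str) -> bool:
    # Assumption: the pattern may match anywhere in the column name.
    return SSN_NAME.search(column_name) is not None


for column in [
    "PATIENT_SSN",
    "SSN#",
    "SOCIAL_SECURITY_NUMBER",
    "EMPLOYEE_SOC_SEC_NO",
    "ORDER_DATE",      # illustrative non-match
    "CUSTOMER_NAME",   # illustrative non-match
]:
    print(f"{column:24} {name_matches(column)}")
```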

Tips for creating column name patterns:

  • Consider when to use .? and .*. Use .? if you want to allow zero or one character, and use .* to allow any number of characters. For example, you could use SOCIAL.?SECURITY.?NUMBER or SOC.*SEC.*NUMBER depending upon how strict you want the regular expression to be.
  • To get an exact match of a word or a match if the word is part of a column name, use (^|[_-])<WORD>($|[_-]). The pattern matches <WORD> on its own and also where it is separated from the rest of the column name by an underscore (_) or a hyphen (-) before or after it.
  • Whenever searching for columns containing numbers, you could use keywords like (ID|NO|NUMBERS?|NUM|NBR|#).
  • To match singular and plural words, if applicable, use S?. For example, use CODES? to match CODE and CODES.
  • To match dates, use (DT|DATE) and the reverse pattern. For example, you could use the following pattern to match BIRTH_DATE and DATE_OF_BIRTH:
    BIRTH.?(DT|DATE)|(DT|DATE).*BIRTH
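
The tips above can be compared side by side with a small Python sketch. It assumes a case-insensitive, unanchored search; the found helper and the sample column names are illustrative only.

```python
import re


def found(pattern: str, column_name: str) -> bool:
    """Case-insensitive, unanchored search (an assumption about matching mode)."""
    return re.search(pattern, column_name, re.IGNORECASE) is not None


STRICT = r"SOCIAL.?SECURITY.?NUMBER"  # at most one character between the words
LOOSE = r"SOC.*SEC.*NUMBER"           # any number of characters between the words
WORD = r"(^|[_-])SSN($|[_-])"         # <WORD> replaced by SSN
PLURAL = r"CODES?"                    # singular or plural
BIRTH = r"BIRTH.?(DT|DATE)|(DT|DATE).*BIRTH"

checks = [
    (STRICT, "SOCIAL_SECURITY_NUMBER"),
    (STRICT, "SOC_SEC_NUMBER"),
    (LOOSE, "SOC_SEC_NUMBER"),
    (WORD, "EMP_SSN"),
    (WORD, "ASSN"),
    (PLURAL, "ZIP_CODE"),
    (PLURAL, "AREA_CODES"),
    (BIRTH, "BIRTH_DATE"),
    (BIRTH, "DATE_OF_BIRTH"),
    (BIRTH, "BIRTH_PLACE"),
]

for pattern, column in checks:
    print(f"{pattern:36} {column:24} {found(pattern, column)}")
```

The stricter pattern rejects the abbreviated name SOC_SEC_NUMBER, while the looser one accepts it; which one is preferable depends on how much variation the column names in your databases actually show.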

Column Comment Pattern

A column comment pattern is a regular expression that is used to match column comments during data discovery. Sometimes column names are obscure and therefore, metadata is entered as a comment for a database column. Data Discovery can search these comments and potentially find more sensitive data. For example, to search for columns containing Social Security numbers, you could define the following column comment pattern:

\bSSN#?\b|SOCIAL SECURITY (ID|NUM|\bNO\b|NBR)

The regular expression checks for specific keywords in column comments. For example, it matches the column comment Contains social security numbers of employees.
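
The comment pattern can be checked the same way. This Python sketch assumes an unanchored, case-insensitive search; apart from the first comment, which comes from the text above, the sample comments are invented for illustration.

```python
import re

# Column comment pattern from the text above.
SSN_COMMENT = re.compile(
    r"\bSSN#?\b|SOCIAL SECURITY (ID|NUM|\bNO\b|NBR)",
    re.IGNORECASE,
)


def comment_matches(comment: str) -> bool:
    return SSN_COMMENT.search(comment) is not None


for comment in [
    "Contains social security numbers of employees",
    "Employee SSN",
    "Social security notification flag",  # NO is not a whole word here
    "Shipping address",
]:
    print(f"{comment_matches(comment)!s:6} {comment}")
```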

Tips for creating column comment patterns:

  • Avoid using .* in column comments to reduce false positives.
  • Use \b<word>\b to search for a specific word. It avoids matching words that contain <word>. For example, the regular expression \bNO\b matches social security no but not social security notification. Similarly, the regular expression \bSECT\b does not match the word SECTOR, and \bCULT\b does not match the word CULTURE.
  • Whenever searching for columns containing numbers, you can use keywords like (ID|\bNO\b|NUM|NBR|#).
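
The effect of the word boundary \b is easy to see in a short Python sketch. The has_word helper is hypothetical, and the sample comments are illustrative.

```python
import re


def has_word(word: str, comment: str) -> bool:
    """True if `word` occurs in `comment` as a whole word, ignoring case."""
    return re.search(rf"\b{re.escape(word)}\b", comment, re.IGNORECASE) is not None


print(has_word("NO", "social security no"))            # whole word
print(has_word("NO", "social security notification"))  # only a prefix of a longer word
print(has_word("SECT", "business sector"))
print(has_word("CULT", "culture code"))

# Without \b, the keyword also matches inside longer words.
print(re.search("NO", "social security notification", re.IGNORECASE) is not None)
```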

Column Data Pattern

A column data pattern is a regular expression that is used to match the actual column data during data discovery. For example, to search for columns containing Social Security numbers, you could define the following column data pattern:

^[0-9]{3}[ -]?[0-9]{2}[ -]?[0-9]{4}$

The regular expression checks for 9-digit numbers. A value can be either nine consecutive digits or three groups of three, two, and four digits separated by hyphens or spaces. It matches numbers like 383368610 and 383-36-8610.
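
As a quick check, the data pattern can be run against a few sample values in Python. The values other than the two from the text are invented for illustration, and the sketch assumes each value is tested as a whole string.

```python
import re

# Column data pattern from the text above; ^ and $ anchor the whole value.
SSN_DATA = re.compile(r"^[0-9]{3}[ -]?[0-9]{2}[ -]?[0-9]{4}$")


def data_matches(value: str) -> bool:
    return SSN_DATA.search(value) is not None


for value in [
    "383368610",
    "383-36-8610",
    "383 36 8610",
    "3833686101",   # ten digits
    "383-36-861",   # last group too short
    "38A-36-8610",  # contains a letter
]:
    print(f"{value:12} {data_matches(value)}")
```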

Tips for creating column data patterns:

  • Ensure that the data pattern is as specific as possible to avoid false positives.
  • Consider whether a data pattern makes sense for the sensitive type. If the data pattern is too broad, it can result in false positives. If it does not add any value, you could decide not to add the data pattern for a sensitive type.
  • If you want to use a broad data pattern, you could use the And search operator to reduce false positives.

Search Pattern

The search pattern indicates how the column name, comment and data patterns of a sensitive type should be used to discover sensitive columns. There are two search options: AND and OR.

The AND search option ensures that all the provided patterns of a sensitive type must match for identifying a column as sensitive. For example, if a sensitive type has name, comment, and data patterns, they must match a column's name, comment, and data respectively, for identifying that column as sensitive. The following table covers the various possible combinations of the patterns provided for a sensitive type and the corresponding AND search behavior.

Patterns Present in a Sensitive Type    Search Behavior
Name, Comment, and Data                 Name AND Comment AND Data
Name and Data                           Name AND Data
Name and Comment                        Name AND Comment
Comment and Data                        Comment AND Data
Name                                    Name
Comment                                 Comment
Data                                    Data
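
The AND behavior in the table can be sketched in a few lines of Python. This is a simplified model, not Data Safe's implementation: the and_search function is hypothetical, each column is represented by a dictionary with a single sample data value, and the skipping of missing comments or data is not modeled. The example pairs a deliberately broad data pattern with a name pattern to show how AND reduces false positives.

```python
import re


def and_search(patterns: dict, column: dict) -> bool:
    """True only if every provided pattern matches the matching column attribute."""
    provided = {kind: p for kind, p in patterns.items() if p}
    if not provided:
        return False
    return all(
        re.search(p, column.get(kind) or "", re.IGNORECASE) is not None
        for kind, p in provided.items()
    )


# A broad data pattern (any nine digits) combined with a name pattern.
patterns = {
    "name": r"(^|[_-])SSN($|[_-])",
    "comment": None,          # no comment pattern provided
    "data": r"^[0-9]{9}$",
}

ssn_column = {"name": "EMP_SSN", "data": "383368610"}
order_column = {"name": "ORDER_REF", "data": "123456789"}

print(and_search(patterns, ssn_column))    # name and data both match
print(and_search(patterns, order_column))  # data matches, name does not
```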

The OR search option provides some flexibility to identify a column as sensitive even if only some of the patterns of a sensitive type match. For example, if a sensitive type has name and comment patterns, a column is identified as sensitive even if only the name pattern (or comment pattern) matches the column's name (or comment). If a sensitive type has all three patterns, the data pattern must match along with either the name pattern or the comment pattern (or both). The following table covers the various possible combinations of the patterns provided for a sensitive type and the corresponding OR search behavior.

Patterns Present in a Sensitive Type    Search Behavior
Name, Comment, and Data                 Data OR (Name AND Data) OR (Comment AND Data)
Name and Data                           Data OR (Name AND Data)
Name and Comment                        Name OR Comment
Comment and Data                        Data OR (Comment AND Data)
Name                                    Name
Comment                                 Comment
Data                                    Data
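
As an illustration of the Name and Comment row only, the following Python sketch treats a column as sensitive when either pattern matches. The function name is hypothetical, the rows that involve a data pattern are not modeled, and an empty comment is treated as skipped, so it cannot contribute a match.

```python
import re


def or_search_name_comment(name_pattern: str, comment_pattern: str,
                           name: str, comment: str) -> bool:
    """Name OR Comment: either match is enough to flag the column."""
    name_hit = re.search(name_pattern, name, re.IGNORECASE) is not None
    # A column without a comment is skipped for comment matching.
    comment_hit = bool(comment) and (
        re.search(comment_pattern, comment, re.IGNORECASE) is not None
    )
    return name_hit or comment_hit


NAME = r"(^|[_-])SSN($|[_-])"
COMMENT = r"\bSSN#?\b|SOCIAL SECURITY (ID|NUM|\bNO\b|NBR)"

print(or_search_name_comment(NAME, COMMENT, "COL_17", "Employee SSN"))      # comment matches
print(or_search_name_comment(NAME, COMMENT, "EMP_SSN", ""))                 # name matches
print(or_search_name_comment(NAME, COMMENT, "COL_17", "Shipping address"))  # neither matches
```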