Guardrails Configuration

You can configure guardrails with aidputils when you select a foundational model from the OCI Generative AI service: pass the configuration to OCIAIConf() through the guardrails_config argument.

This example selects the xai.grok-4 model:

from aidputils.agents.toolkit.configs import OCIAIConf

# Guardrails definition: a user-defined name and description plus a list of policies.
guardrails_config = {
    "name" : "<guardrailsName>",
    "description" : "<guardrailsDescription>",
    "policies" : [ ]
  }

# Additional model arguments (none in this example).
model_args = {}

# Select the model and attach the guardrails configuration to it.
llm_conf = OCIAIConf(model_provider='generic',
                     compartment_id='<compartment_ocid>',
                     model_args=model_args,
                     endpoint='https://inference.generativeai.<oci-region>.oci.oraclecloud.com',
                     model_id='xai.grok-4',
                     guardrails_config=guardrails_config)

The guardrails configuration is a JSON-like dictionary that holds a name, a description, and an array of policies. In the example above, it is defined in the following block, where <guardrailsName> and <guardrailsDescription> are a user-defined name and description:


guardrails_config = { 
    "name" : "<guardrailsName>", 
    "description" : "<guardrailsDescription>", 
    "policies" : [ ] 
  }
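
As a minimal sketch, the same structure can be assembled programmatically. The helper name make_guardrails_config is hypothetical and not part of aidputils; it only builds the plain dictionary described above and checks that it survives a JSON round trip.

```python
import json


def make_guardrails_config(name, description, policies=None):
    """Build a guardrails config dict (hypothetical helper, not part of aidputils)."""
    return {
        "name": name,
        "description": description,
        # Copy the list so later changes by the caller do not alter the config.
        "policies": list(policies or []),
    }


config = make_guardrails_config("demo-guardrails", "Example guardrails")

# The structure uses only JSON-compatible types, so it round-trips unchanged.
round_tripped = json.loads(json.dumps(config))
print(round_tripped == config)
```

The resulting dictionary would then be passed as guardrails_config, as in the OCIAIConf() call above.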

Each policy has the following keys:

| Key | Required | Description | Data Type | Default value |
|---|---|---|---|---|
| policyName | No | Custom name for the policy. | String | N/A |
| policyType | Yes | Type of guardrail policy to apply. Allowed values: CONTENT_MODERATION, PROMPT_ATTACKS_PREVENTION, PII_DETECTION. | ENUM | |
| policyDescription | No | A description of the policy. | String | |
| scope | No | Where the guardrails are applied. Allowed values: USER_REQUEST, AGENT_RESPONSE, BOTH. | ENUM | |
| action | No | The action to take when the policy is violated. Allowed values: INFORM, BLOCK, ALLOW, MASK (MASK only for PII_DETECTION). | ENUM | |
| threshold | No | Detection threshold, expressed as a probability between 0 and 1. | Float | |
| piiCategories | Yes (PII_DETECTION policies) | Categories of PII data to detect, each with its own action and enablement. | Array | |
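
The constraints in this table can be checked before the configuration is sent anywhere. The following is a sketch only: check_policy is a hypothetical helper, not an aidputils function, and it assumes that MASK is limited to PII_DETECTION and that piiCategories is required only for PII_DETECTION policies.

```python
POLICY_TYPES = {"CONTENT_MODERATION", "PROMPT_ATTACKS_PREVENTION", "PII_DETECTION"}
SCOPES = {"USER_REQUEST", "AGENT_RESPONSE", "BOTH"}
ACTIONS = {"INFORM", "BLOCK", "ALLOW", "MASK"}


def check_policy(policy):
    """Return a list of problems found in one policy dict (empty if none)."""
    problems = []
    policy_type = policy.get("policyType")
    if policy_type not in POLICY_TYPES:
        problems.append("policyType is required and must be a known policy type")
    if "scope" in policy and policy["scope"] not in SCOPES:
        problems.append("scope must be USER_REQUEST, AGENT_RESPONSE, or BOTH")
    action = policy.get("action")
    if action is not None and action not in ACTIONS:
        problems.append("action must be INFORM, BLOCK, ALLOW, or MASK")
    elif action == "MASK" and policy_type != "PII_DETECTION":
        # Assumption: MASK is the only action restricted to PII_DETECTION.
        problems.append("MASK is only available for PII_DETECTION")
    threshold = policy.get("threshold")
    if threshold is not None:
        is_number = isinstance(threshold, (int, float)) and not isinstance(threshold, bool)
        if not is_number or not 0 <= threshold <= 1:
            problems.append("threshold must be a number between 0 and 1")
    # Assumption: piiCategories is required only when policyType is PII_DETECTION.
    if policy_type == "PII_DETECTION" and "piiCategories" not in policy:
        problems.append("PII_DETECTION policies need piiCategories")
    return problems


valid = {"policyType": "PROMPT_ATTACKS_PREVENTION", "scope": "USER_REQUEST",
         "action": "BLOCK", "threshold": 0.5}
invalid = {"policyType": "CONTENT_MODERATION", "action": "MASK", "threshold": 1.5}

print(check_policy(valid))    # no problems expected
print(check_policy(invalid))  # flags the MASK action and the threshold
```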

piiCategories is itself an array of JSON-like objects, each of which uses the following keys:

| Key | Required | Description | Data Type | Default value |
|---|---|---|---|---|
| category | Yes | The PII category to detect. Allowed values: PERSON, ADDRESS, TELEPHONE_NUMBER, EMAIL. | String | N/A |
| isEnabled | No | Whether detection of this PII category is enabled. Allowed values: True, False. | Boolean | |
| action | No | Action to take if this PII category is detected. Overrides the policy-level action. Allowed values: INFORM, BLOCK, ALLOW, MASK. | String | |
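
To illustrate how a category-level action overrides the policy-level action, here is a small sketch. The function effective_pii_actions is hypothetical and only models the precedence described in the table; because the default for isEnabled is not documented, it assumes that a category without an isEnabled key is skipped.

```python
def effective_pii_actions(policy):
    """Map each enabled PII category to the action that applies to it."""
    default_action = policy.get("action")
    actions = {}
    for entry in policy.get("piiCategories", []):
        # Assumption: a category without isEnabled is treated as disabled.
        if not entry.get("isEnabled", False):
            continue
        # The category-level action, when present, overrides the policy-level one.
        actions[entry["category"]] = entry.get("action", default_action)
    return actions


pii_policy = {
    "policyType": "PII_DETECTION",
    "action": "INFORM",
    "piiCategories": [
        {"category": "PERSON", "isEnabled": False, "action": "BLOCK"},
        {"category": "TELEPHONE_NUMBER", "isEnabled": True, "action": "MASK"},
        {"category": "EMAIL", "isEnabled": True},
    ],
}

print(effective_pii_actions(pii_policy))
# {'TELEPHONE_NUMBER': 'MASK', 'EMAIL': 'INFORM'}
```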

Example: Complete Guardrails Configuration

This example applies all three policy types:
  • Content moderation applies only to the agent response.
  • Prompt injection prevention blocks a user request when an attack is detected.
  • PII detection applies to both the agent response and the user request, using one PII_DETECTION policy per scope so that each PII category can be handled differently in each direction.

guardrails_config = { 
    "policies" : [ { 
      "policyType" : "CONTENT_MODERATION", 
      "policyName" : "Content Moderation prevention", 
      "policyDescription" : "Choose an action to take when hate, sexual, violence, toxic, derogatory, or harassment content is detected in either the user input query or the agent response.", 
      "scope" : "AGENT_RESPONSE", 
      "action" : "INFORM", 
      "threshold" : 0.5, 
      "categories" : [ ] 
    }, { 
      "policyType" : "PROMPT_ATTACKS_PREVENTION", 
      "policyName" : "Prompt Injection prevention", 
      "policyDescription" : "Choose action when prompt injection is detected on the user query.", 
      "scope" : "USER_REQUEST", 
      "action" : "BLOCK", 
      "threshold" : 0.5 
    }, { 
      "policyType" : "PII_DETECTION", 
      "policyName" : "Personally Identifiable Information (PII) detection", 
      "policyDescription" : "Choose an action to take when PII entities are detected in either the user input query or the agent response.", 
      "scope" : "AGENT_RESPONSE", 
      "action" : "INFORM", 
      "threshold" : 0.5, 
      "piiCategories" : [ { 
        "category" : "PERSON", 
        "isEnabled" : False, 
        "action" : "INFORM" 
      }, { 
        "category" : "ADDRESS", 
        "isEnabled" : False, 
        "action" : "INFORM" 
      }, { 
        "category" : "TELEPHONE_NUMBER", 
        "isEnabled" : True, 
        "action" : "MASK" 
      }, { 
        "category" : "EMAIL", 
        "isEnabled" : True, 
        "action" : "MASK" 
      } ] 
    }, { 
      "policyType" : "PII_DETECTION", 
      "policyName" : "Personally Identifiable Information (PII) detection", 
      "policyDescription" : "Choose an action to take when PII entities are detected in either the user input query or the agent response.", 
      "scope" : "USER_REQUEST", 
      "action" : "INFORM", 
      "threshold" : 0.5, 
      "piiCategories" : [ { 
        "category" : "PERSON", 
        "isEnabled" : True, 
        "action" : "INFORM" 
      }, { 
        "category" : "ADDRESS", 
        "isEnabled" : True, 
        "action" : "INFORM" 
      }, { 
        "category" : "TELEPHONE_NUMBER", 
        "isEnabled" : True, 
        "action" : "BLOCK" 
      }, { 
        "category" : "EMAIL", 
        "isEnabled" : False, 
        "action" : "INFORM" 
      } ] 
    } ] 
  }
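
A quick way to review a configuration like this one is to list which policy types cover each direction. The sketch below uses a hypothetical helper, policies_by_scope, and a shortened sample configuration with the same scopes as the example above; a policy without a scope is grouped under UNSPECIFIED because the default scope is not documented.

```python
def policies_by_scope(config):
    """Group policy types by the scope they apply to (BOTH counts for each scope)."""
    coverage = {"USER_REQUEST": [], "AGENT_RESPONSE": []}
    for policy in config.get("policies", []):
        scope = policy.get("scope", "UNSPECIFIED")
        targets = ("USER_REQUEST", "AGENT_RESPONSE") if scope == "BOTH" else (scope,)
        for target in targets:
            coverage.setdefault(target, []).append(policy["policyType"])
    return coverage


# Shortened sample: same policy types and scopes as the full example, other keys omitted.
sample_config = {
    "policies": [
        {"policyType": "CONTENT_MODERATION", "scope": "AGENT_RESPONSE"},
        {"policyType": "PROMPT_ATTACKS_PREVENTION", "scope": "USER_REQUEST"},
        {"policyType": "PII_DETECTION", "scope": "AGENT_RESPONSE"},
        {"policyType": "PII_DETECTION", "scope": "USER_REQUEST"},
    ]
}

summary = policies_by_scope(sample_config)
for scope, policy_types in summary.items():
    print(scope, policy_types)
```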