Guardrails Configuration

You can configure guardrails with aidputils by passing a guardrails configuration to OCIAIConf() when you select a foundational model from the OCI Generative AI service. This example selects the xai.grok-4 model:

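from aidputils.agents.toolkit.configs import OCIAIConf

# Guardrails definition: a name, a description, and a list of policies
# (left empty here; the policy keys are described below).
guardrails_config = {
    "name" : "<guardrailsName>",
    "description" : "<guardrailsDescription>",
    "policies" : [ ]
}

# No additional model arguments are passed in this example.
model_args = {}

# Select the foundational model and attach the guardrails configuration.
llm_conf = OCIAIConf(model_provider='generic',
                     compartment_id='<compartment_ocid>',
                     model_args=model_args,
                     endpoint='https://inference.generativeai.<oci-region>.oci.oraclecloud.com',
                     model_id='xai.grok-4',
                     guardrails_config=guardrails_config)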

The guardrails configuration is a JSON-like Python dictionary that holds a name, a description, and an array of policies. In the example above, it is defined in the following block, where <guardrailsName> and <guardrailsDescription> are a user-defined name and description:


guardrails_config = { 
    "name" : "<guardrailsName>", 
    "description" : "<guardrailsDescription>", 
    "policies" : [ ] 
}

Each policy has the following keys:

Key	Required	Description	Data Type	Default value
`policyName`	No	Custom name for the policy	String	N/A
`policyType`	Yes	Type of guardrail policy to apply. Allowed values include: `CONTENT_MODERATION` `PROMPT_ATTACKS_PREVENTION` `PII_DETECTION`	ENUM
`policyDescription`	No	A description for the policy	String
`scope`	No	The scope defines where the guardrails are applied. Allowed values include: `USER_REQUEST` `AGENT_RESPONSE` `BOTH`	ENUM
`action`	No	The action to take when the policy is violated. Allowed values include: `INFORM` `BLOCK` `ALLOW` `MASK` (only for `PII_DETECTION`)	ENUM
`threshold`	No	Detection threshold, expressed as a probability between 0 and 1.	Float
`piiCategories`	Yes (for `PII_DETECTION` policies)	The PII categories to detect, each with its own action and enablement flag.	Array
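
As a rough illustration of the policy keys above, the following sketch checks a policy dictionary before it is added to `policies`. The `validate_policy` function and the `ALLOWED_*` constants are hypothetical helpers written for this example and are not part of aidputils. The sketch assumes that `MASK` is the action restricted to `PII_DETECTION`, and it treats the listed values as the complete set, although the table says only that the allowed values include them.

```python
# Hypothetical helper, not part of aidputils: checks one policy dict against
# the keys and allowed values listed in the table above.
ALLOWED_POLICY_TYPES = {"CONTENT_MODERATION", "PROMPT_ATTACKS_PREVENTION", "PII_DETECTION"}
ALLOWED_SCOPES = {"USER_REQUEST", "AGENT_RESPONSE", "BOTH"}
ALLOWED_ACTIONS = {"INFORM", "BLOCK", "ALLOW", "MASK"}


def validate_policy(policy):
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    # policyType is the only key the table marks as required for every policy.
    policy_type = policy.get("policyType")
    if policy_type not in ALLOWED_POLICY_TYPES:
        problems.append(f"policyType must be one of {sorted(ALLOWED_POLICY_TYPES)}")

    # Optional keys are checked only when they are present.
    scope = policy.get("scope")
    if scope is not None and scope not in ALLOWED_SCOPES:
        problems.append(f"unknown scope: {scope!r}")

    action = policy.get("action")
    if action is not None and action not in ALLOWED_ACTIONS:
        problems.append(f"unknown action: {action!r}")
    # Assumption: MASK is the action limited to PII_DETECTION policies.
    if action == "MASK" and policy_type != "PII_DETECTION":
        problems.append("MASK is only available for PII_DETECTION")

    threshold = policy.get("threshold")
    if threshold is not None:
        if not isinstance(threshold, (int, float)) or not 0 <= threshold <= 1:
            problems.append("threshold must be a number between 0 and 1")

    return problems


prompt_policy = {
    "policyType": "PROMPT_ATTACKS_PREVENTION",
    "scope": "USER_REQUEST",
    "action": "BLOCK",
    "threshold": 0.5,
}
print(validate_policy(prompt_policy))
```

Returning a list of problems rather than raising on the first one lets you report every issue in a policy at once.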

`piiCategories` is an array of JSON-like objects, each of which uses the following keys:

Key	Required	Description	Data Type	Default value
`category`	Yes	The PII category to detect. Allowed values include: `PERSON` `ADDRESS` `TELEPHONE_NUMBER` `EMAIL`	String	N/A
`isEnabled`	No	Whether detection of the PII category is enabled. Allowed values include: `True` `False`	Boolean
`action`	No	The action to take if the PII category is detected. Overrides the policy-level `action`. Allowed values include: `INFORM` `BLOCK` `ALLOW` `MASK`	String
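
The entries of `piiCategories` can be written by hand, but a small builder keeps them consistent. The sketch below is one possible approach; `pii_category`, `PII_CATEGORIES`, and `PII_ACTIONS` are hypothetical names introduced for this example, not aidputils APIs. Because the default for `isEnabled` is not documented here, the helper always writes it explicitly.

```python
# Hypothetical helper, not part of aidputils: builds one piiCategories entry.
# The table says the allowed values "include" these, so the service may accept
# more; extend the tuples if needed.
PII_CATEGORIES = ("PERSON", "ADDRESS", "TELEPHONE_NUMBER", "EMAIL")
PII_ACTIONS = ("INFORM", "BLOCK", "ALLOW", "MASK")


def pii_category(category, is_enabled=True, action=None):
    """Return a dict with the category, isEnabled and (optionally) action keys."""
    if category not in PII_CATEGORIES:
        raise ValueError(f"unknown PII category: {category!r}")
    # isEnabled is always written out so the entry does not rely on a default.
    entry = {"category": category, "isEnabled": bool(is_enabled)}
    if action is not None:
        if action not in PII_ACTIONS:
            raise ValueError(f"unknown action: {action!r}")
        # A category-level action overrides the action set on the policy.
        entry["action"] = action
    return entry


pii_categories = [
    pii_category("TELEPHONE_NUMBER", action="MASK"),
    pii_category("EMAIL", action="MASK"),
    pii_category("PERSON", is_enabled=False),
]
print(pii_categories)
```

The resulting list can be assigned to the `piiCategories` key of a `PII_DETECTION` policy.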

Example: Complete Guardrails Configuration

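This example applies all three policy types:

Content moderation applies only to the agent response and reports violations (`INFORM`).
Prompt attack prevention blocks user requests in which a prompt injection is detected.
PII detection applies to both the agent response and the user request through two separate `PII_DETECTION` policies, each of which treats the PII categories differently.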

guardrails_config = { 
    "policies" : [ { 
      "policyType" : "CONTENT_MODERATION", 
      "policyName" : "Content Moderation prevention", 
      "policyDescription" : "Choose an action to take when hate, sexual, violence, toxic, derogatory, or harassment content is detected in either the user input query or the agent response.", 
      "scope" : "AGENT_RESPONSE", 
      "action" : "INFORM", 
      "threshold" : 0.5, 
      "categories" : [ ] 
    }, { 
      "policyType" : "PROMPT_ATTACKS_PREVENTION", 
      "policyName" : "Prompt Injection prevention", 
      "policyDescription" : "Choose action when prompt injection is detected on the user query.", 
      "scope" : "USER_REQUEST", 
      "action" : "BLOCK", 
      "threshold" : 0.5 
    }, { 
      "policyType" : "PII_DETECTION", 
      "policyName" : "Personally Identifiable Information (PII) detection", 
      "policyDescription" : "Choose an action to take when PII entities are detected in either the user input query or the agent response.", 
      "scope" : "AGENT_RESPONSE", 
      "action" : "INFORM", 
      "threshold" : 0.5, 
      "piiCategories" : [ { 
        "category" : "PERSON", 
        "isEnabled" : False, 
        "action" : "INFORM" 
      }, { 
        "category" : "ADDRESS", 
        "isEnabled" : False, 
        "action" : "INFORM" 
      }, { 
        "category" : "TELEPHONE_NUMBER", 
        "isEnabled" : True, 
        "action" : "MASK" 
      }, { 
        "category" : "EMAIL", 
        "isEnabled" : True, 
        "action" : "MASK" 
      } ] 
    }, { 
      "policyType" : "PII_DETECTION", 
      "policyName" : "Personally Identifiable Information (PII) detection", 
      "policyDescription" : "Choose an action to take when PII entities are detected in either the user input query or the agent response.", 
      "scope" : "USER_REQUEST", 
      "action" : "INFORM", 
      "threshold" : 0.5, 
      "piiCategories" : [ { 
        "category" : "PERSON", 
        "isEnabled" : True, 
        "action" : "INFORM" 
      }, { 
        "category" : "ADDRESS", 
        "isEnabled" : True, 
        "action" : "INFORM" 
      }, { 
        "category" : "TELEPHONE_NUMBER", 
        "isEnabled" : True, 
        "action" : "BLOCK" 
      }, { 
        "category" : "EMAIL", 
        "isEnabled" : False, 
        "action" : "INFORM" 
      } ] 
    } ] 
}
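
With several `PII_DETECTION` policies in one configuration, it can be hard to see at a glance what happens to each category in each scope. The following sketch summarizes that. `effective_pii_actions` is a hypothetical helper, not part of aidputils, and `sample_config` is a trimmed-down configuration used only to keep the example self-contained. The sketch assumes that an entry without `isEnabled` is disabled (the actual default is not documented here) and that a category-level `action` overrides the policy-level one, as described in the `piiCategories` table.

```python
# Hypothetical helper, not part of aidputils: summarizes which action applies
# to each enabled PII category, grouped by policy scope.
def effective_pii_actions(config):
    """Map each scope to {category: action} for the enabled PII categories."""
    result = {}
    for policy in config.get("policies", []):
        if policy.get("policyType") != "PII_DETECTION":
            continue
        scope = policy.get("scope")
        for entry in policy.get("piiCategories", []):
            # Assumption: a missing isEnabled is treated as disabled.
            if not entry.get("isEnabled"):
                continue
            # A category-level action overrides the policy-level action.
            action = entry.get("action", policy.get("action"))
            result.setdefault(scope, {})[entry["category"]] = action
    return result


# Trimmed-down configuration for illustration only.
sample_config = {
    "policies": [
        {
            "policyType": "PII_DETECTION",
            "scope": "AGENT_RESPONSE",
            "action": "INFORM",
            "piiCategories": [
                {"category": "PERSON", "isEnabled": False, "action": "INFORM"},
                {"category": "EMAIL", "isEnabled": True, "action": "MASK"},
            ],
        },
        {
            "policyType": "PII_DETECTION",
            "scope": "USER_REQUEST",
            "action": "INFORM",
            "piiCategories": [
                {"category": "TELEPHONE_NUMBER", "isEnabled": True, "action": "BLOCK"},
                {"category": "ADDRESS", "isEnabled": True},
            ],
        },
    ]
}

summary = effective_pii_actions(sample_config)
for scope, actions in summary.items():
    print(scope, actions)
```

The same function can be called on the full `guardrails_config` above to review its PII handling before passing it to OCIAIConf().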