Add AI Guardrails to OCI Generative AI Dedicated AI Cluster Endpoints
- Services: Generative AI
- Release Date: February 26, 2026
You can now enable AI guardrails for content moderation (CM), prompt injection (PI), and personally identifiable information (PII) on OCI Generative AI endpoints hosted on dedicated AI clusters. This feature supports chat and text embedding endpoints in commercial regions, and is available in the Console and through the API, SDK, and CLI.
You can add guardrails when creating or updating an endpoint and select one of the following modes:
- Inform: Runs inference and returns guardrail results in the response for review.
- Block: Rejects requests when violations are detected. Example response: HTTP 400: “Inappropriate content detected!”
Guardrails are enforced in real time as part of endpoint inference. For API examples and setup details, see About OCI Generative AI Guardrails, Creating an endpoint, and Updating an endpoint.
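In Inform mode, the guardrail results come back alongside the inference response for your application to act on. The sketch below shows one way a client might inspect such results; the field names (`guardrailResults`, `contentModeration`, `promptInjection`, `pii`, `score`) are illustrative assumptions, not the documented OCI response schema — consult the API reference linked above for the actual shape.

```python
# Hypothetical sketch: inspecting guardrail results returned by an
# Inform-mode endpoint. All field names are illustrative assumptions.

def flagged_categories(response: dict, threshold: float = 0.5) -> list[str]:
    """Return guardrail categories whose score meets or exceeds the threshold."""
    results = response.get("guardrailResults", {})
    flagged = [
        category
        for category, result in results.items()
        if result.get("score", 0.0) >= threshold
    ]
    return sorted(flagged)

# Example Inform-mode payload (illustrative only).
sample = {
    "text": "...model output...",
    "guardrailResults": {
        "contentModeration": {"score": 0.82},
        "promptInjection": {"score": 0.10},
        "pii": {"score": 0.65},
    },
}

print(flagged_categories(sample))  # ['contentModeration', 'pii']
```

In Block mode no such post-processing is needed: the service itself rejects the request (for example, with an HTTP 400) before a completion is returned.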
Disclaimer
Our Content Moderation (CM) and Prompt Injection (PI) guardrails have been evaluated on a range of multilingual benchmark datasets. However, actual performance may vary depending on the specific languages, domains, data distributions, and usage patterns present in customer-provided data, and because the content is generated by AI, it may contain errors or omissions. Accordingly, guardrail output is intended for informational purposes only, should not be considered professional advice, and OCI makes no guarantee that identical performance characteristics will be observed in all real-world deployments. The OCI Responsible AI team is continuously improving these models.
Our content moderation capabilities have been evaluated against RTPLX, one of the largest publicly available multilingual benchmarking datasets, covering more than 38 languages. However, these results should be interpreted with appropriate caution, as the content is generated by AI and may contain errors or omissions. Multilingual evaluations are inherently bounded by the scope, representativeness, and annotation practices of public datasets, and performance observed on RTPLX may not fully generalize to all real-world contexts, domains, dialects, or usage patterns. Accordingly, the findings are intended for informational purposes only and should not be considered professional advice.