Edit and Test AI Agents Iteratively in Playground

You can use Edit in Playground to edit and test your AI agents and large language model (LLM) nodes. This option enables iterative, real-time testing and refinement of agent instructions and parameters, without deploying changes to the production environment. It bridges the gap between high-level intent and step-by-step execution, so you can safely test, adjust, and fine-tune your agent’s behavior before deploying it to production.

To edit and test AI agents:
  1. Go to AI Agent Studio.
  2. Choose the appropriate tab based on your project:
    • Agent Teams: For collaborative or multi-agent flows
    • Agents: For individual agent configurations
  3. Search for the agent or agent team, and then select Edit to modify its settings.
  4. From the editing area, select Edit in Playground to edit and test your AI agents.
  5. Edit the agent details to adjust the prompt logic and model parameters. To edit and view real-time results in a dual-pane layout, select View Results.

    Prompts Tab

    You can design templates and instructions that guide the agent’s behavior.
    Note: The system prompt and summarization prompts are specific to supervisor-type agents.
    • System Prompt: Edit the system prompt to define your agent’s identity, job description, and rules.
    • Summarization Prompt: Select the summarization mode, and edit the summarization prompts to include only the essential instructions relevant to your use case.
    Note: You can add expressions to fields using Insert Expression. For more information, see Expressions in AI Agent Studio.
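    To make the idea concrete, a system prompt with an inserted expression could render as in the sketch below. The prompt text and the `current_date` variable are hypothetical stand-ins, not values produced by AI Agent Studio.

```python
from string import Template

# Hypothetical system prompt; ${current_date} stands in for an
# expression inserted via Insert Expression.
system_prompt = Template(
    "You are a billing support agent. Answer only billing questions.\n"
    "Today's date is ${current_date}. Escalate refund requests over $$500."
)

# At run time, the playground would substitute the expression's value.
rendered = system_prompt.substitute(current_date="2025-01-15")
print(rendered)
```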

    LLM Tab

    Choose the model for the agent. You can use the default model or select a custom model and adjust its properties as needed.
    • Provider: Choose whether to use the default model or select a custom model. When using a custom model, specify the model properties.
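    As a rough sketch, custom model properties typically cover sampling parameters like the following. The property names and values here are illustrative assumptions, not the exact fields shown on the LLM tab.

```python
# Illustrative custom model properties; actual field names in the LLM tab may differ.
model_config = {
    "provider": "custom",          # vs. "default"
    "model": "example-model-v1",   # hypothetical model name
    "temperature": 0.2,            # lower values give more deterministic output
    "max_tokens": 1024,            # cap on generated tokens per response
    "top_p": 0.9,                  # nucleus sampling threshold
}

# A sanity check before testing: keep sampling parameters in range.
assert 0.0 <= model_config["temperature"] <= 2.0
assert 0.0 < model_config["top_p"] <= 1.0
```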

    Input Tab

    Enter the specific test data, variables, or context required for an agent or LLM node to run. You can simulate real-world triggers and verify that your agent processes incoming information correctly before saving your changes.
    • User Input: Enter the queries or actionable commands that initiate the agent's logic.
    • Evaluation: Use evaluation sets to assess your agent's performance. An evaluation set contains one or more test questions, the expected agent responses, and the metrics to be measured. For more information, see Evaluate Agents.
    • Variables: Configure variables to make them accessible to all nodes within a workflow-type agent.
    • Additional Variables: Add additional variables to the prompt. For example, to add the current system date to your prompt, select the Current Date and Time option using Insert Expression.
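    To illustrate the evaluation-set structure described above, here is a minimal sketch with an exact-match metric. The field names, toy agent, and metric are hypothetical and do not reflect the actual Evaluate Agents format.

```python
# Hypothetical evaluation set: test questions paired with expected responses.
evaluation_set = [
    {"question": "What is my current balance?", "expected": "balance_lookup"},
    {"question": "Cancel my subscription.",     "expected": "cancellation"},
]

def exact_match_accuracy(agent, eval_set):
    """Fraction of questions where the agent's answer matches the expected response."""
    hits = sum(1 for case in eval_set if agent(case["question"]) == case["expected"])
    return hits / len(eval_set)

# A stand-in "agent" that routes questions by keyword, for demonstration only.
def toy_agent(question):
    return "cancellation" if "cancel" in question.lower() else "balance_lookup"

print(exact_match_accuracy(toy_agent, evaluation_set))  # → 1.0
```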

    Output Tab

    Configure the overall structure of the agent’s output using a JSON schema to specify the exact output format.
    • Specification Mode: Select this mode to edit the JSON schema for the output directly.
    • Simple Mode: Select this mode to define the output values and types. The corresponding JSON schema is generated automatically and displayed in Specification Mode for any further changes.
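    For example, a Simple Mode definition with two output values might generate a JSON schema along the lines of the sketch below; Specification Mode would then let you edit that schema directly. The field names are hypothetical.

```python
import json

# Hypothetical output fields defined in Simple Mode: name and type pairs.
simple_fields = {"intent": "string", "confidence": "number"}

# The kind of JSON schema Simple Mode might generate for those fields.
schema = {
    "type": "object",
    "properties": {name: {"type": t} for name, t in simple_fields.items()},
    "required": list(simple_fields),
}
print(json.dumps(schema, indent=2))
```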
  6. Select Save and Test to verify the changes you made and test the agent or the LLM node. While the test is running, you can monitor the results and performance in these sections:
    • Input: Review the messages you entered in the Input Message field on the Input tab. To view the JSON schema, select JSON. To view the final response, select Human.
    • Output: Select a session to open the detailed trace view. The trace shows a step-by-step timeline of the run, including which tools were called, how long each step took, and metrics for each step. Select the color-coded trace lines to view details, such as latency and input or output token usage. To view the final response, select Human.
  7. Select Run History to view, compare, and reuse previous test runs. You can view a detailed record of every run and track your agent’s performance.
    • Select Edit to add a comment in the change log.
    • Select Apply to load a proven configuration into the panel and resume iterative testing from a high-quality baseline. This action only affects the testing environment and won't update your production agent until deployed.
    • Use Expand to view the change log details of the test.
    • You can compare any two runs by selecting them. Select Compare to view a pane displaying the details of each run and highlighting their differences.
    • Select Save and Test to save your changes to the agent, and run the agent with the test configuration.
  8. Select Done.
  9. After testing your changes, publish your agent team.
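The Compare view in step 7 essentially diffs two run records. A rough sketch of that comparison is shown below; the run fields (model, temperature, latency, token counts) are invented for illustration and are not the exact trace fields in Run History.

```python
# Two hypothetical test-run records, as Run History might store them.
run_a = {"model": "default", "temperature": 0.2, "latency_ms": 840, "output_tokens": 210}
run_b = {"model": "custom",  "temperature": 0.7, "latency_ms": 620, "output_tokens": 180}

def diff_runs(a, b):
    """Return the fields whose values differ between two runs, as (a, b) pairs."""
    return {k: (a[k], b[k]) for k in a if a[k] != b[k]}

print(diff_runs(run_a, run_b))
```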