7.4 Best Practices for Using Data Science Agent

Follow these best practices to maximize the benefits of Data Science Agent.

7.4.1 Recommended Models

Data Science Agent works with large language models accessed through the Oracle DBMS_CLOUD_AI and DBMS_CLOUD_AI_AGENT packages. The DBMS_CLOUD_AI package, with Select AI, translates natural language prompts to generate, run, and explain SQL statements. It also enables retrieval-augmented generation (RAG) and natural language interactions, including chats with LLMs. For more information, see DBMS_CLOUD_AI Package and DBMS_CLOUD_AI_AGENT Package.

The following table lists the recommended large language models and the scenarios in which each should be used.

Note:

Recommended models may change as providers update model availability, latency, pricing, and quality. Check the current list of supported models for your AI provider before creating a profile.

Table 7-1 Recommended Models

Provider              Tier            Large Language Model
OpenAI or OCI GenAI   Top             gpt-5.5
OpenAI or OCI GenAI   Cost-effective  gpt-5.4-mini
OCI GenAI             Top             xai.grok-4.3
OCI GenAI             Cost-effective  xai.grok-4-1-fast-reasoning
Google                Top             gemini-3.5-flash
Google                Cost-effective  gemini-3-flash-preview
Anthropic             Top             claude-opus-4-8
Anthropic             Cost-effective  claude-sonnet-4-6

Note:

When using OCI Generative AI, use the provider and model identifiers exactly as documented for OCI GenAI. Some model identifiers may include the original model family or vendor name.

Explanation of tiers:
  • Top: Represents the best state-of-the-art model from a specific provider. This tier is the strongest option in terms of quality, reliability, and precision.
  • Cost-effective: Represents a good compromise between quality, cost, and speed. These models are typically faster and less expensive, at the cost of lower quality and reliability than the Top tier.

Profile Creation for GPT-5.5

To create an AI profile for GPT-5.5 through OpenAI, run the following script in a notebook:

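DECLARE
    profile_name VARCHAR2(128) := 'OPENAI_GPT_5_5';
BEGIN
    -- Drop any existing profile with this name; TRUE suppresses the
    -- error if the profile does not exist, so the script can be rerun.
    dbms_cloud_ai.drop_profile(profile_name, TRUE);
    -- Create the profile with the OpenAI provider and model settings.
    dbms_cloud_ai.create_profile(
        profile_name => profile_name,
        attributes => '{
            "credential_name": "OPENAI_CRED",
            "model": "gpt-5.5",
            "provider": "openai",
            "temperature": 1,
            "max_tokens": 8192
        }'
    );
END;
/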
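
The script above refers to a credential named OPENAI_CRED, which must exist before the profile can be used. As a minimal sketch, assuming you authenticate to OpenAI with an API key, the credential could be created with DBMS_CLOUD.CREATE_CREDENTIAL. The username value and the key placeholder are illustrative assumptions; check the credential requirements documented for your provider.

```sql
BEGIN
    -- Assumption: for OpenAI the API key is supplied as the password.
    -- The username shown here is illustrative only.
    dbms_cloud.create_credential(
        credential_name => 'OPENAI_CRED',
        username        => 'OPENAI',
        password        => '<your-openai-api-key>'
    );
END;
/
```

OCI GenAI credentials use different authentication details, so this sketch does not apply to the OCI_CRED credential used in the OCI examples.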

To create an AI profile for GPT-5.5 through Oracle Cloud Infrastructure (OCI), run the following script in a notebook:

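DECLARE
    profile_name VARCHAR2(128) := 'OCI_GPT_5_5';
BEGIN
    -- Drop any existing profile with this name; TRUE suppresses the
    -- error if the profile does not exist, so the script can be rerun.
    dbms_cloud_ai.drop_profile(profile_name, TRUE);
    -- Create the profile with the OCI GenAI provider and model settings.
    dbms_cloud_ai.create_profile(
        profile_name => profile_name,
        attributes => '{
            "credential_name": "OCI_CRED",
            "model": "openai.gpt-5.5",
            "provider": "oci",
            "temperature": 1,
            "max_tokens": 8192,
            "oci_compartment_id": "<your-dep-id>"
        }'
    );
END;
/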

Profile Creation for Grok 4.3

To create an AI profile for Grok 4.3 through Oracle Cloud Infrastructure (OCI), run the following script in a notebook:

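DECLARE
    profile_name VARCHAR2(128) := 'OCI_GROK_4_3';
BEGIN
    -- Drop any existing profile with this name; TRUE suppresses the
    -- error if the profile does not exist, so the script can be rerun.
    dbms_cloud_ai.drop_profile(profile_name, TRUE);
    -- Create the profile with the OCI GenAI provider and model settings.
    dbms_cloud_ai.create_profile(
        profile_name => profile_name,
        attributes => '{
            "credential_name": "OCI_CRED",
            "model": "xai.grok-4.3",
            "provider": "oci",
            "temperature": 1,
            "max_tokens": 8192,
            "oci_compartment_id": "<your-dep-id>"
        }'
    );
END;
/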

Parameters:
  • profile_name: Name of the AI profile. It must follow Oracle SQL identifier naming rules.
  • credential_name: Name of the credential used to authenticate with the selected AI provider.
  • model: Model identifier as defined by the selected provider.
  • provider: AI provider for the profile, for example openai, oci, google, or anthropic.
  • temperature: Sampling temperature passed to the model. Recommended value for the Data Science Agent examples: 1.
  • max_tokens: Maximum number of tokens the model generates per response. Recommended value for these examples: 8192.

    Note:

    Data Science Agent profile inspection treats values below 4096 as not recommended.
  • oci_compartment_id: Required for the OCI GenAI examples. Use the target compartment OCID or documented deployment/compartment identifier.
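
If a profile already exists with a lower max_tokens value, you may not need to recreate it. The following sketch lists the max_tokens attribute of your profiles and then raises it on the OPENAI_GPT_5_5 profile from the earlier example. It assumes that the USER_CLOUD_AI_PROFILE_ATTRIBUTES view and the DBMS_CLOUD_AI.SET_ATTRIBUTE procedure are available in your database release; verify both against the DBMS_CLOUD_AI documentation.

```sql
-- List the max_tokens setting for each AI profile owned by the current user.
-- Assumption: the view exposes profile_name, attribute_name, attribute_value.
SELECT profile_name, attribute_name, attribute_value
FROM   user_cloud_ai_profile_attributes
WHERE  LOWER(attribute_name) = 'max_tokens';

-- Set max_tokens to the value recommended for these examples.
BEGIN
    dbms_cloud_ai.set_attribute(
        profile_name    => 'OPENAI_GPT_5_5',
        attribute_name  => 'max_tokens',
        attribute_value => '8192'
    );
END;
/
```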

For more information, see Manage AI Profiles.
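
The same pattern can be adapted to the other providers in Table 7-1. The following sketch creates a profile for the Anthropic Top-tier model and then sends a single chat prompt to confirm that the profile responds. The profile name ANTHROPIC_CLAUDE_OPUS_4_8 and the credential name ANTHROPIC_CRED are hypothetical; the credential is assumed to exist already.

```sql
DECLARE
    -- Hypothetical profile name chosen for this sketch.
    profile_name VARCHAR2(128) := 'ANTHROPIC_CLAUDE_OPUS_4_8';
BEGIN
    -- Drop any existing profile with this name so the script can be rerun.
    dbms_cloud_ai.drop_profile(profile_name, TRUE);
    dbms_cloud_ai.create_profile(
        profile_name => profile_name,
        attributes => '{
            "credential_name": "ANTHROPIC_CRED",
            "model": "claude-opus-4-8",
            "provider": "anthropic",
            "temperature": 1,
            "max_tokens": 8192
        }'
    );
END;
/

-- Smoke test: send one chat prompt through the new profile.
SELECT dbms_cloud_ai.generate(
           prompt       => 'Reply with a one-sentence greeting.',
           profile_name => 'ANTHROPIC_CLAUDE_OPUS_4_8',
           action       => 'chat'
       ) AS response
FROM   dual;
```

If the query returns an error instead of a response, check the credential, the model identifier, and network access to the provider before selecting the profile in Data Science Agent.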

7.4.2 Associate database objects with your conversation

Consider associating database objects such as tables, views, and mining models with a Data Science Agent conversation. Once you associate these objects, the agent can inspect, analyze, transform, and build models from them directly, which improves the quality of its responses. If you do not associate any objects, the agent automatically scans the database for relevant objects based on your query.

Note:

Some operations, such as feature ranking, model search, and training, can be compute-intensive and may take time.

7.4.3 Ask for clarification

You can ask for clarification at any point in your conversation. Some examples:
  • What was done in the previous step?
  • Why was a particular step necessary?
  • What is the next recommended step?
  • Explain <concept>. For example: What is unstructured data in machine learning?

7.4.4 Ask in multiple iterations

If you are using Data Science Agent for extended workflows, consider breaking your requests into multiple iterations. Longer workflows are generally more effective when handled iteratively. For instance, you can start by creating a dataset view, then move to validating assumptions, and finally focus on model training and evaluation.

7.4.5 Clarify terminology

If you use specific terms in your conversation, it is a good practice to clarify those terms to the agent.

7.4.6 Follow suggestions provided by Data Science Agent

Follow the agent's suggestions when appropriate. The agent frequently proposes the next steps of a workflow, such as data preparation, analysis, or model training. Accept or refine these suggestions to keep the workflow progressing smoothly.

7.4.7 Limit your conversation length and scope

Although Data Science Agent can handle extended interactions, very long conversations may accumulate context that negatively affects clarity or performance. For extended work, consider starting a new conversation, especially in these situations:
  • Your conversation has many messages (around 50), or
  • Your objective changes

7.4.8 Provide context to your conversation

The interaction with Data Science Agent is structured as a conversation, consisting of alternating turns. A turn begins with your prompt, followed by the agent’s response. A Data Science Agent conversation maintains the context across turns.

Therefore, providing context to your conversation is a good practice, especially if you resume a conversation at a later time.

7.4.9 Specify a clear objective

Clearly state your objective at the beginning of the conversation. For example, "I want to predict customer churn" or "I want to identify the main causes". Sharing a high-level intent early in the conversation helps guide the rest of the conversation. When the agent understands your objective, it can suggest the most appropriate workflow.