3 Best Practices in using Data Science Agent
Follow these best practices to maximize the benefits of Data Science Agent.
3.1 Recommended Models
Data Science Agent works with large language models (LLMs) accessed through the Oracle DBMS_CLOUD_AI and DBMS_CLOUD_AI_AGENT packages. The DBMS_CLOUD_AI package, with Select AI, translates natural language prompts to generate, run, and explain SQL statements. It also enables retrieval-augmented generation (RAG) and natural language interactions, including chats with LLMs. For more information, see DBMS_CLOUD_AI Package and DBMS_CLOUD_AI_AGENT Package.
Note:
Oracle recommends using GPT-4.1 as the preferred model for all Data Science Agent users for reliability, responsiveness, and cost-effectiveness.

Table 3-1 Recommended models
| Large Language Model | Provider | Scenario | Strengths |
|---|---|---|---|
| GPT-4.1 | OpenAI or OCI GenAI | Use this model for all general-purpose data science tasks. | |
| Grok-4 | OCI GenAI | Use this model for complex, multi-step tasks requiring advanced reasoning. The response time of Data Science Agent is expected to be slow when using Grok-4. | Powerful reasoning capabilities |
Profile Creation for GPT-4.1
To create an AI profile for GPT-4.1 through OpenAI, run the following script in a notebook:
DECLARE
  profile_name VARCHAR2(128) := 'OPENAI_GPT_4_1';
BEGIN
  -- Drop any existing profile with this name; force => TRUE ignores a missing profile
  dbms_cloud_ai.drop_profile(profile_name, TRUE);
  -- Create the profile for GPT-4.1 through OpenAI
  dbms_cloud_ai.create_profile(
    profile_name => profile_name,
    attributes   => '{
      "comments": false,
      "conversation": true,
      "credential_name": "OPENAI_CRED",
      "model": "gpt-4.1",
      "provider": "openai",
      "temperature": 1,
      "max_tokens": 8192
    }'
  );
END;
/
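The script above assumes that a credential named OPENAI_CRED already exists in your schema. A minimal sketch of creating one with DBMS_CLOUD.CREATE_CREDENTIAL might look like the following. The API key value is a hypothetical placeholder, the user name 'OPENAI' is an assumption, and the sketch assumes your database is already permitted to reach the OpenAI endpoint (network access configuration is not shown).

```sql
BEGIN
  -- Store the OpenAI API key as a database credential.
  -- 'OPENAI' is an assumed user name; the password holds the API key.
  DBMS_CLOUD.CREATE_CREDENTIAL(
    credential_name => 'OPENAI_CRED',
    username        => 'OPENAI',
    password        => '<your-openai-api-key>'  -- placeholder, replace with your own key
  );
END;
/
```

The credential name must match the credential_name attribute of the profile.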
To create an AI profile for GPT-4.1 through Oracle Cloud Infrastructure (OCI), run the following script in a notebook:
DECLARE
  profile_name VARCHAR2(128) := 'OCI_GPT_4_1';
BEGIN
  -- Drop any existing profile with this name; force => TRUE ignores a missing profile
  dbms_cloud_ai.drop_profile(profile_name, TRUE);
  -- Create the profile for GPT-4.1 through OCI Generative AI.
  -- Replace <your-dep-id> with the OCID of your compartment.
  dbms_cloud_ai.create_profile(
    profile_name => profile_name,
    attributes   => '{
      "comments": false,
      "conversation": true,
      "credential_name": "OCI_CRED",
      "model": "openai.gpt-4.1",
      "provider": "oci",
      "temperature": 1,
      "max_tokens": 8192,
      "oci_compartment_id": "<your-dep-id>",
      "oci_apiformat": "GENERIC"
    }'
  );
END;
/
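The OCI script above assumes that a credential named OCI_CRED already exists. As a hedged sketch, an OCI API signing key credential can be created with DBMS_CLOUD.CREATE_CREDENTIAL as shown here. All four values are hypothetical placeholders that you would replace with the details of your own OCI user and API key.

```sql
BEGIN
  -- Create a credential based on an OCI API signing key.
  -- Every value in angle brackets is a placeholder, not a real identifier.
  DBMS_CLOUD.CREATE_CREDENTIAL(
    credential_name => 'OCI_CRED',
    user_ocid       => '<your-user-ocid>',
    tenancy_ocid    => '<your-tenancy-ocid>',
    private_key     => '<your-private-key>',
    fingerprint     => '<your-key-fingerprint>'
  );
END;
/
```

The same credential can be reused by every profile that calls the OCI Generative AI service, provided the user has access to the compartment named in oci_compartment_id.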
Profile Creation for Grok 4
To create an AI profile for Grok 4 through Oracle Cloud Infrastructure (OCI), run the following script in a notebook:
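DECLARE
  profile_name VARCHAR2(128) := 'OCI_GROK_4';
BEGIN
  -- Drop any existing profile with this name; force => TRUE ignores a missing profile
  dbms_cloud_ai.drop_profile(profile_name, TRUE);
  -- Create the profile for Grok 4 through OCI Generative AI.
  -- The model is served by OCI, so the provider is "oci".
  -- Replace <your-dep-id> with the OCID of your compartment.
  dbms_cloud_ai.create_profile(
    profile_name => profile_name,
    attributes   => '{
      "comments": false,
      "conversation": true,
      "credential_name": "OCI_CRED",
      "model": "xai.grok-4",
      "provider": "oci",
      "temperature": 1,
      "max_tokens": 8192,
      "oci_compartment_id": "<your-dep-id>",
      "oci_apiformat": "GENERIC"
    }'
  );
END;
/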
The scripts above use the following parameters and attributes:
- profile_name: A name for the AI profile. The profile name must follow the naming rules for Oracle SQL identifiers. The maximum length of the profile name is 125 characters.
- comments: Includes table and column comments in the metadata used for translating natural language prompts using AI. The BOOLEAN data type is supported. For a string with the VARCHAR2 data type, the valid values are TRUE or FALSE. The values are not case sensitive.
- conversation: A VARCHAR2 attribute that indicates whether conversation history is enabled for a profile. Valid values are true or false. The default value is false. The values are not case sensitive.
- credential_name: The name of the credential used to access the AI provider APIs.
- model: The name of the AI model used to generate responses.
- provider: The AI provider for the AI profile. This is a mandatory attribute.
- temperature: Controls the randomness of the model's output. Lower values, for example 0, make the responses more deterministic and focused. Higher values, for example 1, make them more creative and varied. You may want to tune it depending on your use case:
  - Lower values are generally preferred for structured or factual tasks.
  - Higher values can be useful for more open-ended generation.
  Setting temperature to 1 gives the best results.
- max_tokens: The number of tokens to predict per generation, which sets the maximum length of the agent's response. The default value of 1024 tokens may be too low for complex or verbose answers. Setting it to 4096 should be sufficient for most use cases, while 8192 provides extra headroom for longer responses.
  Note: The value can be any number. It does not have to be a power of 2.
- oci_compartment_id: Specifies the OCID of the compartment you are permitted to access when calling the OCI Generative AI service. The compartment ID can contain alphanumeric characters, hyphens, and dots. The default is the compartment ID of the PDB.
- oci_apiformat: Specifies the API format used when calling the OCI Generative AI service. The OCI scripts above set it to GENERIC.
For more information, see Manage AI Profiles.
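After a profile is created, you do not have to drop and re-create it to tune a single attribute. The following sketch changes max_tokens on the OPENAI_GPT_4_1 profile with DBMS_CLOUD_AI.SET_ATTRIBUTE, lists the stored attributes, and sends a short chat prompt to check that the profile responds. It assumes the profile and its credential already exist. The view name and its column names are assumptions to verify against your database version, and the prompt text is only an illustration.

```sql
-- Change one attribute of an existing profile (4096 is an example value)
BEGIN
  DBMS_CLOUD_AI.SET_ATTRIBUTE(
    profile_name    => 'OPENAI_GPT_4_1',
    attribute_name  => 'max_tokens',
    attribute_value => '4096'
  );
END;
/

-- List the attributes currently stored for the profile
-- (assumes the USER_CLOUD_AI_PROFILE_ATTRIBUTES view is available)
SELECT attribute_name, attribute_value
  FROM user_cloud_ai_profile_attributes
 WHERE profile_name = 'OPENAI_GPT_4_1';

-- Send a simple chat prompt through the profile as a connectivity check
SELECT DBMS_CLOUD_AI.GENERATE(
         prompt       => 'Reply with the single word OK.',
         profile_name => 'OPENAI_GPT_4_1',
         action       => 'chat'
       ) AS response
  FROM dual;
```

If the last query returns an error instead of a response, check the credential, the model name, and network access before using the profile with Data Science Agent.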
3.2 Associate database objects with your conversation
Consider associating database objects such as tables, views, and mining models with a Data Science Agent conversation. Once you associate these objects, the agent can inspect, analyze, transform, and model from them directly, which improves the quality of its responses. If you do not associate any object, the agent automatically scans the database for relevant objects based on your query.
Note:
Some operations, such as feature ranking, model search, and training, can be compute-intensive and may take time.
3.3 Ask for clarification
Ask the agent for clarification at any point in the conversation. For example, you can ask:
- What was done in the previous step?
- Why was a particular step necessary?
- What is the next recommended step?
- Explain the <concept>. For example, what is unstructured data in machine learning?
3.4 Ask in multiple iterations
If you are using Data Science Agent for extended workflows, consider asking the agent in multiple iterations. Longer workflows are generally more effective when handled iteratively. For instance, you can start by creating a dataset view, then move to validating assumptions, and finally focus on model training and evaluation.
3.5 Clarify terminology
If you use specific terms in your conversation, it is a good practice to clarify those terms to the agent.
3.6 Follow suggestions provided by Data Science Agent
Follow the agent's suggestions when appropriate. The agent frequently proposes the next steps of a workflow, such as data preparation, analysis, or model training. Accept or refine these suggestions to keep the workflow progressing smoothly.
3.7 Limit your conversation length and scope
Consider starting a new conversation in either of these cases:
- If your conversation has a lot of messages (around 50), or
- If your objective changes
3.8 Provide context to your conversation
The interaction with Data Science Agent is structured as a conversation consisting of alternating turns. A turn begins with your prompt, followed by the agent’s response. A Data Science Agent conversation maintains the context across turns.
Therefore, providing context to your conversation is a good practice, especially if you resume a conversation at a later time.
3.9 Specify a clear objective
Clearly state your objective at the beginning of the conversation. For example, "I want to predict customer churn" or "I want to identify the main causes". Sharing a high-level intent early in the conversation helps guide the rest of the conversation. When the agent understands your objective, it can suggest the most appropriate workflow.