Agent Memory

Agent Memory enables agents to retain and use context across interactions. It supports both short-term memory within a conversation and long-term memory across conversations, helping improve continuity, relevance, and efficiency.

Short-Term Memory

Short-term memory is the conversation context carried forward from turn to turn within an ongoing conversation. The Responses API and Conversations API simplify conversation state management, enabling multi-turn interactions.
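The mechanics can be sketched without any particular API: each turn is appended to a message list, and the accumulated history is sent with the next request so the model can resolve references to earlier turns. The `Conversation` class and its methods below are hypothetical, not part of either API.

```python
from dataclasses import dataclass, field

@dataclass
class Conversation:
    """Hypothetical container for short-term memory: the turn-by-turn
    message history carried forward within a single conversation."""
    messages: list = field(default_factory=list)

    def add_user_turn(self, text: str) -> None:
        self.messages.append({"role": "user", "content": text})

    def add_assistant_turn(self, text: str) -> None:
        self.messages.append({"role": "assistant", "content": text})

    def context(self) -> list:
        # The full history is sent with each request so the model can
        # resolve references like "its" in the example below.
        return list(self.messages)

convo = Conversation()
convo.add_user_turn("What's the capital of France?")
convo.add_assistant_turn("Paris.")
convo.add_user_turn("What's its population?")  # "its" resolves via history
print(len(convo.context()))  # 3 messages of context
```

In practice a conversation-state API manages this list server-side, so the client passes a conversation identifier instead of resending the full history itself.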

Long-Term Memory

Long-term memory provides persistent context across conversations. When enabled, the service extracts key information from conversations and stores it so it can be recalled in future interactions within the same project.

Long-term memory is useful for scenarios that require continuity across sessions, such as:

  • Remembering stable user preferences
  • Retaining recurring background context
  • Maintaining continuity across interactions
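The extract-and-recall flow described above can be sketched as a simple store keyed by project, where facts saved in one session are recalled at the start of the next. The `LongTermMemory` class and its method names are illustrative assumptions, not the service's actual interface.

```python
from collections import defaultdict

class LongTermMemory:
    """Hypothetical store for long-term memory: key facts extracted
    from conversations, scoped to a project and recalled in later
    sessions within that same project."""

    def __init__(self):
        self._store = defaultdict(list)  # project_id -> remembered facts

    def remember(self, project_id: str, fact: str) -> None:
        # Extraction step: persist a key piece of information.
        if fact not in self._store[project_id]:
            self._store[project_id].append(fact)

    def recall(self, project_id: str) -> list:
        # Recall step: items can be injected into a new conversation's
        # context to provide continuity across sessions.
        return list(self._store[project_id])

memory = LongTermMemory()
# Session 1: a stable user preference is extracted and stored.
memory.remember("proj-123", "User prefers metric units")
# Session 2, a new conversation: prior facts are recalled.
print(memory.recall("proj-123"))  # ['User prefers metric units']
```

A real service would extract facts with a model rather than receive them explicitly, but the scoping and recall pattern is the same.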

Short-Term Memory Compaction

As conversation history grows, sending the full history can increase token usage and latency. Short-term memory compaction summarizes and compresses earlier conversation history into a smaller, structured representation. This helps preserve key details while reducing the amount of context sent to the model.

This approach:

  • Preserves key information from earlier turns
  • Reduces token usage for long conversations
  • Improves latency by keeping context lightweight
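One common shape for compaction is to replace the oldest turns with a single summary message while keeping the most recent turns verbatim. The sketch below assumes this strategy; the `compact` function, the `keep_recent` threshold, and the placeholder summarizer are all illustrative, and a real system would generate the summary with a model.

```python
def compact(messages, keep_recent=4, summarize=None):
    """Hypothetical compaction: fold older turns into one summary
    message, keeping the most recent `keep_recent` turns verbatim."""
    if len(messages) <= keep_recent:
        return messages  # short histories are sent unchanged
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    if summarize is None:
        # Placeholder summarizer; a real system would call a model here
        # to preserve the key details from the older turns.
        summarize = lambda msgs: f"Summary of {len(msgs)} earlier turns."
    summary = {"role": "system", "content": summarize(older)}
    return [summary] + recent

history = [{"role": "user", "content": f"turn {i}"} for i in range(10)]
compacted = compact(history)
print(len(compacted))  # 5: one summary message plus 4 recent turns
```

The context sent to the model shrinks from ten messages to five, while the summary retains the gist of the six turns it replaced.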