LLMs and Embedders
This page documents the abstract interfaces used to plug LLMs and embedders into Oracle Agent Memory, together with the LiteLLM-backed adapters that implement them.
LLM Interface
class oracleagentmemory.apis.llms.ILlm
Bases: ABC
Abstract interface for LLM invocation.
method generate (abstract)
Generate a response from an LLM synchronously.
- Parameters:
  - prompt (str | Sequence[dict[str, str]]) – Either a plain text prompt (treated as a single user message) or a chat-style list of messages, where each message is a mapping with at least a "content" key and optionally a "role".
  - response_json_schema (dict[str, Any] | None) – Optional JSON Schema describing the expected response format.
  - **kwargs (Any) – Provider-specific keyword arguments forwarded to the underlying backend.
- Returns: Normalized LLM output.
- Return type: LlmResponse
method generate_async (abstract, async)
Asynchronously generate a response from an LLM.
- Parameters:
  - prompt (str | Sequence[dict[str, str]]) – Either a plain text prompt (treated as a single user message) or a chat-style list of messages, where each message is a mapping with at least a "content" key and optionally a "role".
  - response_json_schema (dict[str, Any] | None) – Optional JSON Schema describing the expected response format.
  - **kwargs (Any) – Provider-specific keyword arguments forwarded to the underlying backend.
- Returns: Normalized LLM output.
- Return type: LlmResponse
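A concrete backend must implement both the synchronous and asynchronous variants. The sketch below illustrates the contract with a toy echo backend; it uses local stand-ins for `ILlm` and `LlmResponse` (the real classes live in `oracleagentmemory.apis.llms`) so it runs without the library installed:

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any


@dataclass
class LlmResponse:
    """Local stand-in for oracleagentmemory.apis.llms.LlmResponse."""
    text: str


class ILlm(ABC):
    """Local stand-in for oracleagentmemory.apis.llms.ILlm."""

    @abstractmethod
    def generate(self, prompt, response_json_schema=None, **kwargs: Any) -> LlmResponse: ...

    @abstractmethod
    async def generate_async(self, prompt, response_json_schema=None, **kwargs: Any) -> LlmResponse: ...


class EchoLlm(ILlm):
    """Toy backend that echoes the last user message back as the response."""

    def generate(self, prompt, response_json_schema=None, **kwargs: Any) -> LlmResponse:
        if isinstance(prompt, str):
            # A plain string is treated as a single user message.
            messages = [{"role": "user", "content": prompt}]
        else:
            messages = list(prompt)
        return LlmResponse(text=messages[-1]["content"])

    async def generate_async(self, prompt, response_json_schema=None, **kwargs: Any) -> LlmResponse:
        return self.generate(prompt, response_json_schema, **kwargs)


resp = EchoLlm().generate("ping")
async_resp = asyncio.run(EchoLlm().generate_async([{"role": "user", "content": "pong"}]))
```

Both call paths return the same normalized `LlmResponse`, so callers can switch between sync and async invocation without changing how they read results.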
LLM Responses
class oracleagentmemory.apis.llms.LlmResponse
Bases: object
A small normalized response returned by ILlm.
- Parameters:
  - text (str) – The primary generated text content.
attribute text
The primary generated text content.
- Type: str
Embedder Interface
class oracleagentmemory.apis.IEmbedder
Bases: ABC
Abstract interface for text embedders.
method embed (abstract)
Embed a batch of texts into a 2D float32 NumPy array.
- Parameters:
  - texts (list[str]) – Batch of texts to embed.
  - is_query (bool) – Whether the batch is being embedded for query-time retrieval.
- Returns: A 2D array shaped (len(texts), dim) with dtype=float32.
- Return type: numpy.ndarray
method embed_async (abstract, async)
Embed a batch of texts into a 2D float32 NumPy array.
- Parameters:
  - texts (list[str]) – Batch of texts to embed.
  - is_query (bool) – Whether the batch is being embedded for query-time retrieval.
- Returns: A 2D array shaped (len(texts), dim) with dtype=float32.
- Return type: numpy.ndarray
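The contract in both methods is the same: return an array of shape (len(texts), dim) with dtype float32. A minimal conforming implementation can be sketched with a local stand-in for `IEmbedder` and a deterministic hash-based toy vectorizer (not a real embedding model):

```python
import asyncio
import hashlib
from abc import ABC, abstractmethod

import numpy as np


class IEmbedder(ABC):
    """Local stand-in for oracleagentmemory.apis.IEmbedder."""

    @abstractmethod
    def embed(self, texts: list[str], is_query: bool = False) -> np.ndarray: ...

    @abstractmethod
    async def embed_async(self, texts: list[str], is_query: bool = False) -> np.ndarray: ...


class HashEmbedder(IEmbedder):
    """Toy embedder mapping each text to a deterministic float32 vector."""

    def __init__(self, dim: int = 8) -> None:
        self.dim = dim

    def embed(self, texts: list[str], is_query: bool = False) -> np.ndarray:
        rows = []
        for text in texts:
            digest = hashlib.sha256(text.encode()).digest()
            rows.append([digest[i % len(digest)] / 255.0 for i in range(self.dim)])
        # Shape (len(texts), dim), dtype float32, as the interface requires.
        return np.asarray(rows, dtype=np.float32)

    async def embed_async(self, texts: list[str], is_query: bool = False) -> np.ndarray:
        return self.embed(texts, is_query)


matrix = HashEmbedder(dim=8).embed(["hello", "world"])
```

A real implementation would call an embedding model instead of hashing, but the shape and dtype guarantees are what downstream retrieval code relies on.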
LiteLLM Adapters
class oracleagentmemory.core.llms.Llm
Bases: ILlm
Adapter leveraging litellm to produce chat completions.
Create a LiteLLM-backed LLM adapter.
- Parameters:
  - model (str) – Model identifier passed as model=... to litellm.completion.
  - **default_kwargs (Any) – Default keyword arguments forwarded to LiteLLM for every call (e.g. temperature=0.2, max_tokens=256).
method generate
Generate a response.
- Parameters:
  - prompt (str | Sequence[dict[str, str]]) – Prompt string or chat messages. A string is treated as a single user message.
  - response_json_schema (dict[str, Any] | None) – Optional JSON Schema describing the expected response format. When provided, this method uses the provider-native structured output mechanism via the OpenAI-compatible response_format.
  - **kwargs (Any) – Additional call parameters forwarded to LiteLLM.
- Returns: Normalized LLM output.
- Return type: LlmResponse
method generate_async (async)
Asynchronously generate a response using LiteLLM.
- Parameters:
  - prompt (str | Sequence[dict[str, str]]) – Prompt string or chat messages. A string is treated as a single user message.
  - response_json_schema (dict[str, Any] | None) – Optional JSON Schema describing the expected response format. When provided, this method uses the provider-native structured output mechanism via the OpenAI-compatible response_format.
  - **kwargs (Any) – Additional call parameters forwarded to LiteLLM.
- Returns: Normalized LLM output.
- Return type: LlmResponse
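The structured-output path above wraps the caller's JSON Schema in an OpenAI-compatible response_format payload before the LiteLLM call. A sketch of what that wrapping plausibly looks like (the exact dict shape the adapter builds is not shown in this reference; the "response" name and strict flag below are assumptions):

```python
from typing import Any


def to_response_format(schema: dict[str, Any]) -> dict[str, Any]:
    """Wrap a JSON Schema in an OpenAI-style response_format dict (sketch)."""
    return {
        "type": "json_schema",
        "json_schema": {"name": "response", "schema": schema, "strict": True},
    }


# A string prompt is normalized to a single user message before the call:
prompt = "Summarize the session."
messages = [{"role": "user", "content": prompt}]
fmt = to_response_format({"type": "object", "properties": {"summary": {"type": "string"}}})
```

Passing the schema through `response_json_schema` rather than raw `**kwargs` lets the adapter translate it into whatever structured-output mechanism the underlying provider supports.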
class oracleagentmemory.core.embedders.Embedder
Bases: IEmbedder
LiteLLM-backed embedder.
- Parameters:
  - model (str) – Model identifier supported by LiteLLM.
  - normalize (bool) – Whether to L2-normalize embeddings returned by LiteLLM.
  - api_base (str | None) – Optional base URL for LiteLLM to target custom deployments.
  - api_key (str | None) – Optional API key forwarded to LiteLLM when contacting the provider.
  - embedding_kwargs (dict[str, Any] | None) – Additional keyword arguments forwarded to litellm.embedding().
  - query_prefix (str | None)
Notes
The LiteLLM client is imported only when the embedder is first used,
keeping optional dependency costs low for applications that do not rely
on LiteLLM. Connection details such as api_base and api_key are
merged into the call to litellm.embedding when provided.
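The deferred-import behaviour described in the note can be sketched as a cached lazy import (illustrative only, not the adapter's actual internals; demonstrated here with a stdlib module in place of litellm):

```python
import importlib
from types import ModuleType


class LazyModule:
    """Import a module only on first use, then cache it.

    Sketch of the deferred-import pattern; the embedder applies the same
    idea to litellm so that apps not using LiteLLM never pay the import cost.
    """

    def __init__(self, name: str) -> None:
        self._name = name
        self._module: ModuleType | None = None

    def get(self) -> ModuleType:
        if self._module is None:
            # Import happens here, on first access, not at construction time.
            self._module = importlib.import_module(self._name)
        return self._module


# Demonstrated with "json"; the embedder would use "litellm" here.
lazy_json = LazyModule("json")
loaded = lazy_json.get()
```

Subsequent `get()` calls return the cached module, so the import cost is paid at most once.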
method embed
Embed a batch of texts using LiteLLM.
- Parameters:
  - texts (list[str]) – Batch of raw text strings to embed.
  - is_query (bool) – Whether the text is a query or not; this may be required for specific embedding models.
- Returns: A two-dimensional float32 matrix with the embedding vectors returned by LiteLLM.
- Return type: numpy.ndarray
- Raises: RuntimeError – If the LiteLLM response payload does not include embedding data.
Examples
Simple single-text embedding with a configured LiteLLM embedder:
>>> vector = embedder.embed(["ping"])
>>> vector.shape[0]
1
method embed_async (async)
Embed a batch of texts using LiteLLM.
- Parameters:
  - texts (list[str]) – Batch of raw text strings to embed.
  - is_query (bool) – Whether the text is a query or not; this may be required for specific embedding models.
- Returns: A two-dimensional float32 matrix with the embedding vectors returned by LiteLLM.
- Return type: numpy.ndarray
- Raises: RuntimeError – If the LiteLLM response payload does not include embedding data.
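The `normalize` option on the embedder corresponds to L2-normalizing each row of the returned matrix. A sketch of that post-processing step with NumPy (not the adapter's actual code; the zero-vector guard is an assumption):

```python
import numpy as np


def l2_normalize(matrix: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale each row to unit L2 norm, guarding against zero vectors."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    return (matrix / np.maximum(norms, eps)).astype(np.float32)


vectors = np.array([[3.0, 4.0], [0.0, 2.0]], dtype=np.float32)
unit = l2_normalize(vectors)
```

Unit-norm rows make cosine similarity reducible to a plain dot product, which is why retrieval stacks commonly normalize at embedding time.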