vibe.llm_providers.base

Shared data structures and base interfaces for LLM providers.

ProviderConfig

Common configuration fields for LLM providers.

This dataclass captures the settings that most LLM providers share. Provider subclasses can access these via self.provider_config and still add their own provider-specific settings as instance attributes.

Attributes:
  • model (str) –

    The model identifier (e.g., "gpt-4", "claude-3-sonnet", "gemini-2.0-flash")

  • temperature (float | None) –

    Controls response randomness (typically 0.0-1.0, some providers allow up to 2.0)

  • max_tokens (int | None) –

    Maximum tokens to generate in the response

  • timeout (int | None) –

    Request timeout in seconds

  • tools_config (bool | None) –

    Explicit tools setting from config (None = use provider capability default)

  • api_key (str | None) –

    API key for authentication (None for providers that don't need it, e.g., local Ollama)

  • base_url (str | None) –

    Custom base URL for API endpoint (for third-party providers or self-hosted)

from_config

from_config(config: dict[str, Any], *, default_model: str = 'unknown') -> ProviderConfig

Create ProviderConfig from a config dict.

Parameters:
  • config (dict[str, Any]) –

    Configuration dictionary (typically from endpoints.yml)

  • default_model (str, default: 'unknown' ) –

    Default model name if not specified in config

Returns:
  • ProviderConfig

    A populated ProviderConfig instance
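
A minimal usage sketch (the config keys shown are assumptions that mirror the attribute names above):

  from vibe.llm_providers.base import ProviderConfig

  config = {
      "model": "gpt-4",
      "temperature": 0.2,
      "max_tokens": 1024,
      "api_key": "sk-...",
  }
  provider_config = ProviderConfig.from_config(config, default_model="gpt-4")
  assert provider_config.model == "gpt-4"
  assert provider_config.timeout is None  # not present in config, so left unset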

ConfigOption

Schema for a configuration option exposed in dev UI.

Uses VIBE core types for consistency:

  • "number" – Numeric input with min/max/step (rendered as slider or input)
  • "bool" – Checkbox
  • "text" – Text input
  • "select" – Dropdown menu
  • "radio" – Radio button group (for small option sets)

Attributes:
  • type (str) –

    One of "number", "bool", "text", "select", "radio"

  • default (Any) –

    Default value when not specified

  • label (str) –

    Human-readable label for UI

  • description (str) –

    Help text shown below the control

  • ui_exposed (bool) –

    Whether to show in dev UI (False for sensitive fields)

  • group (str) –

    UI grouping (e.g., "Basic", "Thinking", "Advanced")

  • min/max/step (float | None) –

    Constraints for "number" type

  • options (list[dict[str, str]] | None) –

    List of {value, label} dicts for "select"/"radio" types

  • requires_capability (str | None) –

    Only show if provider has this capability
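
For illustration, here is how a "number" option might be declared in a provider's CONFIG_SCHEMA (the constructor arguments are assumptions based on the attributes above):

  from vibe.llm_providers.base import ConfigOption

  # Hypothetical "temperature" entry for a provider's CONFIG_SCHEMA.
  temperature_option = ConfigOption(
      type="number",
      default=0.7,
      label="Temperature",
      description="Controls response randomness",
      ui_exposed=True,
      group="Basic",
      min=0.0,
      max=2.0,
      step=0.1,
  )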

CostConfig

Cost configuration for an LLM endpoint.

Pricing is specified per unit tokens, where unit is typically 1,000,000 (i.e., per million tokens) for most providers.

Attributes:
  • input (float) –

    Cost per unit input tokens

  • output (float) –

    Cost per unit output tokens

  • cached_input (float | None) –

    Cost per unit cached input tokens (optional, defaults to input cost)

  • currency (str) –

    Currency code (e.g., "USD", "EUR")

  • unit (int) –

    Number of tokens per pricing unit (e.g., 1000000 for "per million tokens")

from_config

from_config(config: dict[str, Any] | None) -> Optional[CostConfig]

Create CostConfig from endpoint config dict, or None if not configured.
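
A sketch of per-million-token pricing (the dict keys are assumptions that mirror the attributes above):

  from vibe.llm_providers.base import CostConfig

  cost = CostConfig.from_config({
      "input": 3.0,          # 3.00 USD per million input tokens
      "output": 15.0,        # 15.00 USD per million output tokens
      "cached_input": 0.3,
      "currency": "USD",
      "unit": 1_000_000,
  })
  assert CostConfig.from_config(None) is None  # pricing not configured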

UsageStats

Token usage statistics from an LLM API call.

Attributes:
  • input_tokens (int) –

    Number of tokens in the prompt/input

  • output_tokens (int) –

    Number of tokens in the completion/output

  • thinking_tokens (int | None) –

    Number of tokens used for reasoning/thinking (e.g., Claude extended thinking)

  • cached_input_tokens (int | None) –

    Number of input tokens served from cache (subset of input_tokens)

  • time_to_first_token (float | None) –

    Time in seconds from request start to first token received (None if not measured)

  • time_to_last_token (float | None) –

    Time in seconds from request start to last token received (None if not measured)

  • cost (float | None) –

    Calculated cost for this API call (None if cost config not available)

  • currency (str | None) –

    Currency of the cost (e.g., "USD")

total_tokens

total_tokens: int

Total tokens consumed (input + output + thinking if present).

calculate_cost

calculate_cost(cost_config: CostConfig | None) -> None

Calculate and set cost based on token counts and cost configuration.
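
A worked example of the arithmetic, assuming UsageStats takes its token counts as keyword arguments and that cached tokens are billed at the cached rate with the remainder at the input rate:

  stats = UsageStats(input_tokens=10_000, output_tokens=2_000,
                     cached_input_tokens=4_000)
  stats.calculate_cost(cost)  # the CostConfig from the example above

  # (10_000 - 4_000) / 1_000_000 * 3.0  = 0.0180  uncached input
  #        4_000     / 1_000_000 * 0.3  = 0.0012  cached input
  #        2_000     / 1_000_000 * 15.0 = 0.0300  output
  # stats.cost == 0.0492, stats.currency == "USD"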

ToolOutput

Output from tool execution for Responses API submission.

Used to submit tool results back to the LLM in a subsequent request. This is distinct from ToolResult messages, which are part of the conversation history.

ChunkType

Chunk categories emitted during streaming generation.

ConnectionFailureException

Raised when LLM provider fails to establish connection.

EmptyResponseException

Raised when LLM provider returns empty or invalid response.

RetryableServerError

Raised when the LLM provider returns an intermittent server error (502, 503, 504) that should be retried.

RetryConfig

Retry policy configuration for provider calls.

ProviderCapabilities

Strongly-typed provider capabilities for feature gating and UI/logic decisions.

All fields are required to ensure new capabilities are explicitly declared by all providers. Using frozen=True makes this immutable and hashable.

StreamChunk

Represents a streaming chunk from a provider.

LLMProvider

Abstract base class for LLM providers.

Subclasses must implement:

  • convert_messages_to_provider_format()
  • stream_generate()
  • get_usage_stats()

The base class provides:

  • Common configuration via self.provider_config (ProviderConfig)
  • Logging context management
  • Playback from log files for debugging
  • Tool configuration
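
To make the contract concrete, here is a skeletal subclass. It is a toy sketch, not a real provider: the StreamChunk constructor, the ChunkType.TEXT member, and the Message .role/.content attributes are assumptions.

  from typing import Optional

  from vibe.llm_providers.base import (
      ChunkType, LLMProvider, StreamChunk, UsageStats,
  )

  class EchoProvider(LLMProvider):
      """Toy provider that streams the last message's text back verbatim."""

      def convert_messages_to_provider_format(self, messages, tools=None) -> object:
          # Assumed Message attributes: .role and .content
          return [{"role": m.role, "content": m.content} for m in messages]

      def stream_generate(self, messages, sequence_number, *, session_id,
                          assistant_name, endpoint_name, turn_id,
                          previous_response_id=None, tool_outputs=None,
                          unanswered_predefined_questions=None):
          # A real provider would call its API here and yield as chunks arrive.
          yield StreamChunk(type=ChunkType.TEXT, text=messages[-1].content)

      def get_usage_stats(self) -> Optional[UsageStats]:
          return None  # toy provider has no usage to report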

tools

tools: list[Tool]

Get tools configured for this provider by AssistantService.

logging_context

logging_context(*, sequence_number: int | None, session_id: str | None, assistant_name: str | None, endpoint_name: str | None, turn_id: str | None) -> Generator[None, None, None]

Context manager to scope logging metadata for a streaming run.
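
Usage sketch: provider log records emitted inside the block carry this metadata (the values are illustrative):

  with provider.logging_context(
      sequence_number=3,
      session_id="abc123",
      assistant_name="helper",
      endpoint_name="openai-main",
      turn_id="turn-42",
  ):
      payload = provider.convert_messages_to_provider_format(messages)
      # ... issue the request; log entries are tagged with the metadata above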

load_logged_session

load_logged_session(log_path: str | Path, session_id: str | None = None) -> list[dict[str, Any]]

Return chronological request/response segments for a session from a log file.

If session_id is None, loads the first session found in the log file.

iter_recorded_responses

iter_recorded_responses(log_path: str | Path, session_id: str | None = None, *, sequence: int | None = None, turn_id: str | None = None) -> Generator[dict[str, Any], None, None]

Yield recorded response payloads for a session filtered by sequence/turn.

If session_id is None, uses the first session found in the log file.
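
A playback sketch for debugging against a recorded log (the log path is illustrative; the structure of each segment dict is whatever the provider logged):

  segments = provider.load_logged_session("logs/llm.jsonl", session_id="abc123")
  for segment in segments:
      print(segment)  # chronological request/response payloads

  # Replay only the recorded responses for a single turn:
  for payload in provider.iter_recorded_responses(
      "logs/llm.jsonl", "abc123", turn_id="turn-42"
  ):
      print(payload)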

get_unified_system_prompt_template

get_unified_system_prompt_template(template_config: dict[str, Any] | None = None) -> Template

Get the system prompt template from file.

set_tools

set_tools(tools: list[Tool]) -> None

Set tools for this provider. Called by AssistantService.

set_vector_store_ids

set_vector_store_ids(vector_store_ids: list[str]) -> None

Set vector store IDs for file_search tool.

Called by AssistantService when reference_documents are configured. Only meaningful for providers that support file_search capability.

Parameters:
  • vector_store_ids (list[str]) –

    List of OpenAI vector store IDs

convert_messages_to_provider_format

convert_messages_to_provider_format(messages: list[Message], tools: list[Tool] | None = None) -> object

Convert internal messages and tools to provider-specific format.

This method transforms our provider-agnostic message types into the format expected by this specific LLM provider's API.

Parameters:
  • messages (list[Message]) –

    List of internal Message objects (SystemMessage, UserMessage, etc.)

  • tools (list[Tool] | None, default: None ) –

    Optional list of Tool objects available for the LLM to call

Returns:
  • object

    Provider-specific format (structure varies by provider)

Examples:

  • OpenAI: Returns (messages_list, tools_list, tool_outputs_list)
  • Anthropic: Returns (messages_list, tools_list)
  • Others: Returns messages_list with tools embedded

stream_generate

stream_generate(messages: list[Message], sequence_number: int, *, session_id: str, assistant_name: str, endpoint_name: str, turn_id: str, previous_response_id: str | None = None, tool_outputs: list[ToolOutput] | None = None, unanswered_predefined_questions: list[dict[str, Any]] | None = None) -> Generator[StreamChunk, None, None]

Stream LLM responses with tool call support.

Parameters:
  • messages (list[Message]) –

    Conversation history as typed Message objects

  • sequence_number (int) –

    Turn sequence number for conversation ordering

  • session_id (str) –

    Session identifier for logging and correlation

  • assistant_name (str) –

    Assistant display name for logging

  • endpoint_name (str) –

    LLM endpoint identifier for logging and metrics

  • turn_id (str) –

    Unique turn identifier for request/response correlation

  • previous_response_id (str | None, default: None ) –

    Provider-specific response ID for Responses API continuation (optional)

  • tool_outputs (list[ToolOutput] | None, default: None ) –

    Tool execution results to submit with this request (optional)

  • unanswered_predefined_questions (list[dict[str, Any]] | None, default: None ) –

    List of predefined questions (dicts with 'id', 'label', 'type') for injecting into ask_question tool description (optional)

Yields:
  • StreamChunk

    StreamChunk objects containing text deltas or tool call invocations
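
A consumption sketch (the StreamChunk field names and ChunkType members are assumptions):

  for chunk in provider.stream_generate(
      messages,
      sequence_number=1,
      session_id="abc123",
      assistant_name="helper",
      endpoint_name="openai-main",
      turn_id="turn-42",
  ):
      if chunk.type == ChunkType.TEXT:
          print(chunk.text, end="", flush=True)
      elif chunk.type == ChunkType.TOOL_CALL:
          handle_tool_call(chunk)  # hypothetical application handler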

get_usage_stats

get_usage_stats() -> Optional[UsageStats]

Return usage statistics from the last API call.

Returns:
  • Optional[UsageStats]

    UsageStats instance with token counts, or None if no usage stats are available.

get_last_response_id

get_last_response_id() -> str | None

Return provider-specific response id for Responses API continuation.

Only providers that implement a Responses-style continuation mechanism should override this method. Callers must be prepared for NotImplementedError when the provider doesn't support response ids.

get_capabilities

get_capabilities() -> ProviderCapabilities

Return provider capabilities for feature gating and UI/logic decisions.

get_effective_tools_enabled

get_effective_tools_enabled() -> bool

Resolve whether tools should be used for this provider.

Config setting takes precedence over provider capability:

  • If config explicitly sets 'tools: true/false', use that value
  • Otherwise, fall back to the provider's capability default

This allows admins to:

  • Enable tools on providers that default to disabled (if they work)
  • Disable tools on providers that default to enabled (to use JSON fallback)
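
The resolution order can be pictured as follows (a sketch; the capability field name is an assumption):

  def get_effective_tools_enabled(self) -> bool:
      explicit = self.provider_config.tools_config
      if explicit is not None:  # admin set 'tools: true/false' in config
          return explicit
      return self.get_capabilities().tools  # provider capability default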

get_config_schema

get_config_schema() -> dict[str, ConfigOption]

Return the full config schema for this provider.

Merges BASE_CONFIG_SCHEMA with provider-specific CONFIG_SCHEMA. Provider-specific options override base options with the same key.
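
The merge is effectively a dict union with provider keys winning (a sketch):

  def get_config_schema(self) -> dict[str, ConfigOption]:
      return {**BASE_CONFIG_SCHEMA, **self.CONFIG_SCHEMA}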

get_ui_config_schema

get_ui_config_schema() -> dict[str, ConfigOption]

Return config schema filtered for dev UI display.

Filters out:

  • Options with ui_exposed=False
  • Options requiring capabilities this provider doesn't have
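
Equivalent filtering logic as a sketch (reading capabilities via getattr is an assumption):

  def get_ui_config_schema(self) -> dict[str, ConfigOption]:
      capabilities = self.get_capabilities()
      return {
          key: option
          for key, option in self.get_config_schema().items()
          if option.ui_exposed
          and (option.requires_capability is None
               or getattr(capabilities, option.requires_capability, False))
      }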

get_current_config_values

get_current_config_values() -> dict[str, Any]

Return current config values for UI display.

Returns values from self.config, falling back to schema defaults.
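
Equivalent fallback logic as a sketch (assuming self.config holds the raw endpoint dict):

  def get_current_config_values(self) -> dict[str, Any]:
      return {
          key: self.config.get(key, option.default)
          for key, option in self.get_config_schema().items()
      }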