vibe.llm_providers.base

Shared data structures and base interfaces for LLM providers.

ProviderConfig

Common configuration fields for LLM providers.

This dataclass captures the settings that most LLM providers share. Provider subclasses can access these via self.provider_config and still add their own provider-specific settings as instance attributes.

Attributes:
  • model (str) –

    The model identifier (e.g., "gpt-4", "claude-3-sonnet", "gemini-2.0-flash")

  • temperature (float | None) –

    Controls response randomness (typically 0.0-1.0, some providers allow up to 2.0)

  • max_tokens (int | None) –

    Maximum tokens to generate in the response

  • timeout (int | None) –

    Request timeout in seconds

  • tools_config (bool | None) –

    Explicit tools setting from config (None = use provider capability default)

  • api_key (str | None) –

    API key for authentication (None for providers that don't need it, e.g., local Ollama)

  • base_url (str | None) –

    Custom base URL for API endpoint (for third-party providers or self-hosted)

from_config

from_config(config: dict[str, Any], *, default_model: str = 'unknown') -> ProviderConfig

Create ProviderConfig from a config dict.

Parameters:
  • config (dict[str, Any]) –

    Configuration dictionary (typically from endpoints.yml)

  • default_model (str, default: 'unknown' ) –

    Default model name if not specified in config

Returns:
  • ProviderConfig

    A populated ProviderConfig instance
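
A minimal usage sketch (the config keys shown are assumptions that mirror the attribute names above):

  from vibe.llm_providers.base import ProviderConfig

  config = {
      "model": "gpt-4",
      "temperature": 0.2,
      "max_tokens": 1024,
      "api_key": "sk-...",
  }
  provider_config = ProviderConfig.from_config(config, default_model="gpt-4")
  assert provider_config.model == "gpt-4"
  assert provider_config.timeout is None  # not present in config, so left unset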

ConfigOption

Schema for a configuration option exposed in dev UI.

Uses VIBE core types for consistency:

  • "number" – Numeric input with min/max/step (rendered as slider or input)
  • "bool" – Checkbox
  • "text" – Text input
  • "select" – Dropdown menu
  • "radio" – Radio button group (for small option sets)

Attributes:
  • type (str) –

    One of "number", "bool", "text", "select", "radio"

  • default (Any) –

    Default value when not specified

  • label (str) –

    Human-readable label for UI

  • description (str) –

    Help text shown below the control

  • ui_exposed (bool) –

    Whether to show in dev UI (False for sensitive fields)

  • group (str) –

    UI grouping (e.g., "Basic", "Thinking", "Advanced")

  • min/max/step (float | None) –

    Constraints for "number" type

  • options (list[dict[str, str]] | None) –

    List of {value, label} dicts for "select"/"radio" types

  • requires_capability (str | None) –

    Only show if provider has this capability
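
For illustration, here is how a "number" option might be declared in a provider's CONFIG_SCHEMA (the constructor arguments are assumptions based on the attributes above):

  from vibe.llm_providers.base import ConfigOption

  # Hypothetical "temperature" entry for a provider's CONFIG_SCHEMA.
  temperature_option = ConfigOption(
      type="number",
      default=0.7,
      label="Temperature",
      description="Controls response randomness",
      ui_exposed=True,
      group="Basic",
      min=0.0,
      max=2.0,
      step=0.1,
  )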

CostConfig

Cost configuration for an LLM endpoint.

Pricing is specified per unit tokens, where unit is typically 1,000,000 (i.e., per million tokens) for most providers.

Attributes:
  • input (float) –

    Cost per unit input tokens

  • output (float) –

    Cost per unit output tokens

  • cached_input (float | None) –

    Cost per unit cached input tokens (optional, defaults to input cost)

  • currency (str) –

    Currency code (e.g., "USD", "EUR")

  • unit (int) –

    Number of tokens per pricing unit (e.g., 1000000 for "per million tokens")

from_config

from_config(config: dict[str, Any] | None) -> Optional[CostConfig]

Create CostConfig from endpoint config dict, or None if not configured.
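
A sketch of per-million-token pricing (the dict keys are assumptions that mirror the attributes above):

  from vibe.llm_providers.base import CostConfig

  cost = CostConfig.from_config({
      "input": 3.0,          # 3.00 USD per million input tokens
      "output": 15.0,        # 15.00 USD per million output tokens
      "cached_input": 0.3,
      "currency": "USD",
      "unit": 1_000_000,
  })
  assert CostConfig.from_config(None) is None  # pricing not configured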

UsageStats

Token usage statistics from an LLM API call.

Attributes:
  • input_tokens (int) –

    Number of tokens in the prompt/input

  • output_tokens (int) –

    Number of tokens in the completion/output

  • thinking_tokens (int | None) –

    Number of tokens used for reasoning/thinking (e.g., Claude extended thinking)

  • cached_input_tokens (int | None) –

    Number of input tokens served from cache (subset of input_tokens)

  • time_to_first_token (float | None) –

    Time in seconds from request start to first token received (None if not measured)

  • time_to_last_token (float | None) –

    Time in seconds from request start to last token received (None if not measured)

  • cost (float | None) –

    Calculated cost for this API call (None if cost config not available)

  • currency (str | None) –

    Currency of the cost (e.g., "USD")

total_tokens

total_tokens: int

Total tokens consumed (input + output + thinking if present).

calculate_cost

calculate_cost(cost_config: CostConfig | None) -> None

Calculate and set cost based on token counts and cost configuration.
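
A worked example of the arithmetic, assuming UsageStats takes its token counts as keyword arguments and that cached tokens are billed at the cached rate with the remainder at the input rate:

  stats = UsageStats(input_tokens=10_000, output_tokens=2_000,
                     cached_input_tokens=4_000)
  stats.calculate_cost(cost)  # the CostConfig from the example above

  # (10_000 - 4_000) / 1_000_000 * 3.0  = 0.0180  uncached input
  #        4_000     / 1_000_000 * 0.3  = 0.0012  cached input
  #        2_000     / 1_000_000 * 15.0 = 0.0300  output
  # stats.cost == 0.0492, stats.currency == "USD"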

ToolOutput

Output from tool execution for Responses API submission.

Used to submit tool results back to the LLM in a subsequent request. This is distinct from ToolResult messages, which are part of the conversation history.

ChunkType

Chunk categories emitted during streaming generation.

ConnectionFailureException

Raised when LLM provider fails to establish connection.

EmptyResponseException

Raised when LLM provider returns empty or invalid response.

RetryableServerError

Raised when the LLM provider returns an intermittent server error (502, 503, 504) that should be retried.

RetryConfig

Retry policy configuration for provider calls.

ProviderCapabilities

Strongly-typed provider capabilities for feature gating and UI/logic decisions.

All fields are required to ensure new capabilities are explicitly declared by all providers. Using frozen=True makes this immutable and hashable.

StreamChunk

Represents a streaming chunk from a provider.

LLMProvider

Abstract base class for LLM providers.

Subclasses must implement:

  • convert_messages_to_provider_format()
  • stream_generate()
  • get_usage_stats()

The base class provides:

  • Common configuration via self.provider_config (ProviderConfig)
  • Logging context management
  • Playback from log files for debugging
  • Tool configuration
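
To make the contract concrete, here is a skeletal subclass. It is a toy sketch, not a real provider: the StreamChunk constructor, the ChunkType.TEXT member, and the Message .role/.content attributes are assumptions.

  from typing import Optional

  from vibe.llm_providers.base import (
      ChunkType, LLMProvider, StreamChunk, UsageStats,
  )

  class EchoProvider(LLMProvider):
      """Toy provider that streams the last message's text back verbatim."""

      def convert_messages_to_provider_format(self, messages, tools=None) -> object:
          # Assumed Message attributes: .role and .content
          return [{"role": m.role, "content": m.content} for m in messages]

      def stream_generate(self, messages, sequence_number, *, session_id,
                          assistant_name, endpoint_name, turn_id,
                          previous_response_id=None, tool_outputs=None,
                          unanswered_predefined_questions=None):
          # A real provider would call its API here and yield as chunks arrive.
          yield StreamChunk(type=ChunkType.TEXT, text=messages[-1].content)

      def get_usage_stats(self) -> Optional[UsageStats]:
          return None  # toy provider has no usage to report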

tools

tools: list[Tool]

Get tools configured for this provider by AssistantService.

logging_context

logging_context(*, sequence_number: int | None, session_id: str | None, assistant_name: str | None, endpoint_name: str | None, turn_id: str | None) -> Generator[None, None, None]

Context manager to scope logging metadata for a streaming run.
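
Usage sketch: provider log records emitted inside the block carry this metadata (the values are illustrative):

  with provider.logging_context(
      sequence_number=3,
      session_id="abc123",
      assistant_name="helper",
      endpoint_name="openai-main",
      turn_id="turn-42",
  ):
      payload = provider.convert_messages_to_provider_format(messages)
      # ... issue the request; log entries are tagged with the metadata above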

load_logged_session

load_logged_session(log_path: str | Path, session_id: str | None = None) -> list[dict[str, Any]]

Return chronological request/response segments for a session from a log file.

If session_id is None, loads the first session found in the log file.

iter_recorded_responses

iter_recorded_responses(log_path: str | Path, session_id: str | None = None, *, sequence: int | None = None, turn_id: str | None = None) -> Generator[dict[str, Any], None, None]

Yield recorded response payloads for a session filtered by sequence/turn.

If session_id is None, uses the first session found in the log file.
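
A playback sketch for debugging against a recorded log (the log path is illustrative; the structure of each segment dict is whatever the provider logged):

  segments = provider.load_logged_session("logs/llm.jsonl", session_id="abc123")
  for segment in segments:
      print(segment)  # chronological request/response payloads

  # Replay only the recorded responses for a single turn:
  for payload in provider.iter_recorded_responses(
      "logs/llm.jsonl", "abc123", turn_id="turn-42"
  ):
      print(payload)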

get_unified_system_prompt_template

get_unified_system_prompt_template(template_config: dict[str, Any] | None = None) -> Template

Get the system prompt template from file.

set_tools

set_tools(tools: list[Tool]) -> None

Set tools for this provider. Called by AssistantService.

set_vector_store_ids

set_vector_store_ids(vector_store_ids: list[str]) -> None

Set vector store IDs for file_search tool.

Called by AssistantService when reference_documents are configured. Only meaningful for providers that support file_search capability.

Parameters:
  • vector_store_ids (list[str]) –

    List of OpenAI vector store IDs

convert_messages_to_provider_format

convert_messages_to_provider_format(messages: list[Message], tools: list[Tool] | None = None) -> object

Convert internal messages and tools to provider-specific format.

This method transforms our provider-agnostic message types into the format expected by this specific LLM provider's API.

Parameters:
  • messages (list[Message]) –

    List of internal Message objects (SystemMessage, UserMessage, etc.)

  • tools (list[Tool] | None, default: None ) –

    Optional list of Tool objects available for the LLM to call

Returns:
  • object

    Provider-specific format (structure varies by provider)

Examples:

  • OpenAI: Returns (messages_list, tools_list, tool_outputs_list)
  • Anthropic: Returns (messages_list, tools_list)
  • Others: Returns messages_list with tools embedded

stream_generate

stream_generate(messages: list[Message], sequence_number: int, *, session_id: str, assistant_name: str, endpoint_name: str, turn_id: str, previous_response_id: str | None = None, tool_outputs: list[ToolOutput] | None = None, unanswered_predefined_questions: list[dict[str, Any]] | None = None) -> Generator[StreamChunk, None, None]

Stream LLM responses with tool call support.

Parameters:
  • messages (list[Message]) –

    Conversation history as typed Message objects

  • sequence_number (int) –

    Turn sequence number for conversation ordering

  • session_id (str) –

    Session identifier for logging and correlation

  • assistant_name (str) –

    Assistant display name for logging

  • endpoint_name (str) –

    LLM endpoint identifier for logging and metrics

  • turn_id (str) –

    Unique turn identifier for request/response correlation

  • previous_response_id (str | None, default: None ) –

    Provider-specific response ID for Responses API continuation (optional)

  • tool_outputs (list[ToolOutput] | None, default: None ) –

    Tool execution results to submit with this request (optional)

  • unanswered_predefined_questions (list[dict[str, Any]] | None, default: None ) –

    List of predefined questions (dicts with 'id', 'label', 'type') for injecting into ask_question tool description (optional)

Yields:
  • StreamChunk

    StreamChunk objects containing text deltas or tool call invocations
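
A consumption sketch (the StreamChunk field names and ChunkType members are assumptions):

  for chunk in provider.stream_generate(
      messages,
      sequence_number=1,
      session_id="abc123",
      assistant_name="helper",
      endpoint_name="openai-main",
      turn_id="turn-42",
  ):
      if chunk.type == ChunkType.TEXT:
          print(chunk.text, end="", flush=True)
      elif chunk.type == ChunkType.TOOL_CALL:
          handle_tool_call(chunk)  # hypothetical application handler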

get_usage_stats

get_usage_stats() -> Optional[UsageStats]

Return usage statistics from the last API call.

Returns:
  • Optional[UsageStats]

    UsageStats instance with token counts, or None if no usage stats are available.

get_last_response_id

get_last_response_id() -> str | None

Return provider-specific response id for Responses API continuation.

Only providers that implement a Responses-style continuation mechanism should override this method. Callers must be prepared for NotImplementedError when the provider doesn't support response ids.

get_capabilities

get_capabilities() -> ProviderCapabilities

Return provider capabilities for feature gating and UI/logic decisions.

get_effective_tools_enabled

get_effective_tools_enabled() -> bool

Resolve whether tools should be used for this provider.

Config setting takes precedence over provider capability:

  • If config explicitly sets 'tools: true/false', use that value
  • Otherwise, fall back to the provider's capability default

This allows admins to:

  • Enable tools on providers that default to disabled (if they work)
  • Disable tools on providers that default to enabled (to use JSON fallback)
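
The resolution order can be pictured as follows (a sketch; the capability field name is an assumption):

  def get_effective_tools_enabled(self) -> bool:
      explicit = self.provider_config.tools_config
      if explicit is not None:  # admin set 'tools: true/false' in config
          return explicit
      return self.get_capabilities().tools  # provider capability default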

get_config_schema

get_config_schema() -> dict[str, ConfigOption]

Return the full config schema for this provider.

Merges BASE_CONFIG_SCHEMA with provider-specific CONFIG_SCHEMA. Provider-specific options override base options with the same key.
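
The merge is effectively a dict union with provider keys winning (a sketch):

  def get_config_schema(self) -> dict[str, ConfigOption]:
      return {**BASE_CONFIG_SCHEMA, **self.CONFIG_SCHEMA}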

get_ui_config_schema

get_ui_config_schema() -> dict[str, ConfigOption]

Return config schema filtered for dev UI display.

Filters out:

  • Options with ui_exposed=False
  • Options requiring capabilities this provider doesn't have
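
Equivalent filtering logic as a sketch (reading capabilities via getattr is an assumption):

  def get_ui_config_schema(self) -> dict[str, ConfigOption]:
      capabilities = self.get_capabilities()
      return {
          key: option
          for key, option in self.get_config_schema().items()
          if option.ui_exposed
          and (option.requires_capability is None
               or getattr(capabilities, option.requires_capability, False))
      }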

get_current_config_values

get_current_config_values() -> dict[str, Any]

Return current config values for UI display.

Returns values from self.config, falling back to schema defaults.
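
Equivalent fallback logic as a sketch (assuming self.config holds the raw endpoint dict):

  def get_current_config_values(self) -> dict[str, Any]:
      return {
          key: self.config.get(key, option.default)
          for key, option in self.get_config_schema().items()
      }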