vibe.review.llm

LLM client for document classification.

Provides both mock and real LLM clients for requirement classification. The mock client is used for testing and development; real clients integrate with the Berget API or other OpenAI-compatible endpoints.

Design Note: Separate Implementation from vibe.llm_providers

This module intentionally provides its own LLM clients rather than reusing the providers from vibe.llm_providers (used by Assistant). The rationale:

  1. Module Separation: Per the implementation plan (Section 1.7), Review is designed as a separate package (vibe-review) that depends on Core but avoids deep coupling. Core's llm_providers are tightly integrated with Assistant's conversation model (streaming, tool calling, multi-turn).

  2. Different Use Cases:

     • Assistant: Interactive conversation with streaming responses, tool calling, multi-turn context, and complex message-history management.
     • Review: Single-shot classification with structured JSON output; no streaming, no tools, no conversation history.

  3. Structured Output: Review requires strict JSON schema enforcement via the response_format parameter for deterministic classification results (see the sketch below). This is a feature that Review uses extensively but Assistant does not need in the same way.

  4. Simpler Interface: Review's classify() method is purpose-built for classification tasks: system prompt + user prompt → structured response. It needs none of the complexity of conversation management.

  5. Dependency Footprint: This implementation uses httpx directly rather than the openai SDK, keeping Review's dependencies minimal.

If Review's needs grow closer to Assistant's (e.g., multi-turn classification discussions), consider refactoring to share infrastructure. For now, the ~100 lines of client code are simpler than adapting Core's providers.
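
To make the structured-output point concrete, here is a minimal sketch of what an OpenAI-style structured-output request looks like, assuming the common json_schema form of the response_format parameter. The schema field names are illustrative, mirroring the Y/N and H/M/L conventions used elsewhere in this module:

    # Illustrative JSON schema for a classification result (field names assumed).
    response_schema = {
        "type": "object",
        "properties": {
            "classification": {"type": "string", "enum": ["Y", "N"]},
            "confidence": {"type": "string", "enum": ["H", "M", "L"]},
            "reasoning": {"type": "string"},
        },
        "required": ["classification", "confidence", "reasoning"],
    }

    # Shape of the request body an OpenAI-compatible endpoint expects for
    # strict structured output (the feature Review relies on).
    request_body = {
        "model": "some-model",  # placeholder
        "messages": [
            {"role": "system", "content": "system instructions..."},
            {"role": "user", "content": "classification request..."},
        ],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "classification",
                "schema": response_schema,
                "strict": True,
            },
        },
    }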

See Also:
  • vibe.llm_providers.openai.OpenAIProvider (Core's provider for Assistant)
  • vibe.embedding_providers (shared embedding abstraction)
  • vibe.rerank_providers (shared reranking abstraction)

RelevanceResult

Result of relevance classification.

ComplianceResult

Result of compliance classification.

ClassificationResponse

Structured response from LLM classification.

StructuredClassificationResponse

Structured JSON response from LLM with metadata.

LLMClientConfig

Configuration for LLM client.

BaseLLMClient

Abstract base class for LLM clients.

classify

classify(system_prompt: str, user_prompt: str, response_schema: dict[str, Any]) -> ClassificationResponse

Send classification request to LLM.

Parameters:
  • system_prompt (str) –

    System instructions for the model.

  • user_prompt (str) –

    The actual classification request.

  • response_schema (dict[str, Any]) –

    JSON schema for structured output.

Returns:
  • ClassificationResponse

    The parsed classification result.
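
A minimal usage sketch, assuming a client obtained from create_llm_client (documented at the end of this page) and the illustrative schema above; the ClassificationResponse field names shown are assumptions based on configure_mock_llm's defaults:

    from vibe.review.llm import create_llm_client

    client = create_llm_client("mock")  # or "openai_compatible" with config kwargs
    result = client.classify(
        system_prompt="You classify requirements for relevance.",
        user_prompt="Requirement: The system shall support single sign-on.",
        response_schema=response_schema,  # JSON schema as sketched above
    )
    # Assumed field names, mirroring configure_mock_llm's defaults:
    print(result.classification, result.confidence, result.reasoning)
    client.close()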

classify_structured

classify_structured(system_prompt: str, user_prompt: str, response_schema: dict[str, Any]) -> StructuredClassificationResponse

Send a structured output request to the LLM and return the parsed JSON.

Subclasses that support arbitrary schemas should override this.
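
A sketch with an arbitrary schema; StructuredClassificationResponse carries the parsed JSON plus metadata, whose exact attribute names are not documented here:

    answer_schema = {
        "type": "object",
        "properties": {
            "answer": {"type": "boolean"},
            "reasoning": {"type": "string"},
        },
        "required": ["answer", "reasoning"],
    }
    structured = client.classify_structured(
        system_prompt="Answer yes/no questions about the document.",
        user_prompt="Does the document mention encryption at rest?",
        response_schema=answer_schema,
    )
    # structured is a StructuredClassificationResponse: the JSON payload parsed
    # against answer_schema, plus request metadata.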

close

close() -> None

Close any open connections.

MockLLMClient

Mock LLM client for testing and development.

Features:
  • Deterministic responses based on input hash
  • Configurable default responses
  • Call tracking for test assertions
  • Specific response injection for tests
  • Question-specific response injection for question-answering tests
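
A test-oriented sketch using the injection and assertion helpers documented below. The no-argument constructor and the ClassificationResponse field names are assumptions; creation normally goes through create_llm_client("mock"):

    from vibe.review.llm import MockLLMClient, ClassificationResponse

    mock = MockLLMClient()  # assumed no-arg constructor; see create_llm_client below
    mock.set_response_for_text(
        text_contains="encryption",
        # Assumed field names, mirroring configure_mock_llm's defaults:
        response=ClassificationResponse(
            classification="Y", confidence="H", reasoning="Test fixture."
        ),
    )
    result = mock.classify(
        system_prompt="Classify the requirement.",
        user_prompt="Requirement mentions encryption at rest.",
        response_schema={},
    )
    mock.assert_called()
    mock.assert_call_count(1)
    mock.clear_calls()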

set_question_response

set_question_response(question_id: str, answer: str | bool | list[str] | None, confidence: str = 'M', reasoning: str = 'Mock reasoning', supporting_part_ids: list[str] | None = None, needs_user_input: bool = False) -> None

Set a specific response for a question ID in question answering.

Used for testing AI-suggested question answers.

Parameters:
  • question_id (str) –

    The question ID to match in prompts

  • answer (str | bool | list[str] | None) –

    The answer value (type depends on question type)

  • confidence (str, default: 'M' ) –

    Confidence level (H/M/L)

  • reasoning (str, default: 'Mock reasoning' ) –

    Explanation for the answer

  • supporting_part_ids (list[str] | None, default: None ) –

    Document part IDs that support the answer

  • needs_user_input (bool, default: False ) –

    Whether user input is needed
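
For example, to make the mock suggest a boolean answer for a particular question (the IDs below are placeholders):

    mock.set_question_response(
        question_id="Q-17",              # placeholder; matched against prompts
        answer=True,                     # str, bool, or list[str] depending on question type
        confidence="H",
        reasoning="The document states this explicitly.",
        supporting_part_ids=["part-3"],  # placeholder document part IDs
        needs_user_input=False,
    )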

classify

classify(system_prompt: str, user_prompt: str, response_schema: dict[str, Any]) -> ClassificationResponse

Generate mock classification response.

classify_structured

classify_structured(system_prompt: str, user_prompt: str, response_schema: dict[str, Any]) -> StructuredClassificationResponse

Generate mock structured response for arbitrary schemas.

set_response

set_response(system_prompt: str, user_prompt: str, response: ClassificationResponse) -> None

Set specific response for a prompt combination.

set_response_for_text

set_response_for_text(text_contains: str, response: ClassificationResponse) -> None

Set response for any prompt containing specific text.

clear_calls

clear_calls() -> None

Clear call history.

assert_called

assert_called() -> None

Assert that the client was called at least once.

assert_call_count

assert_call_count(expected: int) -> None

Assert specific number of calls.

close

close() -> None

No-op for mock client.

OpenAICompatibleClient

LLM client for OpenAI-compatible APIs (Berget, vLLM, etc.).

Requires structured output support from the model.
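
Construction normally goes through create_llm_client (documented below). The configuration keys shown here (base_url, api_key, model) are assumed names for LLMClientConfig fields, and the URL is a placeholder:

    from vibe.review.llm import create_llm_client

    client = create_llm_client(
        "openai_compatible",
        base_url="https://llm.example.com/v1",  # placeholder endpoint
        api_key="...",                          # assumed config key
        model="some-model",                     # assumed config key
    )
    try:
        result = client.classify(system_prompt, user_prompt, response_schema)
    finally:
        client.close()  # closes the underlying httpx client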

classify

classify(system_prompt: str, user_prompt: str, response_schema: dict[str, Any]) -> ClassificationResponse

Send classification request to OpenAI-compatible API.

classify_structured

classify_structured(system_prompt: str, user_prompt: str, response_schema: dict[str, Any]) -> StructuredClassificationResponse

Send structured output request to OpenAI-compatible API.

close

close() -> None

Close HTTP client.

enable_prompt_logging

enable_prompt_logging(log_dir: str | Path | None = None) -> Path

Enable logging of LLM prompts to files.

Parameters:
  • log_dir (str | Path | None, default: None ) –

    Directory for log files. If None, uses DataDirectoryManager.

Returns:
  • Path

    Path to the log file being used.
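
A sketch of toggling logging around a batch of classifications, assuming these are module-level functions:

    log_file = enable_prompt_logging()  # None -> directory from DataDirectoryManager
    # ... run classifications; prompts are written to log_file ...
    print(f"Prompts logged to {log_file}")
    disable_prompt_logging()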

disable_prompt_logging

disable_prompt_logging() -> None

Disable prompt logging.

configure_mock_llm

configure_mock_llm(deterministic: bool = True, default_classification: str = 'Y', default_confidence: str = 'M', default_reasoning: str = 'Mock classification reasoning.', responses: dict[str, dict[str, Any]] | None = None) -> None

Configure the mock LLM client behavior.

Parameters:
  • deterministic (bool, default: True ) –

    If True, same input produces same output.

  • default_classification (str, default: 'Y' ) –

    Default classification result.

  • default_confidence (str, default: 'M' ) –

    Default confidence level.

  • default_reasoning (str, default: 'Mock classification reasoning.' ) –

    Default reasoning text.

  • responses (dict[str, dict[str, Any]] | None, default: None ) –

    Dict mapping prompt hashes to specific responses.
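
For example, to make every unmatched mock call return a deterministic negative classification:

    from vibe.review.llm import configure_mock_llm

    configure_mock_llm(
        deterministic=True,
        default_classification="N",
        default_confidence="L",
        default_reasoning="No supporting evidence found.",
    )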

clear_mock_llm_config

clear_mock_llm_config() -> None

Clear mock LLM configuration.

set_mock_response

set_mock_response(prompt_hash: str, response: ClassificationResponse) -> None

Set a specific response for a prompt hash.

create_llm_client

create_llm_client(provider: str = 'mock', **kwargs: object) -> BaseLLMClient

Create an LLM client.

Parameters:
  • provider (str, default: 'mock' ) –

    "mock" or "openai_compatible"

  • **kwargs (object, default: {} ) –

    Configuration options passed to LLMClientConfig

Returns:
  • BaseLLMClient

    The configured client: a MockLLMClient for "mock" or an OpenAICompatibleClient for "openai_compatible".
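
Typical factory usage; keyword arguments are forwarded to LLMClientConfig, so the key names below are assumptions (see the OpenAICompatibleClient sketch above):

    client = create_llm_client()  # mock by default, handy for tests
    client = create_llm_client(
        "openai_compatible",
        base_url="https://llm.example.com/v1",  # assumed config key, placeholder URL
        api_key="...",
        model="some-model",
    )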