AI Assistant Architecture

Navigation guide for VIBE's AI-assisted interview system. Points you to the right source files for understanding implementation details.

See also:

  • core.md - Core VIBE engine
  • components.md - Component system
  • frontend.md - Frontend UI patterns
  • llm-providers.md - LLM provider abstraction (base classes, available providers, replay system)
  • review.md - Document compliance review

1. SYSTEM OVERVIEW

The Assistant feature enables LLM-powered document generation through conversational interviews. LLMs use tools to ask questions, collect information, and incrementally build draft documents.

Core Architecture:

  • Service-Oriented: Decomposed into focused collaborators (see Section 2)
  • Provider Abstraction: Multiple LLM backends via common interface (see llm-providers.md)
  • Streaming-First: Real-time UI updates via Server-Sent Events (SSE)
  • Tool-Based: LLM controls interview via function calls

Interview Modes:

  1. standard -- Traditional form-first (no assistant)
  2. assistant -- Assistant drives entire interview

Request Flow:

Web Routes (HTTP/SSE) -> AssistantService (facade) -> LLM Providers -> External APIs
                              |                          |
                     Service Components          StreamTranslator -> SSE Events -> Browser
                              |
                      Session Storage

2. SERVICE DECOMPOSITION

2.1 Service Architecture

Location: vibe/assistant/services/

AssistantService (facade - thin coordinator)
    |
    +-- ConversationStateStore (session read/write, history serialization)
    +-- ProviderFactory (endpoint resolution, dev overrides, playback)
    +-- DraftManager (block IDs, storage, draft rendering)
    +-- TurnOrchestrator (retry logic, auto-reply, tool call validation)
    +-- QuestionSessionState (question state tracking across components)
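
As a minimal sketch, the facade-plus-collaborators shape above might look like the following. Only the class names come from this document; the methods, fields, and return values are illustrative assumptions, not the real interfaces:

```python
from dataclasses import dataclass, field

class ConversationStateStore:
    """Stand-in for the session read/write collaborator (real interface differs)."""
    def __init__(self):
        self._history = []

    def append(self, message: dict) -> None:
        self._history.append(message)

    def history(self) -> list:
        return list(self._history)

@dataclass
class AssistantService:
    """Thin facade: delegates each concern to a focused collaborator."""
    state_store: ConversationStateStore = field(default_factory=ConversationStateStore)

    def process_user_input(self, text: str) -> dict:
        # Store the message, then hand off to the orchestrator (elided here).
        self.state_store.append({"role": "user", "content": text})
        return {"turn": len(self.state_store.history())}

service = AssistantService()
turn = service.process_user_input("Hello")
```

The facade stays thin because every non-trivial concern (retries, drafts, provider selection) lives in a collaborator it merely wires together.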

2.2 Service Files

Component                  Location
AssistantService (facade)  vibe/assistant/services/assistant_service.py
ConversationStateStore     vibe/assistant/services/conversation_state_store.py
ProviderFactory            vibe/assistant/services/provider_factory.py
DraftManager               vibe/assistant/services/draft_manager.py
TurnOrchestrator           vibe/assistant/services/turn_orchestrator.py
QuestionSessionState       vibe/assistant/services/question_session_state.py
SessionAccessor            vibe/assistant/services/session_accessor.py
StreamTranslator           vibe/assistant/services/stream_translator.py
ReferenceDocuments         vibe/assistant/services/reference_documents.py
VectorStoreManager         vibe/assistant/services/vector_store_manager.py

2.3 ModelContextManager Components

Location: vibe/assistant/

Component            Location                                 Purpose
HistoryPruner        vibe/assistant/history_pruner.py         Prune history for checkpoint/consolidated strategies
UserInputProcessor   vibe/assistant/user_input_processor.py   Security markers, input transformation
SystemPromptBuilder  vibe/assistant/system_prompt_builder.py  Build system prompts from templates
ModelContextManager  vibe/assistant/context.py                Orchestrates the above components

3. HOW IT WORKS

3.1 Initialization

Each assistant conversation begins with a system prompt combined with an initial user prompt from {% assistant %}...{% endassistant %} tags in the template. VIBE resolves template variables through probing before the conversation begins. Readiness is tracked via update_assistant_readiness() in vibe/assistant/extension_support.py.

3.2 Conversation Loop

User message -> Service stores -> LLM processes history + tools -> Tool calls
    |                                                                   |
User answers <- Browser renders <- SSE events <- StreamTranslator translates

Key Insight: LLM constructs the interview UI dynamically by calling tools that render widgets.

3.3 Streaming with SSE

Location: vibe/web/sse.py (canonical SSE formatting)

Pipeline: LLM Provider -> StreamChunk objects -> StreamTranslator -> SSE events -> Browser

Chunk Types: TEXT, TOOL_CALL_START, TOOL_ARGUMENT_CHUNK, TOOL_ARGUMENTS_COMPLETE, TOOL_CALL_END
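
The chunk types above can be pictured as an enum plus a small record. This is an assumed shape for illustration only, not the actual StreamChunk definition in the codebase:

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional

class ChunkType(Enum):
    """The five chunk types listed above; enum values are illustrative."""
    TEXT = auto()
    TOOL_CALL_START = auto()
    TOOL_ARGUMENT_CHUNK = auto()
    TOOL_ARGUMENTS_COMPLETE = auto()
    TOOL_CALL_END = auto()

@dataclass
class StreamChunk:
    """Assumed shape of a chunk flowing from provider to StreamTranslator."""
    type: ChunkType
    content: str = ""                   # text delta or partial JSON arguments
    tool_call_id: Optional[str] = None  # set only for tool-call chunks

chunk = StreamChunk(ChunkType.TEXT, "Hello")
```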

3.4 Session State

Everything stored in Flask session under session["assistants"][assistant_name]:

  • history -- Full conversation (messages + tool calls)
  • draft_blocks -- Document sections
  • tool_call_ids -- Widget ID to tool call ID mapping
  • pending_tool_outputs -- Queued tool results
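
An illustrative layout of that session entry (the four keys come from the list above; the assistant name and the value shapes are hypothetical):

```python
# Hypothetical per-assistant session entry; "project_brief" and all values
# are invented for illustration. Only the four keys are documented.
session = {
    "assistants": {
        "project_brief": {
            "history": [
                {"role": "user", "content": "We need a mobile app."},
            ],
            "draft_blocks": {"b1": "## Overview\n..."},
            "tool_call_ids": {"widget_7": "call_abc123"},
            "pending_tool_outputs": [],
        }
    }
}

state = session["assistants"]["project_brief"]
```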

4. MESSAGE TYPE SYSTEM

4.1 Message Classes

Location: vibe/assistant/structures.py

Standardized message format, converted to/from provider-specific formats at boundaries.

  • SystemMessage, UserMessage (supports auto-reply flag), AssistantMessage (with optional tool calls), ToolResult
  • Tool, ToolParameter (with JSON Schema validation), ToolCall
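
Simplified stand-ins for a few of these classes, to show the general shape; the real definitions in vibe/assistant/structures.py carry more fields and validation, and the auto-reply flag name here is an assumption:

```python
from dataclasses import dataclass, field

@dataclass
class ToolCall:
    id: str
    name: str
    arguments: dict

@dataclass
class UserMessage:
    content: str
    auto_reply: bool = False  # auto-reply flag; exact field name assumed

@dataclass
class AssistantMessage:
    content: str
    tool_calls: list = field(default_factory=list)

@dataclass
class ToolResult:
    tool_call_id: str
    content: str

msg = AssistantMessage("One question:", [ToolCall("c1", "ask_question", {"mode": "generated"})])
```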

4.2 Message Conversion (Visitor Pattern)

Location: vibe/providers/llm/message_converter.py

MessageConverter base class uses @singledispatchmethod for type-based dispatch. Each provider defines its own converter. See llm-providers.md Section 4.
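
A sketch of that dispatch pattern, using simplified message classes (the real base class and per-provider subclasses differ):

```python
from dataclasses import dataclass
from functools import singledispatchmethod

@dataclass
class UserMessage:
    content: str

@dataclass
class SystemMessage:
    content: str

class MessageConverter:
    """Visitor-style converter: one handler registered per message type."""

    @singledispatchmethod
    def convert(self, message) -> dict:
        raise TypeError(f"Unsupported message type: {type(message).__name__}")

    @convert.register
    def _(self, message: UserMessage) -> dict:
        # A provider subclass would emit its own wire format here.
        return {"role": "user", "content": message.content}

    @convert.register
    def _(self, message: SystemMessage) -> dict:
        return {"role": "system", "content": message.content}

converter = MessageConverter()
converted = converter.convert(UserMessage("hi"))  # {'role': 'user', 'content': 'hi'}
```

Unknown message types fail loudly at the base method, which keeps provider-format drift visible.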

4.3 Serialization

message_to_dict() and message_from_dict() for session storage.
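
A minimal round-trip sketch of what such a pair might do: tag each dict with its type so deserialization can pick the right class. This is an assumption about the mechanism, not the actual implementation:

```python
from dataclasses import asdict, dataclass

@dataclass
class UserMessage:
    content: str
    auto_reply: bool = False

# Hypothetical registry; the real code may dispatch differently.
MESSAGE_TYPES = {"UserMessage": UserMessage}

def message_to_dict(message) -> dict:
    """Serialize a message with a type tag for session storage."""
    return {"type": type(message).__name__, **asdict(message)}

def message_from_dict(data: dict):
    """Rebuild the concrete message class named by the type tag."""
    data = dict(data)
    cls = MESSAGE_TYPES[data.pop("type")]
    return cls(**data)

original = UserMessage("hello", auto_reply=True)
restored = message_from_dict(message_to_dict(original))
```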

5. TOOL SYSTEM

Location: Tool definitions in vibe/providers/llm/tools.py, handlers in vibe/assistant/services/stream_translator.py

Available Tools:

  • ask_question -- Unified interface for all question types (predefined or generated)
  • insert_blocks -- Insert blocks at position (default: append at end)
  • upsert_block_contents -- Create or update single block
  • delete_blocks -- Remove blocks
  • clear_all_blocks -- Remove all blocks (for full rewrite)
  • finalize -- Complete interview

ask_question Modes:

  • predefined: Uses template question definitions (requires question_id)
  • generated: Creates ad-hoc questions (requires question_text, optional type)

Tool Call Lifecycle:

  1. LLM calls tool -> Provider yields chunks (START, ARGUMENT_CHUNK, ARGUMENTS_COMPLETE, END)
  2. StreamTranslator processes complete arguments -> Renders widget via handler
  3. Yields SSE form_command -> Browser displays widget
  4. User submits -> Service maps widget ID to tool call ID -> Creates ToolResult
  5. ToolResult added to conversation history for next LLM turn

Tool Call ID Mapping: Session stores bidirectional mapping between widget IDs and tool call IDs.
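
The bidirectional mapping can be sketched as two mirrored dicts (the storage layout is an assumption; only the mapping itself is described above):

```python
class ToolCallIdMap:
    """Bidirectional widget-ID <-> tool-call-ID lookup, as two mirrored dicts."""

    def __init__(self):
        self._by_widget = {}
        self._by_call = {}

    def register(self, widget_id: str, tool_call_id: str) -> None:
        self._by_widget[widget_id] = tool_call_id
        self._by_call[tool_call_id] = widget_id

    def tool_call_for(self, widget_id: str) -> str:
        # Used when a form submission must become a ToolResult.
        return self._by_widget[widget_id]

    def widget_for(self, tool_call_id: str) -> str:
        return self._by_call[tool_call_id]

ids = ToolCallIdMap()
ids.register("widget_7", "call_abc123")
```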

5.1 JSON Action Fallback

When an endpoint is configured with tools: false, models output JSON actions in code blocks:

{"action": "ASK_QUESTION", "mode": "generated", "question_text": "What is your name?"}

The streaming JSON parser extracts these and converts them to standard StreamChunk tool call events, so the rest of the pipeline works identically.
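
A batch-mode sketch of the extraction step. The real parser works incrementally on streamed chunks; this simplified version just shows how a fenced JSON action could be pulled out and normalized into a tool-call-like dict:

```python
import json
import re

# Three backticks, written as a regex repetition to match a code fence.
FENCE = "`{3}"
ACTION_BLOCK = re.compile(FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE, re.DOTALL)

def extract_actions(text: str) -> list:
    """Find JSON action objects in fenced code blocks and normalize them."""
    actions = []
    for match in ACTION_BLOCK.finditer(text):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # ignore malformed blocks
        if "action" in payload:
            name = payload.pop("action").lower()
            actions.append({"name": name, "arguments": payload})
    return actions

fence = "`" * 3
reply = (
    "Let me ask:\n" + fence + "json\n"
    '{"action": "ASK_QUESTION", "mode": "generated", "question_text": "What is your name?"}\n'
    + fence
)
actions = extract_actions(reply)  # [{'name': 'ask_question', 'arguments': {...}}]
```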

6. STREAMING ARCHITECTURE

Location: vibe/assistant/services/stream_translator.py

StreamTranslator Responsibilities:

  1. Process tool calls from chunks
  2. Render UI widgets (HTML generation)
  3. Manage streaming state (current block, question count)
  4. Coordinate streaming targets (chat bubble vs draft blocks)

SSE Event Types:

Event              Purpose
chunk              Stream text content to current target
form_command       Render/update question widget
form_start         Initial form structure
block_command      Draft block operations
set_stream_target  Redirect subsequent chunks to element
assistant_bubble   Create assistant message bubble
thinking           Display model reasoning
finalize           Enable finalize button
close              End stream
error              Display error message
assistant_status   Status updates (tft, retrying, adjusting)

Key Design: Each question generates exactly ONE form_command event to prevent form corruption. set_stream_target events redirect the streaming cursor to different DOM locations (chat or draft blocks).
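
A minimal SSE-formatting helper in the spirit of vibe/web/sse.py (the actual helpers there may differ; the event names come from the table above):

```python
import json

def sse_event(event: str, data: dict) -> str:
    """Serialize one Server-Sent Event: an event name plus a JSON data line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"

# Redirect the streaming cursor to a draft block, per the design note above.
wire = sse_event("set_stream_target", {"target": "#draft-block-3"})
```

The blank line terminating each event is what lets the browser's EventSource parser delimit messages.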

7. WEB INTEGRATION

Location: vibe/assistant/web/assistant.py

7.1 Routes

Route                                                       Method    Purpose
/assistant/workbench/<template_id>/<assistant_name>         GET       Render initial workbench UI
/assistant/turn/<template_id>/<assistant_name>              POST      Process user input, return SSE connector
/assistant/stream/<template_id>/<assistant_name>/<turn_id>  GET       Stream LLM response via SSE
/assistant/finalize/<template_id>/<assistant_name>          POST      Finalize draft
/assistant/dev-settings/<template_id>/<assistant_name>      POST      Update dev settings
/assistant/dev/logfiles                                     GET       List log files
/assistant/dev/sessions/                                    GET       List recorded sessions
/assistant/dev/<endpoint>/settings                          GET/POST  Endpoint settings

Assisted Text Blueprint: vibe/assistant/web/assisted_text.py provides /assisted/generate, /assisted/stream/<stream_id>, /assisted/refine_modal for AI-assisted text fields.

7.2 Turn Processing Flow

  1. User submits form -> POST to /assistant/turn
  2. Handler calls service.process_user_input(form) -> Returns AssistantTurn
  3. Renders SSE connector template -> HTMX establishes SSE connection
  4. GET to /assistant/stream starts streaming via prepare_streaming_response(turn)
  5. Browser receives SSE events and updates UI
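
The POST-then-GET handoff in steps 1-4 can be sketched as a turn registry: the POST creates a turn token that the later GET uses to start streaming. Class internals and field names here are assumptions; only the method names come from this document:

```python
import uuid
from dataclasses import dataclass

@dataclass
class AssistantTurn:
    """Illustrative turn token; the real AssistantTurn's fields are not documented here."""
    turn_id: str
    user_input: dict

class AssistantServiceSketch:
    def __init__(self):
        self._turns = {}

    def process_user_input(self, form: dict) -> AssistantTurn:
        # Step 2: record the turn so the stream route can find it later.
        turn = AssistantTurn(uuid.uuid4().hex, form)
        self._turns[turn.turn_id] = turn
        return turn

    def prepare_streaming_response(self, turn_id: str) -> AssistantTurn:
        # Step 4: the SSE GET resolves the token back to the pending turn.
        return self._turns[turn_id]

service = AssistantServiceSketch()
turn = service.process_user_input({"answer": "Acme Corp"})
```

Splitting the flow this way lets the POST return immediately with a connector while the long-lived SSE response runs on a separate request.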

8. DATA FLOW DIAGRAMS

8.1 User Message -> LLM Response

User submits form
    -> POST /assistant/turn -> service.process_user_input(form)
    -> Return AssistantTurn, render SSE connector

Browser establishes SSE connection
    -> GET /assistant/stream -> service.prepare_streaming_response(turn)
    -> prepare_provider(turn) via ProviderFactory
       -> If interview_mode == "assistant": Wrap with SystemProxyProvider
    -> build_messages(turn, provider) via ModelContextManager
    -> start_streaming(turn, provider, messages)
    -> StreamTranslator translates chunks to SSE events
    -> Browser updates UI (chat bubble, widgets, draft blocks)
    -> Stream complete -> Save session

8.2 Tool Call Processing

LLM calls tool -> Provider yields: START -> ARGUMENT_CHUNK -> ARGUMENTS_COMPLETE -> END
    -> StreamTranslator renders widget HTML on ARGUMENTS_COMPLETE
    -> Yield form_command SSE event -> Browser injects widget
    -> User fills and submits
    -> service.process_user_input(form_data) maps field to tool_call_id
    -> Creates ToolResult -> Added to history
    -> Next LLM turn includes ToolResult

9. CONFIGURATION

Location: vibe/assistant/config.py

AssistantConfig: sessions_base_path, assistant_endpoints, default_timeout
AssistantConfigFactory: from_flask_app (production), for_testing(...) (testing with mock dependencies)

Flow: config.yml -> load_configuration() -> Flask app.config -> app.LLM_ENDPOINTS -> AssistantConfig -> AssistantService
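
A sketch of the config object and its testing factory, using the three field names listed above; the types, defaults, and factory signature are assumptions:

```python
from dataclasses import dataclass
from pathlib import Path

@dataclass(frozen=True)
class AssistantConfig:
    """The three documented fields; types here are guesses."""
    sessions_base_path: Path
    assistant_endpoints: dict
    default_timeout: float

class AssistantConfigFactory:
    @staticmethod
    def for_testing(**overrides) -> AssistantConfig:
        # Hypothetical defaults, overridable per test.
        defaults = {
            "sessions_base_path": Path("/tmp/vibe-sessions"),
            "assistant_endpoints": {},
            "default_timeout": 60.0,
        }
        defaults.update(overrides)
        return AssistantConfig(**defaults)

config = AssistantConfigFactory.for_testing(default_timeout=5.0)
```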

10. FILE LOCATION INDEX

Core Service Layer

What                    Where
Service facade          vibe/assistant/services/assistant_service.py
ConversationStateStore  vibe/assistant/services/conversation_state_store.py
ProviderFactory         vibe/assistant/services/provider_factory.py
DraftManager            vibe/assistant/services/draft_manager.py
TurnOrchestrator        vibe/assistant/services/turn_orchestrator.py
QuestionSessionState    vibe/assistant/services/question_session_state.py
ReferenceDocuments      vibe/assistant/services/reference_documents.py
VectorStoreManager      vibe/assistant/services/vector_store_manager.py
Config                  vibe/assistant/config.py

Context & Streaming

What                 Where
ModelContextManager  vibe/assistant/context.py
HistoryPruner        vibe/assistant/history_pruner.py
SystemPromptBuilder  vibe/assistant/system_prompt_builder.py
StreamTranslator     vibe/assistant/services/stream_translator.py
SSE utilities        vibe/web/sse.py

Messages & Tools

What               Where
Message types      vibe/assistant/structures.py
Tool definitions   vibe/providers/llm/tools.py
Message converter  vibe/providers/llm/message_converter.py

Web Layer

What                  Where
Assistant routes      vibe/assistant/web/assistant.py
Assisted text routes  vibe/assistant/web/assisted_text.py
Extension support     vibe/assistant/extension_support.py
Templates             vibe/assistant/templates/assistant/workbench.html
Stages                vibe/assistant/assistant_stages.py
Logging               vibe/assistant/logging.py

Document Version: 5.1
Last Updated: 2026-03-22
Notes: Updated file paths for package reorganization (vibe/sse.py -> vibe/web/sse.py); corrected assistant templates path