AI Assistant Architecture¶
Navigation guide for VIBE's AI-assisted interview system. Points you to the right source files for understanding implementation details.
See also:

- core.md -- Core VIBE engine
- components.md -- Component system
- frontend.md -- Frontend UI patterns
- llm-providers.md -- LLM provider abstraction (base classes, available providers, replay system)
- review.md -- Document compliance review
1. SYSTEM OVERVIEW¶
The Assistant feature enables LLM-powered document generation through conversational interviews. LLMs use tools to ask questions, collect information, and incrementally build draft documents.
Core Architecture:
- Service-Oriented: Decomposed into focused collaborators (see Section 2)
- Provider Abstraction: Multiple LLM backends via a common interface (see llm-providers.md)
- Streaming-First: Real-time UI updates via Server-Sent Events (SSE)
- Tool-Based: LLM controls the interview via function calls
Interview Modes:
- standard -- Traditional form-first (no assistant)
- assistant -- Assistant drives entire interview
Web Routes (HTTP/SSE) -> AssistantService (facade) -> LLM Providers -> External APIs
                              |               |
                    Service Components   StreamTranslator -> SSE Events -> Browser
                              |
                       Session Storage
2. SERVICE DECOMPOSITION¶
2.1 Service Architecture¶
Location: vibe/assistant/services/
AssistantService (facade - thin coordinator)
|
+-- ConversationStateStore (session read/write, history serialization)
+-- ProviderFactory (endpoint resolution, dev overrides, playback)
+-- DraftManager (block IDs, storage, draft rendering)
+-- TurnOrchestrator (retry logic, auto-reply, tool call validation)
+-- QuestionSessionState (question state tracking across components)
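The facade style above can be sketched in miniature. This is a hypothetical illustration, not the real implementation: class names mirror the components listed, but the methods and wiring are assumptions.

```python
# Sketch of the facade pattern described above: AssistantService holds its
# collaborators and delegates to them, keeping no business logic of its own.
class ConversationStateStore:
    """Owns conversation history (the real version reads/writes the session)."""
    def __init__(self):
        self._history = []

    def append(self, message):
        self._history.append(message)

    def history(self):
        return list(self._history)


class DraftManager:
    """Owns draft blocks (block IDs, storage, rendering)."""
    def __init__(self):
        self.blocks = {}

    def upsert(self, block_id, content):
        self.blocks[block_id] = content


class AssistantService:
    """Thin coordinator: wires collaborators together and delegates."""
    def __init__(self, state_store, draft_manager):
        self.state = state_store
        self.drafts = draft_manager

    def process_user_input(self, text):
        # Delegation only: the facade records the message and returns state.
        self.state.append({"role": "user", "content": text})
        return self.state.history()


service = AssistantService(ConversationStateStore(), DraftManager())
turn = service.process_user_input("Hello")
```

The payoff of this decomposition is testability: each collaborator can be replaced with a stub when exercising the facade.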
2.2 Service Files¶
| Component | Location |
|---|---|
| AssistantService (facade) | vibe/assistant/services/assistant_service.py |
| ConversationStateStore | vibe/assistant/services/conversation_state_store.py |
| ProviderFactory | vibe/assistant/services/provider_factory.py |
| DraftManager | vibe/assistant/services/draft_manager.py |
| TurnOrchestrator | vibe/assistant/services/turn_orchestrator.py |
| QuestionSessionState | vibe/assistant/services/question_session_state.py |
| SessionAccessor | vibe/assistant/services/session_accessor.py |
| StreamTranslator | vibe/assistant/services/stream_translator.py |
| ReferenceDocuments | vibe/assistant/services/reference_documents.py |
| VectorStoreManager | vibe/assistant/services/vector_store_manager.py |
2.3 ModelContextManager Components¶
Location: vibe/assistant/
| Component | Location | Purpose |
|---|---|---|
| HistoryPruner | vibe/assistant/history_pruner.py | Prune history for checkpoint/consolidated strategies |
| UserInputProcessor | vibe/assistant/user_input_processor.py | Security markers, input transformation |
| SystemPromptBuilder | vibe/assistant/system_prompt_builder.py | Build system prompts from templates |
| ModelContextManager | vibe/assistant/context.py | Orchestrates the above components |
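A rough sketch of how these components could compose into a message-building pipeline. Only the component names come from the table above; the method names and behaviors are assumptions for illustration.

```python
# Illustrative pipeline: ModelContextManager runs prompt building, input
# processing, and history pruning in sequence to assemble the model context.
class SystemPromptBuilder:
    def build(self, template_name):
        return f"You are the assistant for '{template_name}'."

class UserInputProcessor:
    def process(self, text):
        # e.g. wrap user text in security markers before it reaches the model
        return f"<user_input>{text}</user_input>"

class HistoryPruner:
    def prune(self, history, max_messages=50):
        # keep only the most recent messages (real strategies are smarter)
        return history[-max_messages:]

class ModelContextManager:
    def __init__(self):
        self.prompts = SystemPromptBuilder()
        self.inputs = UserInputProcessor()
        self.pruner = HistoryPruner()

    def build_messages(self, template_name, history, user_text):
        messages = [{"role": "system", "content": self.prompts.build(template_name)}]
        messages += self.pruner.prune(history)
        messages.append({"role": "user", "content": self.inputs.process(user_text)})
        return messages

ctx = ModelContextManager()
messages = ctx.build_messages("report", [], "My name is Ada")
```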
3. HOW IT WORKS¶
3.1 Initialization¶
Each assistant conversation begins with a system prompt combined with an initial user prompt from {% assistant %}...{% endassistant %} tags in the template. VIBE resolves template variables through probing before the conversation begins. Readiness is tracked via update_assistant_readiness() in vibe/assistant/extension_support.py.
3.2 Conversation Loop¶
User message -> Service stores -> LLM processes history + tools -> Tool calls
      ^                                                                 |
      |                                                                 v
User answers <- Browser renders <- SSE events <- StreamTranslator translates
Key Insight: LLM constructs the interview UI dynamically by calling tools that render widgets.
3.3 Streaming with SSE¶
Location: vibe/web/sse.py (canonical SSE formatting)
Pipeline: LLM Provider -> StreamChunk objects -> StreamTranslator -> SSE events -> Browser
Chunk Types: TEXT, TOOL_CALL_START, TOOL_ARGUMENT_CHUNK, TOOL_ARGUMENTS_COMPLETE, TOOL_CALL_END
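The chunk vocabulary above can be sketched as an enum plus a small carrier type. This shape is an assumption; the real StreamChunk definition lives with the providers and may differ.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional

# The five chunk types listed above, plus a minimal carrier object.
class ChunkType(Enum):
    TEXT = auto()
    TOOL_CALL_START = auto()
    TOOL_ARGUMENT_CHUNK = auto()
    TOOL_ARGUMENTS_COMPLETE = auto()
    TOOL_CALL_END = auto()

@dataclass
class StreamChunk:
    type: ChunkType
    content: str = ""               # text or partial tool arguments
    tool_call_id: Optional[str] = None  # set for tool-call chunks

chunk = StreamChunk(ChunkType.TEXT, content="Hello")
```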
3.4 Session State¶
Everything is stored in the Flask session under session["assistants"][assistant_name]:

- history -- Full conversation (messages + tool calls)
- draft_blocks -- Document sections
- tool_call_ids -- Widget ID to tool call ID mapping
- pending_tool_outputs -- Queued tool results
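The layout above can be sketched as a plain dict. Key names follow the list; the assistant name and values are illustrative only.

```python
# Hypothetical shape of the per-assistant session entry described above.
session = {
    "assistants": {
        "intake_assistant": {
            "history": [],               # full conversation: messages + tool calls
            "draft_blocks": {},          # block_id -> rendered section
            "tool_call_ids": {},         # widget ID <-> tool call ID mapping
            "pending_tool_outputs": [],  # queued tool results
        }
    }
}

state = session["assistants"]["intake_assistant"]
state["draft_blocks"]["b1"] = "## Introduction\n..."
```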
4. MESSAGE TYPE SYSTEM¶
4.1 Message Classes¶
Location: vibe/assistant/structures.py
Standardized message format, converted to/from provider-specific formats at boundaries.
- SystemMessage
- UserMessage (supports auto-reply flag)
- AssistantMessage (with optional tool calls)
- ToolResult
- Tool, ToolParameter (with JSON Schema validation)
- ToolCall
4.2 Message Conversion (Visitor Pattern)¶
Location: vibe/providers/llm/message_converter.py
MessageConverter base class uses @singledispatchmethod for type-based dispatch. Each provider defines its own converter. See llm-providers.md Section 4.
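A minimal sketch of this dispatch style, assuming simple message payloads (the real classes in structures.py carry more fields, and each provider's converter emits its own wire format):

```python
from dataclasses import dataclass
from functools import singledispatchmethod

@dataclass
class UserMessage:
    content: str

@dataclass
class AssistantMessage:
    content: str

class MessageConverter:
    """Type-based dispatch: one convert() entry point, one handler per type."""
    @singledispatchmethod
    def convert(self, message):
        raise TypeError(f"Unsupported message type: {type(message).__name__}")

    @convert.register
    def _(self, message: UserMessage):
        return {"role": "user", "content": message.content}

    @convert.register
    def _(self, message: AssistantMessage):
        return {"role": "assistant", "content": message.content}

converter = MessageConverter()
result = converter.convert(UserMessage("Hi"))
```

Subclassing per provider then means overriding only the handlers whose wire format differs.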
4.3 Serialization¶
message_to_dict() and message_from_dict() for session storage.
5. TOOL SYSTEM¶
Location: Tool definitions in vibe/providers/llm/tools.py, handlers in vibe/assistant/services/stream_translator.py
Available Tools:
- ask_question -- Unified interface for all question types (predefined or generated)
- insert_blocks -- Insert blocks at position (default: append at end)
- upsert_block_contents -- Create or update a single block
- delete_blocks -- Remove blocks
- clear_all_blocks -- Remove all blocks (for full rewrite)
- finalize -- Complete interview
ask_question Modes:
- predefined: Uses template question definitions (requires question_id)
- generated: Creates ad-hoc questions (requires question_text, optional type)
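The two modes imply a per-mode validation rule, which could look roughly like this. The "mode" parameter name and the checker itself are assumptions for illustration; the real schemas live in vibe/providers/llm/tools.py.

```python
# Hypothetical validator for ask_question's two modes described above.
def validate_ask_question(args):
    mode = args.get("mode")
    if mode == "predefined":
        if "question_id" not in args:
            raise ValueError("predefined mode requires question_id")
    elif mode == "generated":
        if "question_text" not in args:
            raise ValueError("generated mode requires question_text")
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return args

ok = validate_ask_question({"mode": "generated", "question_text": "Your name?"})
```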
Tool Call Lifecycle:
1. LLM calls tool -> Provider yields chunks (START, ARGUMENT_CHUNK, ARGUMENTS_COMPLETE, END)
2. StreamTranslator processes complete arguments -> Renders widget via handler
3. Yields SSE form_command -> Browser displays widget
4. User submits -> Service maps widget ID to tool call ID -> Creates ToolResult
5. ToolResult added to conversation history for next LLM turn
Tool Call ID Mapping: Session stores bidirectional mapping between widget IDs and tool call IDs.
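The bidirectional mapping can be sketched as two parallel dicts. This storage layout is an assumption; the real session keeps it under tool_call_ids.

```python
# Sketch of the widget-ID <-> tool-call-ID mapping described above.
class ToolCallIdMap:
    def __init__(self):
        self._by_widget = {}
        self._by_tool_call = {}

    def link(self, widget_id, tool_call_id):
        # Record both directions so either side can be resolved in O(1).
        self._by_widget[widget_id] = tool_call_id
        self._by_tool_call[tool_call_id] = widget_id

    def tool_call_for(self, widget_id):
        return self._by_widget[widget_id]

    def widget_for(self, tool_call_id):
        return self._by_tool_call[tool_call_id]

ids = ToolCallIdMap()
ids.link("q_name_widget", "call_abc123")
```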
5.1 JSON Action Fallback¶
When an endpoint is configured with tools: false, models output JSON actions in code blocks:
The streaming JSON parser extracts these and converts them to standard StreamChunk tool call events, so the rest of the pipeline works identically.
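An illustrative, non-streaming version of that extraction step: pull a JSON object out of a fenced code block and normalize it into a tool-call-shaped dict. The real parser works incrementally on the chunk stream; this regex pass, the "action"/"arguments" field names, and the output shape are assumptions for demonstration.

```python
import json
import re

# Triple backtick, built up so this example stays readable inside docs.
FENCE = "`" * 3

# Match a fenced (optionally ```json) block containing a JSON object.
ACTION_RE = re.compile(FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE, re.DOTALL)

def extract_action(text):
    """Return {'tool': ..., 'arguments': ...} for the first JSON action, else None."""
    match = ACTION_RE.search(text)
    if match is None:
        return None
    action = json.loads(match.group(1))
    return {"tool": action["action"], "arguments": action.get("arguments", {})}

reply = (
    "Let me ask:\n" + FENCE + "json\n"
    '{"action": "ask_question", "arguments": {"question_text": "Your name?"}}\n'
    + FENCE
)
result = extract_action(reply)
```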
6. STREAMING ARCHITECTURE¶
Location: vibe/assistant/services/stream_translator.py
StreamTranslator Responsibilities:
- Process tool calls from chunks
- Render UI widgets (HTML generation)
- Manage streaming state (current block, question count)
- Coordinate streaming targets (chat bubble vs draft blocks)
SSE Event Types:
| Event | Purpose |
|---|---|
| chunk | Stream text content to current target |
| form_command | Render/update question widget |
| form_start | Initial form structure |
| block_command | Draft block operations |
| set_stream_target | Redirect subsequent chunks to element |
| assistant_bubble | Create assistant message bubble |
| thinking | Display model reasoning |
| finalize | Enable finalize button |
| close | End stream |
| error | Display error message |
| assistant_status | Status updates (tft, retrying, adjusting) |
Key Design: Each question generates exactly ONE form_command event to prevent form corruption. set_stream_target events redirect the streaming cursor to different DOM locations (chat or draft blocks).
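On the wire, each of these events is one SSE frame. A minimal formatter, assuming JSON payloads (the canonical helpers live in vibe/web/sse.py; this function is an illustration, not the real API):

```python
import json

def sse_event(event, data):
    """Format one SSE frame: 'event:' line, 'data:' line, blank-line terminator."""
    payload = json.dumps(data)
    return f"event: {event}\ndata: {payload}\n\n"

frame = sse_event("chunk", {"text": "Hello"})
```

The trailing blank line is what tells the browser's EventSource that the frame is complete.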
7. WEB INTEGRATION¶
Location: vibe/assistant/web/assistant.py
7.1 Routes¶
| Route | Method | Purpose |
|---|---|---|
| /assistant/workbench/<template_id>/<assistant_name> | GET | Render initial workbench UI |
| /assistant/turn/<template_id>/<assistant_name> | POST | Process user input, return SSE connector |
| /assistant/stream/<template_id>/<assistant_name>/<turn_id> | GET | Stream LLM response via SSE |
| /assistant/finalize/<template_id>/<assistant_name> | POST | Finalize draft |
| /assistant/dev-settings/<template_id>/<assistant_name> | POST | Update dev settings |
| /assistant/dev/logfiles | GET | List log files |
| /assistant/dev/sessions/ | GET | List recorded sessions |
| /assistant/dev/<endpoint>/settings | GET/POST | Endpoint settings |
Assisted Text Blueprint: vibe/assistant/web/assisted_text.py provides /assisted/generate, /assisted/stream/<stream_id>, /assisted/refine_modal for AI-assisted text fields.
7.2 Turn Processing Flow¶
1. User submits form -> POST to /assistant/turn
2. Handler calls service.process_user_input(form) -> Returns AssistantTurn
3. Renders SSE connector template -> HTMX establishes SSE connection
4. GET to /assistant/stream starts streaming via prepare_streaming_response(turn)
5. Browser receives SSE events and updates UI
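The two-step turn/stream split above can be sketched framework-free. Only process_user_input, prepare_streaming_response, and AssistantTurn are named in the doc; the turn registry, fields, and frame contents here are assumptions.

```python
import itertools
from dataclasses import dataclass

_turn_ids = itertools.count(1)

@dataclass
class AssistantTurn:
    turn_id: str
    user_input: str

class MiniAssistantService:
    """Sketch: POST registers a turn; a later GET streams it as SSE frames."""
    def __init__(self):
        self._turns = {}

    def process_user_input(self, form):
        # Step 2: validate/store input, hand back a turn handle for streaming.
        turn = AssistantTurn(str(next(_turn_ids)), form["message"])
        self._turns[turn.turn_id] = turn
        return turn

    def prepare_streaming_response(self, turn_id):
        # Step 4: generate the SSE frames for this turn.
        turn = self._turns[turn_id]
        yield "event: assistant_bubble\ndata: {}\n\n"
        yield 'event: chunk\ndata: {"text": "Echo: ' + turn.user_input + '"}\n\n'
        yield "event: close\ndata: {}\n\n"

service = MiniAssistantService()
turn = service.process_user_input({"message": "hi"})
frames = list(service.prepare_streaming_response(turn.turn_id))
```

Splitting POST (register) from GET (stream) lets the browser open a plain EventSource-style connection without resubmitting the form.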
8. DATA FLOW DIAGRAMS¶
8.1 User Message -> LLM Response¶
User submits form
-> POST /assistant/turn -> service.process_user_input(form)
-> Return AssistantTurn, render SSE connector
Browser establishes SSE connection
-> GET /assistant/stream -> service.prepare_streaming_response(turn)
-> prepare_provider(turn) via ProviderFactory
-> If interview_mode == "assistant": Wrap with SystemProxyProvider
-> build_messages(turn, provider) via ModelContextManager
-> start_streaming(turn, provider, messages)
-> StreamTranslator translates chunks to SSE events
-> Browser updates UI (chat bubble, widgets, draft blocks)
-> Stream complete -> Save session
8.2 Tool Call Processing¶
LLM calls tool -> Provider yields: START -> ARGUMENT_CHUNK -> ARGUMENTS_COMPLETE -> END
-> StreamTranslator renders widget HTML on ARGUMENTS_COMPLETE
-> Yield form_command SSE event -> Browser injects widget
-> User fills and submits
-> service.process_user_input(form_data) maps field to tool_call_id
-> Creates ToolResult -> Added to history
-> Next LLM turn includes ToolResult
9. CONFIGURATION¶
Location: vibe/assistant/config.py
AssistantConfig: sessions_base_path, assistant_endpoints, default_timeout
AssistantConfigFactory: from_flask_app (production), for_testing(...) (testing with mock dependencies)
Flow: config.yml -> load_configuration() -> Flask app.config -> app.LLM_ENDPOINTS -> AssistantConfig -> AssistantService
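A hedged sketch of the config/factory split: the field and method names come from the doc, but the defaults and factory body are assumptions.

```python
from dataclasses import dataclass

@dataclass
class AssistantConfig:
    sessions_base_path: str
    assistant_endpoints: dict
    default_timeout: float = 60.0  # illustrative default

class AssistantConfigFactory:
    @staticmethod
    def for_testing(**overrides):
        # Test factory: sensible stand-in values, overridable per test.
        base = {"sessions_base_path": "/tmp/sessions", "assistant_endpoints": {}}
        base.update(overrides)
        return AssistantConfig(**base)

config = AssistantConfigFactory.for_testing(default_timeout=5.0)
```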
10. FILE LOCATION INDEX¶
Core Service Layer¶
| What | Where |
|---|---|
| Service facade | vibe/assistant/services/assistant_service.py |
| ConversationStateStore | vibe/assistant/services/conversation_state_store.py |
| ProviderFactory | vibe/assistant/services/provider_factory.py |
| DraftManager | vibe/assistant/services/draft_manager.py |
| TurnOrchestrator | vibe/assistant/services/turn_orchestrator.py |
| QuestionSessionState | vibe/assistant/services/question_session_state.py |
| ReferenceDocuments | vibe/assistant/services/reference_documents.py |
| VectorStoreManager | vibe/assistant/services/vector_store_manager.py |
| Config | vibe/assistant/config.py |
Context & Streaming¶
| What | Where |
|---|---|
| ModelContextManager | vibe/assistant/context.py |
| HistoryPruner | vibe/assistant/history_pruner.py |
| SystemPromptBuilder | vibe/assistant/system_prompt_builder.py |
| StreamTranslator | vibe/assistant/services/stream_translator.py |
| SSE utilities | vibe/web/sse.py |
Messages & Tools¶
| What | Where |
|---|---|
| Message types | vibe/assistant/structures.py |
| Tool definitions | vibe/providers/llm/tools.py |
| Message converter | vibe/providers/llm/message_converter.py |
Web Layer¶
| What | Where |
|---|---|
| Assistant routes | vibe/assistant/web/assistant.py |
| Assisted text routes | vibe/assistant/web/assisted_text.py |
| Extension support | vibe/assistant/extension_support.py |
| Templates | vibe/assistant/templates/assistant/workbench.html |
| Stages | vibe/assistant/assistant_stages.py |
| Logging | vibe/assistant/logging.py |
Document Version: 5.1
Last Updated: 2026-03-22
Notes: Updated file paths for package reorganization (vibe/sse.py -> vibe/web/sse.py); corrected assistant templates path