Authoring with AI: A Guide to VIBE Assistants¶
Extension note: Assistants are part of the optional Assistant extension. This guide assumes you have LLM endpoints configured in the system config.yml.
This guide explains how to integrate interactive Large Language Model (LLM) assistants into your VIBE templates. Assistants can help end-users in three primary ways, from providing a suggestion for a single field to driving the entire interview conversationally.
We will cover the three levels of AI integration:
- AI-Assisted Answers: Using `type: assisted` to get AI help for a single question.
- AI-Assisted Sections: Using the `{% assistant %}` block to collaboratively draft a whole section of a document.
- AI-Driven Interviews: Using `interview_mode: assistant` to make the entire interview a conversation.
Level 1: AI-Assisted Answers¶
The simplest way to use AI is to help the user answer a single, specific question. This is perfect for fields that require a bit of creative or technical text that the user might not know how to write offhand.
Use Case: Suggesting a creative project code name.
Feature: The assisted question type.
How to Implement¶
Define a question with type: assisted in your config.yml. You must provide a prompt that tells the AI what to generate. You can even use other variables to give the AI context.
# In config.yml
questions:
project_type:
type: select
label: "What type of project is this?"
options:
- "Internal tool"
- "Client-facing website"
- "Mobile application"
project_codename:
type: assisted
label: "Project Codename"
prompt: |
Suggest a cool, one-word project codename for a new {{ project_type }}.
The codename should be evocative and professional.
Just output the name itself, with no extra text.
How It Works for the User¶
- The user first answers the `project_type` question (e.g., "Mobile application").
- The `project_codename` question appears as a text field with a "✨ Suggest" button next to it.
- When the user clicks the button, the AI receives the prompt ("Suggest a cool... codename for a new Mobile application.") and generates a suggestion (e.g., "Odyssey"), which is then filled into the text field.
- The user can accept, edit, or re-generate the suggestion.
This provides a small, targeted burst of AI assistance without interrupting the standard form-filling flow.
Level 2: AI-Assisted Sections¶
For more complex parts of a document, like a scope of work, a legal clause, or technical requirements, you can use an AI assistant to collaboratively draft an entire section.
Use Case: Drafting the technical requirements section of an RFI.
Feature: The {% assistant %} block.
In the default interview_mode: standard, the assistant lives in its own tab in the UI. The user fills out the main form, and when they are ready, they can switch to the assistant tab to work on that specific section.
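Because standard mode is the default, you don't need to set `interview_mode` at all for this to work; if you prefer to be explicit, setting it should be equivalent (a minimal sketch, assuming the explicit form is accepted):

```yaml
# In config.yml (optional; standard is already the default)
interview_mode: standard
```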
How to Implement¶
This is a three-step process:
1. Configure the Assistant: Define the assistant's identity in your config.yml.
# In config.yml
assistants:
requirements_drafter: # A unique name for your assistant
label: "AI: Tech Requirements"
model: claude-opus # Optional: Choose a specific model
2. Place the Assistant: Use the {% assistant %} block in template.md to define where the AI-generated text will go. The content of the block is the initial prompt for the AI.
# In template.md
## Section 4: Technical Requirements
{% assistant 'requirements_drafter' %}
You are an expert procurement assistant. Your task is to draft a detailed technical requirements section for a Request for Information (RFI).
The service being procured is: **{{ service_name }}**.
Start with a high-level summary, then ask the user clarifying questions to build out the section.
{% endassistant %}
VIBE will ensure the `service_name` question is answered before the assistant can be activated.
3. Provide Predefined "Tools": The assistant can ask its own questions, but you can provide high-quality, structured questions for it to use. The assistant automatically has access to all questions defined in your main questions: block.
# In config.yml
questions:
service_name:
type: text
label: What is the name of the service being procured?
# A "tool" the assistant can use
data_residency:
type: radio
label: "Data Residency Requirement"
options: ["North America", "European Union", "Any"]
How It Works for the User¶
- The user fills out the standard interview form, including the `service_name` question.
- A tab labeled "AI: Tech Requirements" appears in the preview area.
- The user clicks the tab, and the AI begins drafting based on its prompt.
- To get more detail, the AI can ask questions. If it needs to know about data residency, it will find and use the rich `radio` button question you defined.
- The user and AI collaborate on the draft. When finished, the generated text is inserted into the final document.
Level 3: AI-Driven Interviews¶
The final level of integration is to make the assistant the star of the show. For templates where the primary goal is collaborative drafting, you can make the entire interview a conversation.
Use Case: A template for drafting a creative story, where the entire process is a back-and-forth with an AI muse.
Feature: interview_mode: assistant
How to Implement¶
This is a single configuration change in config.yml.
# In config.yml
interview_mode: assistant
assistants:
# You must define exactly one assistant in this mode
story_writer:
label: "AI Storyteller"
questions:
# The assistant will ask these conversationally at the start
story_genre:
type: text
label: "What genre is your story?"
main_character:
type: text
label: "Describe your main character."
How It Works for the User¶
The experience is completely different from a standard VIBE interview:
- The application loads directly into a chat interface. There is no form.
- The assistant starts the conversation by asking the first question it needs for its prompt (`story_genre`).
- The user answers in the chat. The assistant then asks the next question (`main_character`).
- Once all the initial context questions are answered, the assistant begins its primary drafting task, using the answers to inform its work.
- The rest of the interaction (drafting, asking more questions) happens entirely within the chat.
- Even in this mode, you can still have sections of your `template.md` that use traditional Jinja templating with the variables collected during the conversation. The `{% assistant %}` block handles the AI-drafted portion, while the rest of the template is rendered as usual, as sketched below.
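To illustrate that last point, a `template.md` for the story example might mix ordinary Jinja variables with the assistant block like this (a sketch only; the headings and prompt wording are illustrative, not prescribed):

```
# In template.md
## Story Brief

Genre: {{ story_genre }}
Main character: {{ main_character }}

## The Story

{% assistant 'story_writer' %}
You are a creative writing partner. Draft a short {{ story_genre }} story
featuring this main character: {{ main_character }}.
{% endassistant %}
```

The "Story Brief" section is rendered with plain Jinja from the answers collected in chat, while the assistant drafts only the body of the `{% assistant %}` block.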
Customizing Assistant Messages¶
You can customize the messages displayed by the assistant at various stages of the conversation. These options are configured per-assistant in your config.yml.
Message Configuration Options¶
| Option | Description | Default |
|---|---|---|
| `welcome_message` | Displayed when the assistant conversation starts | (none) |
| `review_message` | Displayed when the document is ready for review | "The document is ready for review. Please request any adjustments below." |
| `followup_message` | Label for the follow-up instruction textarea | "Follow-up instruction:" |
| `followup_hint` | Placeholder text in the follow-up textarea | "Eg. \"Rewrite with less bullet lists\"" |
Example Configuration¶
assistants:
requirements_drafter:
label: "AI: Requirements"
endpoint: openai-gpt4o
welcome_message: |
Hello! I'll help you draft the requirements section.
Let me start by asking a few questions.
review_message: |
The draft is ready for your review. You can download it now,
or request changes below.
followup_message: "What changes would you like?"
followup_hint: "E.g., add more detail about security requirements"
When Messages Appear¶
- `welcome_message`: Shown as the first assistant bubble when the conversation begins. Use this to set expectations or provide initial guidance.
- `review_message`: Shown when the assistant calls the `finalize` tool, indicating the document is complete. The download button appears alongside this message.
- `followup_message` and `followup_hint`: Used for the text input that appears after draft modifications, allowing the user to request additional changes.
All message options support multi-line text using YAML's `|` syntax. If not specified, localized defaults are used (available in English, Swedish, French, German, and Spanish).
Developer Tools for Debugging and Analysis¶
When developing and debugging assistant integrations, VIBE provides several tools to help you understand how the LLM is behaving and troubleshoot issues.
Endpoint Selector (Development Mode)¶
In development mode, the interview UI includes an endpoint selector dropdown that lets you switch between different LLM endpoints on the fly. This is useful for:
- Comparing responses from different models (e.g., GPT-4 vs Claude)
- Testing fallback behavior
- Debugging provider-specific issues
The selector appears in the assistant interface and affects all subsequent requests in that session.
Session Replay¶
The `vibe-dev assistant replay` command lets you replay recorded LLM sessions. This is invaluable for:
- Reproducing issues without making new API calls
- Understanding the full request/response cycle
- Creating test fixtures
# Replay all sessions from a log file
vibe-dev assistant replay logs/assistant/llm_20250101.jsonl
# Replay a specific session
vibe-dev assistant replay logs/assistant/llm_20250101.jsonl --session abc-123
# Replay a specific turn in a session
vibe-dev assistant replay logs/assistant/llm_20250101.jsonl --session abc-123 --sequence 2
Analyzing LLM Behavior with filter-log and summarize¶
VIBE logs all assistant interactions in JSONL format. Two complementary tools help you analyze these logs:
filter-log - Filter log entries by any top-level key using substring matching:
# Filter by session ID
vibe-dev assistant filter-log session_id=abc-123
# Only response entries (not requests)
vibe-dev assistant filter-log type=response
# Filter by endpoint/model
vibe-dev assistant filter-log endpoint_name=openai
# Combine multiple filters (AND logic)
vibe-dev assistant filter-log type=response session_id=abc timestamp=2025-11-26T20:43
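These filters match against the top-level keys of each JSONL log entry. Purely as an illustration (the values below are made up, and real entries carry additional fields such as the request or response payload), an entry might look like:

```
{"timestamp": "2025-11-26T20:43:12", "session_id": "abc-123", "type": "request", "endpoint_name": "openai-gpt4o"}
```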
summarize - Display formatted summaries of requests and responses:
# Summarize today's log
vibe-dev assistant summarize
# Summarize a specific log file
vibe-dev assistant summarize logs/assistant/llm_20250101.jsonl
# Filter to a specific session
vibe-dev assistant summarize session_id=abc-123
Pipeline usage - Combine these tools for targeted analysis:
# Summarize only response entries from OpenAI endpoints
vibe-dev assistant filter-log type=response endpoint_name=openai | vibe-dev assistant summarize -
# Analyze all Claude responses from a specific session
vibe-dev assistant filter-log type=response endpoint_name=claude session_id=abc | vibe-dev assistant summarize -
This pipeline approach is particularly useful for:
- Debugging unexpected responses: Filter to specific sessions or timestamps to isolate problematic interactions
- Comparing provider behavior: Filter by endpoint to see how different models respond to the same prompts
- Performance analysis: Examine response times and token usage patterns
- Prompt engineering: Review the actual prompts being sent and how the LLM interprets them
Filtering Logs for Replay¶
The replay command works with log files, not stdin. To replay a filtered subset of interactions, redirect the filter output to a file in the logs directory, then replay that file:
# Filter a specific session to a new log file
vibe-dev assistant filter-log session_id=abc-123 > .vibe_data/logs/assistant/filtered.jsonl
# Replay the filtered log
vibe-dev assistant replay .vibe_data/logs/assistant/filtered.jsonl
This is useful for:
- Isolating problematic sessions: Extract just the failing session for focused debugging
- Creating minimal reproductions: Filter to only the relevant entries before sharing with teammates
- Testing with specific providers: Filter by `endpoint_name` to replay only interactions with a particular model
You can also use filtered log files with the UI replay functionality by placing them in the logs directory.
Log File Location¶
Assistant logs are stored in `.vibe_data/logs/assistant/` with filenames like `llm_YYYYMMDD.jsonl`. If no log file is specified, tools default to today's log.
Reference Documents (File Search Grounding)¶
Reference documents allow assistants to answer questions based on the content of uploaded PDF files. Documents are uploaded to OpenAI vector stores and made available via the file_search tool.
Quick Start¶
Add reference_documents to your assistant configuration in config.yml:
assistants:
my_assistant:
endpoint: openai-gpt4o
reference_documents:
directory: reference_docs # All PDFs in this directory
purpose: [facts] # How to use the documents
Configuration Options¶
Directory Mode¶
Upload all PDFs from a directory:
reference_documents:
directory: reference_docs
purpose: [facts, style]
grounding_instructions: "Use these documents for pricing information."
Explicit Files Mode¶
Specify individual files with custom labels:
reference_documents:
files:
- path: docs/pricing.pdf
label: "Pricing Guide"
purpose: [facts]
- path: docs/style.pdf
label: "Style Reference"
purpose: [style]
Purpose Tags¶
Purpose tags affect how the system prompt instructs the model to use the documents:
| Purpose | Behavior |
|---|---|
| `facts` | Use as primary source for factual questions. Don't hallucinate. |
| `style` | Match tone and structure from documents. |
| `reasoning` | Use for interpretation, summarization, and comparison. |
The default purpose is `facts` if not specified.
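For example, a directory-mode configuration that relies on the default can simply omit the tag:

```yaml
# purpose omitted, so [facts] is assumed
reference_documents:
  directory: reference_docs
```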
Custom Grounding Instructions¶
Add additional instructions for how the model should use the documents:
reference_documents:
directory: reference_docs
grounding_instructions: |
Always cite article numbers when referencing regulations.
Use Swedish terminology from the documents.
Vector Store CLI Commands¶
Manage cached vector stores using the CLI:
# Show cache info
python app.py vector-store info
# List cached vector stores
python app.py vector-store list
python app.py vs list -v # Verbose mode
# List stores for a specific template
python app.py vs list --template my_template
# Clear cache (forces re-upload on next run)
python app.py vector-store clear
python app.py vs clear --template my_template # Clear specific template only
Caching¶
Vector stores are cached to avoid redundant uploads:
- Cache Key: SHA-256 hash of template_id + file contents
- Two-Level Cache: In-memory dict (fast lookups) + persistent JSON files (survives restarts)
- Automatic Invalidation: When file contents change, a new vector store is created
Cache files are stored in .vibe_data/cache/vector_stores/.
Provider Requirements¶
The file_search feature requires:
- OpenAI provider with Responses API enabled
- Model that supports file_search (gpt-4o, gpt-4o-mini, etc.)
Other providers will receive the grounding instructions in the system prompt but won't have access to the file_search tool.
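As a sketch of that fallback (the endpoint name and instruction text below are assumptions, not values from your deployment), pointing an assistant with reference documents at a non-OpenAI endpoint still injects the grounding instructions into the system prompt, but attaches no file_search tool:

```yaml
# In config.yml
assistants:
  my_assistant:
    endpoint: claude-opus   # assumed non-OpenAI endpoint name; file_search is unavailable here
    reference_documents:
      directory: reference_docs
      grounding_instructions: "Cite the source document when quoting figures."
```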