Authoring with AI: A Guide to VIBE Assistants

Extension note: Assistants are provided by the optional Assistant extension. This guide applies only if LLM endpoints are configured in the system config.yml.

This guide explains how to integrate interactive Large Language Model (LLM) assistants into your VIBE templates. Assistants can help end-users in three primary ways, from providing a suggestion for a single field to driving the entire interview conversationally.

We will cover the three levels of AI integration:

  1. AI-Assisted Answers: Using type: assisted to get AI help for a single question.
  2. AI-Assisted Sections: Using the {% assistant %} block to collaboratively draft a whole section of a document.
  3. AI-Driven Interviews: Using interview_mode: assistant to make the entire interview a conversation.

Level 1: AI-Assisted Answers

The simplest way to use AI is to help the user answer a single, specific question. This is perfect for fields that require a bit of creative or technical text that the user might not know how to write offhand.

Use Case: Suggesting a creative project code name.

Feature: The assisted question type.

How to Implement

Define a question with type: assisted in your config.yml. You must provide a prompt that tells the AI what to generate. You can even use other variables to give the AI context.

# In config.yml
questions:
  project_type:
    type: select
    label: "What type of project is this?"
    options:
      - "Internal tool"
      - "Client-facing website"
      - "Mobile application"

  project_codename:
    type: assisted
    label: "Project Codename"
    prompt: |
      Suggest a cool, one-word project codename for a new {{ project_type }}.
      The codename should be evocative and professional.
      Just output the name itself, with no extra text.

How It Works for the User

  1. The user first answers the project_type question (e.g., "Mobile application").
  2. The project_codename question appears as a text field with a "✨ Suggest" button next to it.
  3. When the user clicks the button, the AI receives the prompt ("Suggest a cool... codename for a new Mobile application.") and generates a suggestion (e.g., "Odyssey"), which is then filled into the text field.
  4. The user can accept, edit, or re-generate the suggestion.

This provides a small, targeted burst of AI assistance without interrupting the standard form-filling flow.
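Like any other question, the accepted answer can then be referenced in template.md. As an illustrative sketch (the heading and wording here are examples, not part of the feature):

# In template.md
## Project Overview

This brief covers **{{ project_codename }}**, a new {{ project_type }}.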


Level 2: AI-Assisted Sections

For more complex parts of a document, like a scope of work, a legal clause, or technical requirements, you can use an AI assistant to collaboratively draft an entire section.

Use Case: Drafting the technical requirements section of an RFI.

Feature: The {% assistant %} block.

In the default interview_mode: standard, the assistant lives in its own tab in the UI. The user fills out the main form, and when they are ready, they can switch to the assistant tab to work on that specific section.

How to Implement

This is a three-step process:

1. Configure the Assistant: Define the assistant's identity in your config.yml.

# In config.yml
assistants:
  requirements_drafter: # A unique name for your assistant
    label: "AI: Tech Requirements"
    model: claude-opus # Optional: Choose a specific model

2. Place the Assistant: Use the {% assistant %} block in template.md to define where the AI-generated text will go. The content of the block is the initial prompt for the AI.

# In template.md
## Section 4: Technical Requirements

{% assistant 'requirements_drafter' %}
You are an expert procurement assistant. Your task is to draft a detailed technical requirements section for a Request for Information (RFI).

The service being procured is: **{{ service_name }}**.

Start with a high-level summary, then ask the user clarifying questions to build out the section.
{% endassistant %}

VIBE will ensure the service_name question is answered before the assistant can be activated.

3. Provide Predefined "Tools": The assistant can ask its own questions, but you can provide high-quality, structured questions for it to use. The assistant automatically has access to all questions defined in your main questions: block.

# In config.yml
questions:
  service_name:
    type: text
    label: What is the name of the service being procured?

  # A "tool" the assistant can use
  data_residency:
    type: radio
    label: "Data Residency Requirement"
    options: ["North America", "European Union", "Any"]

How It Works for the User

  1. The user fills out the standard interview form, including the service_name question.
  2. A tab labeled "AI: Tech Requirements" appears in the preview area.
  3. The user clicks the tab, and the AI begins drafting based on its prompt.
  4. To get more detail, the AI can ask questions. If it needs to know about data residency, it will find and use the rich radio button question you defined.
  5. The user and AI collaborate on the draft. When finished, the generated text is inserted into the final document.
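Because the "tools" are defined in the regular questions: block, it is reasonable to assume their answers can also be referenced directly in template.md like any other variable. Treat the following as an illustrative sketch under that assumption, not part of the worked example above:

# In template.md (illustrative sketch)
## Section 5: Data Residency

All data for **{{ service_name }}** must be hosted in: {{ data_residency }}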

Level 3: AI-Driven Interviews

The final level of integration is to make the assistant the star of the show. For templates where the primary goal is collaborative drafting, you can make the entire interview a conversation.

Use Case: A template for drafting a creative story, where the entire process is a back-and-forth with an AI muse.

Feature: interview_mode: assistant

How to Implement

This is a single configuration change in config.yml.

# In config.yml
interview_mode: assistant

assistants:
  # You must define exactly one assistant in this mode
  story_writer:
    label: "AI Storyteller"

questions:
  # The assistant will ask these conversationally at the start
  story_genre:
    type: text
    label: "What genre is your story?"
  main_character:
    type: text
    label: "Describe your main character."

How It Works for the User

The experience is completely different from a standard VIBE interview:

  1. The application loads directly into a chat interface. There is no form.
  2. The assistant starts the conversation by asking the first question it needs for its prompt (story_genre).
  3. The user answers in the chat. The assistant then asks the next question (main_character).
  4. Once all the initial context questions are answered, the assistant begins its primary drafting task, using the answers to inform its work.
  5. The rest of the interaction (drafting, asking more questions) happens entirely within the chat.
  6. Even in this mode, you can still have sections of your template.md that use traditional Jinja templating with the variables collected during the conversation. The {% assistant %} block handles the AI-drafted portion, while the rest of the template is rendered as usual.
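For example, a template.md for the story template above might look like the following sketch. The headings and prompt wording are illustrative; only the {% assistant %} syntax and the variable and assistant names come from the configuration shown earlier.

# In template.md
## Story Details

- Genre: {{ story_genre }}
- Main character: {{ main_character }}

## The Story

{% assistant 'story_writer' %}
You are a collaborative storyteller. Draft a short story in the **{{ story_genre }}** genre,
featuring this main character: {{ main_character }}.
{% endassistant %}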

Customizing Assistant Messages

You can customize the messages displayed by the assistant at various stages of the conversation. These options are configured per-assistant in your config.yml.

Message Configuration Options

Option           | Description                                        | Default
welcome_message  | Displayed when the assistant conversation starts   | (none)
review_message   | Displayed when the document is ready for review    | "The document is ready for review. Please request any adjustments below."
followup_message | Label for the follow-up instruction textarea       | "Follow-up instruction:"
followup_hint    | Placeholder text in the follow-up textarea         | "Eg. \"Rewrite with less bullet lists\""

Example Configuration

assistants:
  requirements_drafter:
    label: "AI: Requirements"
    endpoint: openai-gpt4o
    welcome_message: |
      Hello! I'll help you draft the requirements section.
      Let me start by asking a few questions.
    review_message: |
      The draft is ready for your review. You can download it now,
      or request changes below.
    followup_message: "What changes would you like?"
    followup_hint: "E.g., add more detail about security requirements"

When Messages Appear

  1. welcome_message: Shown as the first assistant bubble when the conversation begins. Use this to set expectations or provide initial guidance.

  2. review_message: Shown when the assistant calls the finalize tool, indicating the document is complete. The download button appears alongside this message.

  3. followup_message and followup_hint: Used for the text input that appears after draft modifications, allowing the user to request additional changes.

All message options support multi-line text using YAML's | syntax. If not specified, localized defaults are used (available in English, Swedish, French, German, and Spanish).


Developer Tools for Debugging and Analysis

When developing and debugging assistant integrations, VIBE provides several tools to help you understand how the LLM is behaving and troubleshoot issues.

Endpoint Selector (Development Mode)

In development mode, the interview UI includes an endpoint selector dropdown that lets you switch between different LLM endpoints on the fly. This is useful for:

  • Comparing responses from different models (e.g., GPT-4 vs Claude)
  • Testing fallback behavior
  • Debugging provider-specific issues

The selector appears in the assistant interface and affects all subsequent requests in that session.

Session Replay

The vibe-dev assistant replay command lets you replay recorded LLM sessions. This is invaluable for:

  • Reproducing issues without making new API calls
  • Understanding the full request/response cycle
  • Creating test fixtures

# Replay all sessions from a log file
vibe-dev assistant replay logs/assistant/llm_20250101.jsonl

# Replay a specific session
vibe-dev assistant replay logs/assistant/llm_20250101.jsonl --session abc-123

# Replay a specific turn in a session
vibe-dev assistant replay logs/assistant/llm_20250101.jsonl --session abc-123 --sequence 2

Analyzing LLM Behavior with filter-log and summarize

VIBE logs all assistant interactions in JSONL format. Two complementary tools help you analyze these logs:

filter-log - Filter log entries by any top-level key using substring matching:

# Filter by session ID
vibe-dev assistant filter-log session_id=abc-123

# Only response entries (not requests)
vibe-dev assistant filter-log type=response

# Filter by endpoint/model
vibe-dev assistant filter-log endpoint_name=openai

# Combine multiple filters (AND logic)
vibe-dev assistant filter-log type=response session_id=abc timestamp=2025-11-26T20:43

summarize - Display formatted summaries of requests and responses:

# Summarize today's log
vibe-dev assistant summarize

# Summarize a specific log file
vibe-dev assistant summarize logs/assistant/llm_20250101.jsonl

# Filter to a specific session
vibe-dev assistant summarize session_id=abc-123

Pipeline usage - Combine these tools for targeted analysis:

# Summarize only response entries from OpenAI endpoints
vibe-dev assistant filter-log type=response endpoint_name=openai | vibe-dev assistant summarize -

# Analyze all Claude responses from a specific session
vibe-dev assistant filter-log type=response endpoint_name=claude session_id=abc | vibe-dev assistant summarize -

This pipeline approach is particularly useful for:

  • Debugging unexpected responses: Filter to specific sessions or timestamps to isolate problematic interactions
  • Comparing provider behavior: Filter by endpoint to see how different models respond to the same prompts
  • Performance analysis: Examine response times and token usage patterns
  • Prompt engineering: Review the actual prompts being sent and how the LLM interprets them

Filtering Logs for Replay

The replay command works with log files, not stdin. To replay a filtered subset of interactions, redirect the filter output to a file in the logs directory, then replay that file:

# Filter a specific session to a new log file
vibe-dev assistant filter-log session_id=abc-123 > .vibe_data/logs/assistant/filtered.jsonl

# Replay the filtered log
vibe-dev assistant replay .vibe_data/logs/assistant/filtered.jsonl

This is useful for:

  • Isolating problematic sessions: Extract just the failing session for focused debugging
  • Creating minimal reproductions: Filter to only the relevant entries before sharing with teammates
  • Testing with specific providers: Filter by endpoint_name to replay only interactions with a particular model

You can also use filtered log files with the UI replay functionality by placing them in the logs directory.

Log File Location

Assistant logs are stored in .vibe_data/logs/assistant/ with filenames like llm_YYYYMMDD.jsonl. If no log file is specified, tools default to today's log.
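The exact log schema is not documented here, but based on the keys used by the filter examples above (type, session_id, endpoint_name, timestamp), a response entry looks roughly like this abbreviated, illustrative sketch (payload fields omitted):

{"timestamp": "2025-11-26T20:43:01", "type": "response", "session_id": "abc-123", "endpoint_name": "openai-gpt4o"}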


Reference Documents (File Search Grounding)

Reference documents allow assistants to answer questions based on the content of uploaded PDF files. Documents are uploaded to OpenAI vector stores and made available via the file_search tool.

Quick Start

Add reference_documents to your assistant configuration in config.yml:

assistants:
  my_assistant:
    endpoint: openai-gpt4o
    reference_documents:
      directory: reference_docs  # All PDFs in this directory
      purpose: [facts]           # How to use the documents

Configuration Options

Directory Mode

Upload all PDFs from a directory:

reference_documents:
  directory: reference_docs
  purpose: [facts, style]
  grounding_instructions: "Use these documents for pricing information."

Explicit Files Mode

Specify individual files with custom labels:

reference_documents:
  files:
    - path: docs/pricing.pdf
      label: "Pricing Guide"
      purpose: [facts]
    - path: docs/style.pdf
      label: "Style Reference"
      purpose: [style]

Purpose Tags

Purpose tags affect how the system prompt instructs the model to use the documents:

Purpose   | Behavior
facts     | Use as primary source for factual questions. Don't hallucinate.
style     | Match tone and structure from documents.
reasoning | Use for interpretation, summarization, and comparison.

Default purpose is facts if not specified.

Custom Grounding Instructions

Add additional instructions for how the model should use the documents:

reference_documents:
  directory: reference_docs
  grounding_instructions: |
    Always cite article numbers when referencing regulations.
    Use Swedish terminology from the documents.

Vector Store CLI Commands

Manage cached vector stores using the CLI:

# Show cache info
python app.py vector-store info

# List cached vector stores
python app.py vector-store list
python app.py vs list -v  # Verbose mode

# List stores for a specific template
python app.py vs list --template my_template

# Clear cache (forces re-upload on next run)
python app.py vector-store clear
python app.py vs clear --template my_template  # Clear specific template only

Caching

Vector stores are cached to avoid redundant uploads:

  1. Cache Key: SHA-256 hash of template_id + file contents
  2. Two-Level Cache: In-memory dict (fast lookups) + persistent JSON files (survives restarts)
  3. Automatic Invalidation: When file contents change, a new vector store is created

Cache files are stored in .vibe_data/cache/vector_stores/.

Provider Requirements

The file_search feature requires:

  • OpenAI provider with Responses API enabled
  • Model that supports file_search (gpt-4o, gpt-4o-mini, etc.)

Other providers will receive the grounding instructions in the system prompt but won't have access to the file_search tool.