Document Review

Extension note: Document Review is an optional extension and requires additional dependencies (install the runtime-review group). Core VIBE runs without it.

VIBE Review extends the document assembly platform to support structured document compliance review. Instead of generating documents through interviews, Review templates evaluate existing documents against a set of requirements.

What is Document Review?

Document Review helps professionals systematically evaluate documents (contracts, policies, assessments) against requirements. Common use cases:

  • DORA Compliance: Review ICT contracts against regulatory requirements
  • GDPR Compliance: Evaluate data processing agreements against Article 28.3 requirements
  • Security Assessments: Check documentation against ISO 27001 or SOC 2 controls
  • Vendor Due Diligence: Verify contracts against internal policies

The system combines:

  1. Human-authored requirements as the source of truth
  2. AI-assisted retrieval to find relevant document parts
  3. LLM evaluation for preliminary classification
  4. Human decision-making for final determinations

Creating a Review Template

Review templates use the standard VIBE template structure with interview_mode: review in the configuration.

Basic Structure

gdpr-dpa-review/
├── config.yml           # Requirements and configuration
├── template.md          # Report template (uses req() to control relevance)
└── components/          # Optional reusable sections

Defining Requirements

Requirements are organized into groups: logical collections of related requirements. The groups key is top-level, with group IDs underneath, and requirements nested within each group:

template_name: "GDPR DPA Compliance Review"
interview_mode: review
description: >
  Review data processing agreements for compliance with GDPR Article 28.3.

# AI provider configuration (optional, references global providers)
review:
  embedding: local      # key from embedding_providers
  reranking: local      # key from rerank_providers
  evaluation: claude    # key from llm_endpoints

# Requirements organized by group
groups:
  instructions:
    title: "Documented Instructions"
    description: "Requirements related to processing instructions"
    requirements:
      GDPR-28-3a:
        label: "Documented instructions"
        description: |
          The processor must only process personal data on documented
          instructions from the controller.
        reference: "gdpr_2016_679:art28.3(a)"
        help: |
          Check that the agreement:
          - Requires processing only on controller's documented instructions
          - Covers transfers outside EU/EEA

          YES: "documented instructions", "written instructions only"
          NO: "processor's discretion", "as processor deems necessary"

      GDPR-28-3b:
        label: "Confidentiality obligations"
        description: |
          Persons authorized to process personal data have committed
          to confidentiality or are under statutory obligation.
        reference:
          - "gdpr_2016_679:art28.3(b)"
          - "gdpr_2016_679:rec83"
        help: |
          Check for confidentiality commitments from personnel.

          YES: "confidentiality agreement", "bound by confidentiality"
          NO: No mention of personnel confidentiality

Requirement Structure

Each requirement has:

| Field | Required | Description |
|-------|----------|-------------|
| label | Yes | Short label shown in requirement lists |
| description | Yes | Full requirement text explaining what to evaluate |
| reference | No | Regulatory source reference(s); see "Reference Sources" below |
| reference_text | No | Full regulatory text excerpt (auto-populated from reference if imported) |
| help | No | Evaluation guidance with YES/NO indicators |

Context Questions

Review templates support standard VIBE questions alongside requirements. Questions serve two purposes:

  1. Capture document context - Facts about the agreement that affect which requirements apply
  2. AI-assisted answering - The system can analyze the document and suggest answers

Define questions in your config, either at top-level or within groups:

groups:
  transfers:
    title: "International Transfers"
    description: "Requirements for transfers outside EU/EEA"

    questions:
      allows_third_country_transfer:
        type: bool
        label: "Does the agreement permit transfers to third countries?"
        help: |
          Check if the processor is allowed to transfer personal data
          outside the EU/EEA. Look for clauses about subprocessors
          in non-EU countries or data center locations.

      transfer_mechanism:
        type: enum
        label: "What transfer mechanism is specified?"
        options:
          - adequacy: "Adequacy decision (Art. 45)"
          - scc: "Standard Contractual Clauses (Art. 46.2c)"
          - bcr: "Binding Corporate Rules (Art. 47)"
          - derogation: "Derogation (Art. 49)"
          - none: "None specified"
        condition: allows_third_country_transfer

    requirements:
      GDPR-28-transfer-mechanism:
        label: "Valid transfer mechanism"
        description: |
          Transfers to third countries must be based on an adequacy
          decision, appropriate safeguards, or a specific derogation.
        reference: "gdpr_2016_679:art46"

When reviewing, users can click "Suggest Answer" on any question. The AI analyzes the document and proposes an answer with supporting evidence from the text.

Requirement Relevance

Not all requirements apply to every document. Question answers determine which requirements are relevant through the template.

The Mental Model

Requirements become relevant through the template, not through configuration:

  1. Answer questions that capture document context (manually or via AI suggestion)
  2. Template probing executes the template with current answers
  3. Conditional req() calls produce only the relevant requirements
  4. Re-probing on change - when answers change, requirements update automatically

This is the same "template is truth" principle used for interview questions.
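The probing flow above can be sketched in Python. This is a conceptual illustration only, not VIBE's implementation (the real system renders the Jinja template): req() records each requirement ID it is called with, and re-running the "template" logic with new answers yields the updated relevant set.

```python
def probe(answers):
    """Simulate template probing: collect the requirement IDs a
    template would emit via req() for the given answers."""
    relevant = []

    def req(req_id):
        relevant.append(req_id)

    # Mirrors the conditional structure of a review template.
    req("GDPR-28-3a")  # documented instructions - always applies
    req("GDPR-28-3b")  # confidentiality - always applies
    if answers.get("allows_third_country_transfer"):
        if answers.get("transfer_mechanism") == "scc":
            req("GDPR-46-scc")
        req("GDPR-28-transfer-mechanism")
    return relevant

# Changing an answer changes the relevant requirement set.
print(probe({"allows_third_country_transfer": False}))
print(probe({"allows_third_country_transfer": True, "transfer_mechanism": "scc"}))
```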

Example: GDPR Third Country Transfers

A data processing agreement review might have requirements that only apply when international transfers are involved:

{# template.md - Controls which requirements are relevant #}

# GDPR Article 28 Compliance Review

## Core Processing Requirements

{{ req("GDPR-28-3a") }}  {# Documented instructions - always applies #}
{{ req("GDPR-28-3b") }}  {# Confidentiality - always applies #}

{% if allows_third_country_transfer %}
## International Transfer Requirements

The agreement permits transfers to third countries.

{% if transfer_mechanism == "adequacy" %}
{{ req("GDPR-45-adequacy") }}
{% elif transfer_mechanism == "scc" %}
{{ req("GDPR-46-scc") }}
{{ req("GDPR-46-supplementary") }}
{% elif transfer_mechanism == "bcr" %}
{{ req("GDPR-47-bcr") }}
{% else %}
{{ req("GDPR-49-derogation") }}
{% endif %}

{{ req("GDPR-28-transfer-mechanism") }}
{% endif %}

In this example:

  • Core requirements (28-3a, 28-3b) always apply
  • Transfer requirements only appear if allows_third_country_transfer is true
  • The specific transfer mechanism requirements depend on which legal basis is used
  • If the reviewer changes an answer, the requirement list updates immediately

How Review Works

1. Document Ingestion

VIBE Review supports Markdown, Word (DOCX), and PDF files.

  • DOCX: Rendered to PDF for layout-faithful viewing.
  • PDF: Processed using high-fidelity text extraction.
  • Scanned PDFs: Automatically processed via OCR (Optical Character Recognition).

The system segments documents into semantically meaningful parts (headings, paragraphs, tables) with stable identifiers.
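A minimal sketch of this segmentation step, assuming Markdown input (the actual ingester also handles DOCX and PDF and is more sophisticated): blocks become parts, and a content hash gives each part a stable identifier across re-ingestion.

```python
import hashlib

def segment_markdown(text):
    """Split a Markdown document into parts (headings and paragraphs),
    each with a stable identifier derived from its content."""
    parts = []
    for block in text.split("\n\n"):
        block = block.strip()
        if not block:
            continue
        part_type = "heading" if block.startswith("#") else "paragraph"
        # A content hash stays stable when the same text is re-ingested.
        part_id = hashlib.sha1(block.encode("utf-8")).hexdigest()[:12]
        parts.append({"part_id": part_id, "part_type": part_type, "text": block})
    return parts

doc = "# Data Processing Agreement\n\nThe processor shall act only on documented instructions."
for p in segment_markdown(doc):
    print(p["part_id"], p["part_type"])
```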

2. Requirement Matching

For each relevant requirement, the system uses Hybrid Search (combining BM25 keyword matching and semantic vector similarity) to find the most relevant document parts. Results are further refined using a Cross-Encoder Reranker for maximum precision.
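The fusion of keyword and semantic scores can be sketched as follows. This is illustrative only: the scores are min-max normalised and blended with a weight, whereas the production system may use a different fusion strategy (for example reciprocal rank fusion), and the cross-encoder reranking stage is omitted here.

```python
def hybrid_rank(parts, bm25_scores, vector_scores, alpha=0.5):
    """Fuse keyword (BM25) and semantic (vector) scores per part.

    Scores are min-max normalised so the two scales are comparable,
    then blended with weight `alpha` (share given to BM25)."""
    def normalise(scores):
        lo, hi = min(scores), max(scores)
        span = hi - lo or 1.0
        return [(s - lo) / span for s in scores]

    bm25 = normalise(bm25_scores)
    vec = normalise(vector_scores)
    fused = [alpha * b + (1 - alpha) * v for b, v in zip(bm25, vec)]
    # Highest fused score first; a cross-encoder reranker would then
    # re-order the top candidates for precision.
    return [p for _, p in sorted(zip(fused, parts), reverse=True)]

ranked = hybrid_rank(["p1", "p2", "p3"], [0.5, 2.0, 1.0], [0.9, 0.1, 0.4], alpha=0.7)
print(ranked)
```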

3. AI Classification

For each requirement, the AI performs a two-stage evaluation:

  1. Relevance Filtering: Determines if matched parts actually address the requirement.
  2. Compliance Evaluation: Assesses whether the relevant parts satisfy the requirement (YES/NO/PARTIAL/NOT_APPLICABLE).

Each stage produces a confidence level (High/Medium/Low) and reasoning. The UI displays the overall confidence alongside the classification, helping reviewers prioritize which items need closer attention.

Classifications are tracked with metadata:

| Field | Description |
|-------|-------------|
| is_ai_suggested | Whether the classification came from AI |
| is_human_verified | Whether a human has reviewed/confirmed the classification |
| confidence | Numeric confidence score (0.0-1.0) |
| reasoning | AI explanation for the classification |
| human_notes | Optional notes added by the human reviewer |
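The tracking metadata could be modelled as a small record type. This is a sketch whose field names follow the metadata listed above; the actual storage model is internal to VIBE.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Classification:
    """Classification result with provenance metadata."""
    result: str                      # YES / NO / PARTIAL / NOT_APPLICABLE / PENDING
    confidence: float                # 0.0-1.0
    reasoning: str
    is_ai_suggested: bool = True
    is_human_verified: bool = False
    human_notes: Optional[str] = None

    def verify(self, notes=None):
        """A reviewer confirms the AI suggestion, optionally adding notes."""
        self.is_human_verified = True
        if notes:
            self.human_notes = notes

c = Classification("PARTIAL", 0.72, "Audit clause present but scope is limited.")
c.verify("Scope limitation acceptable for this vendor tier.")
print(c.is_human_verified, c.human_notes)
```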

4. AI-Assisted Questions

Context questions can also be answered by the AI. When you click "Suggest Answer", the system analyzes the document to determine the correct value for your context questions, speeding up the initial assessment phase.

5. Matched Parts Curation

After AI classification, users can refine the matched document parts:

  • Remove irrelevant parts: Click the trash icon next to any matched section to remove it from the assessment.
  • Add missing parts: Hover over any document part in the viewer and click the "+" button to add it to the current assessment item.
  • Curated indicator: When parts have been manually modified, a "(curated)" badge appears next to "Matched Sections".

When you re-run AI classification on an item with curated parts, the system skips the search and reranking stages and uses your curated parts directly. This lets you correct AI mistakes and ensure the classification uses exactly the document sections you've identified.

To reset to automatic part selection, remove all parts and re-run AI classification.
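The re-run decision reduces to a simple branch, sketched below. `item` and `search_fn` are hypothetical names chosen for illustration; they are not part of VIBE's API.

```python
def parts_for_classification(item, search_fn):
    """Pick the document parts to feed the classifier.

    If the reviewer has curated the matched parts, use them as-is and
    skip search + reranking; otherwise fall back to automatic retrieval."""
    if item.get("curated_parts"):
        return item["curated_parts"], "curated"
    return search_fn(item["requirement_id"]), "automatic"

item = {"requirement_id": "GDPR-28-3a", "curated_parts": ["sec-4.2"]}
print(parts_for_classification(item, lambda rid: ["sec-1.1", "sec-9.9"]))
```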

Human-in-the-Loop Workbench

The Review Workbench provides a multi-pane interface for efficient review:

┌──────────────────────────────────────────────────────────┐
│ ┌───────────┐ ┌────────────────────────────────────────┐ │
│ │  Sidebar  │ │           Main Container               │ │
│ │           │ ├──────────────────┬─────────────────────┤ │
│ │ Questions │ │   Detail Panel   │   Document Viewer   │ │
│ │     +     │ │                  │                     │ │
│ │ Req List  │ │  Selected item,  │  Tabs for multiple  │ │
│ │           │ │  AI reasoning,   │  documents, matched │ │
│ │           │ │  classification  │  parts highlighted  │ │
│ └───────────┘ └──────────────────┴─────────────────────┘ │
└──────────────────────────────────────────────────────────┘
  1. Left Sidebar: A unified stream of context questions and compliance requirements. Click any item to select it.
  2. Detail Panel: Shows the selected question or requirement with AI reasoning, matched document sections, and classification controls.
  3. Document Viewer: Displays the document with high-fidelity rendering. Supports multiple documents via tabs. Matched parts are highlighted and clickable to scroll into view.

A draggable divider between the detail panel and document viewer allows you to adjust the column widths.

Template Output

The template.md (or template.docx) defines the final report. The following variables are available in the context:

| Variable | Description |
|----------|-------------|
| review_session | Metadata about the session (ID, status, created_at, updated_at as ISO timestamps) |
| req(id) | Returns a RequirementProxy with boolean logic and property access (see below) |
| group(id) | Returns a RequirementGroupProxy for group-level compliance checks (see below) |
| YES, NO, PARTIAL, NOT_APPLICABLE, PENDING | Classification constants for comparison |
| requirements | Dictionary of requirement IDs to classification results (legacy; prefer req()) |
| satisfied(id) | Function returning true if requirement ID is YES or PARTIAL (legacy; prefer req()) |
| classification(id) | Function returning the classification string (legacy; prefer req().result) |
| documents | List of uploaded documents with their metadata |
| [question_id] | Answers to context questions are available at the top level |

Requirement Result Fields

Each entry in the requirements dictionary contains:

| Field | Description |
|-------|-------------|
| classification | Uppercase classification: YES, NO, PARTIAL, NOT_APPLICABLE, or PENDING |
| confidence | Numeric confidence score (0.0-1.0) |
| reasoning | AI or human explanation for the classification |
| is_ai_suggested | Boolean indicating if this came from AI |
| is_human_verified | Boolean indicating if a human confirmed the result |
| human_notes | Optional notes added by the reviewer |

Example Report Snippet

## Detailed Findings

{% if satisfied("GDPR-28-3a") %}
- **Documented Instructions**: Found. {{ requirements["GDPR-28-3a"].reasoning }}
{% else %}
- **Documented Instructions**: MISSING. The agreement lacks instructions.
{% endif %}

Requirement and Group Proxies

For more expressive templates, use the req() and group() functions, which return rich proxy objects with boolean logic and property access.

The req() Function

req('id') returns a RequirementProxy object that:

  • Is truthy when compliant - {% if req('D2-1') %} is true only when result == YES
  • Exposes all classification fields as properties
  • Renders as a formatted string in output context

| Property | Description |
|----------|-------------|
| id | Requirement ID (e.g., "GDPR-28-3a") |
| label | Short label from config |
| result | Classification result (YES, NO, PARTIAL, NOT_APPLICABLE, PENDING) |
| confidence | Numeric score (0.0-1.0) |
| reasoning | AI or human explanation |
| human_notes | Reviewer's notes |
| is_ai_suggested | True if classification came from AI |
| is_human_verified | True if human confirmed |
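The truthiness rule can be illustrated with a minimal Python sketch. This is not VIBE's actual class, just the __bool__ semantics described above; note that PARTIAL is falsy here, even though the legacy satisfied() helper treats it as compliant.

```python
class RequirementProxy:
    """Minimal sketch: the proxy is truthy only when result == YES."""

    def __init__(self, req_id, result, reasoning=""):
        self.id = req_id
        self.result = result
        self.reasoning = reasoning

    def __bool__(self):
        # Strict compliance check: PARTIAL and NOT_APPLICABLE are falsy.
        return self.result == "YES"

    def __str__(self):
        return f"{self.id}: {self.result}"

ok = RequirementProxy("GDPR-28-3a", "YES")
partial = RequirementProxy("GDPR-28-3b", "PARTIAL")
print(bool(ok), bool(partial))
```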

Boolean semantics example:

{# Only true when result == YES #}
{% if req('GDPR-28-3a') %}
✓ Documented instructions requirement is fully satisfied.
{% endif %}

{# Use 'not' to check for non-compliance #}
{% if not req('GDPR-28-3a') %}
⚠ Missing or incomplete: Add clause for documented instructions.
{% endif %}

Property access example:

{% if req('GDPR-28-3a').result == PARTIAL %}
**Partial compliance**: {{ req('GDPR-28-3a').reasoning }}

{% if req('GDPR-28-3a').human_notes %}
Reviewer notes: {{ req('GDPR-28-3a').human_notes }}
{% endif %}
{% endif %}

The group() Function

group('id') returns a RequirementGroupProxy for group-level compliance checks:

  • Is truthy when all requirements are compliant (all have result == YES)
  • Provides aggregate properties for compliance counts
  • Supports filtering by classification result

| Property | Description |
|----------|-------------|
| id | Group ID from config |
| title | Group title |
| requirements | List of RequirementProxy objects in the group |
| all_compliant | True if all requirements are YES |
| none_compliant | True if no requirements are YES |
| compliant_count | Number of YES requirements |
| incompliant_count | Number of non-YES requirements |

Method:

| Method | Description |
|--------|-------------|
| filter(result) | Returns requirements matching the given result |
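The group-level aggregate semantics can be sketched in the same spirit (again an illustration of the documented behaviour, not VIBE's implementation):

```python
from collections import namedtuple

Req = namedtuple("Req", "id result")

class RequirementGroupProxy:
    """Sketch of group-level aggregation over requirement results."""

    def __init__(self, group_id, requirements):
        self.id = group_id
        self.requirements = requirements  # objects with .result

    def __bool__(self):
        # Truthy only when every requirement in the group is YES.
        return all(r.result == "YES" for r in self.requirements)

    @property
    def compliant_count(self):
        return sum(1 for r in self.requirements if r.result == "YES")

    @property
    def incompliant_count(self):
        return len(self.requirements) - self.compliant_count

    def filter(self, result):
        return [r for r in self.requirements if r.result == result]

group = RequirementGroupProxy(
    "audit_rights",
    [Req("A1", "YES"), Req("A2", "NO"), Req("A3", "PARTIAL")],
)
print(bool(group), group.compliant_count, group.incompliant_count)
print([r.id for r in group.filter("NO")])
```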

Group-level checks example:

{% if group('audit_rights') %}
## Audit Rights ✓
All audit requirements are satisfied.
{% else %}
## Audit Rights ({{ group('audit_rights').incompliant_count }} issues)

{% for r in group('audit_rights').filter(NO) %}
- ❌ **{{ r.label }}**: {{ r.reasoning }}
{% endfor %}

{% for r in group('audit_rights').filter(PARTIAL) %}
- ⚠ **{{ r.label }}**: {{ r.human_notes or r.reasoning }}
{% endfor %}
{% endif %}

Classification Constants

The following constants are available for comparison:

| Constant | Description |
|----------|-------------|
| YES | Requirement is satisfied |
| NO | Requirement is not satisfied |
| PARTIAL | Requirement is partially satisfied |
| NOT_APPLICABLE | Requirement does not apply |
| PENDING | Not yet classified |

Using constants in conditions:

{% if req('DORA-30-3e').result == NOT_APPLICABLE %}
_This requirement does not apply to this agreement._
{% elif req('DORA-30-3e').result == PARTIAL %}
**Partial**: {{ req('DORA-30-3e').reasoning }}
{% endif %}

Complete Report Example

# Compliance Review Report

## Summary

{% if group('core_requirements') %}
**Core Requirements**: All {{ group('core_requirements').compliant_count }} requirements met ✓
{% else %}
**Core Requirements**: {{ group('core_requirements').incompliant_count }} of {{ group('core_requirements').requirements|length }} issues found
{% endif %}

## Detailed Findings

### Core Processing Requirements

{% for r in group('core_requirements').requirements %}
#### {{ r.label }}

{% if r.result == YES %}
✓ **Compliant** ({{ "%.0f"|format(r.confidence * 100) }}% confidence)
{{ r.reasoning }}
{% elif r.result == PARTIAL %}
⚠ **Partial** - {{ r.reasoning }}
{% if r.human_notes %}
_Reviewer: {{ r.human_notes }}_
{% endif %}
{% elif r.result == NO %}
❌ **Non-compliant** - {{ r.reasoning }}
{% endif %}

{% endfor %}

{% if allows_third_country_transfer %}
### International Transfer Requirements

{% if not group('transfers') %}
**Action Required**: {{ group('transfers').incompliant_count }} transfer requirement(s) need attention.
{% endif %}
{% endif %}

Reference Sources

Requirements can link to regulatory source texts via the reference field. This enables the system to retrieve authoritative regulatory text to enrich search queries and improve AI classification accuracy.

Reference Format

The reference field uses the format <source_id>:<part_id>:

groups:
  audit:
    title: "Audit Rights"
    requirements:
      DORA-30-3e:
        label: "Audit rights"
        description: "The contract must grant audit and inspection rights."
        reference: "dora_2022_2554:art30.3(e)"

For multiple references, use a list:

reference:
  - "dora_2022_2554:art30.3(e)"
  - "dora_2022_2554:rec71"
  - "eba_guidelines:gl29"
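Parsing this format can be sketched as follows; the helper names are hypothetical, but the split rule matters: only the first colon separates the two components, since part IDs themselves contain dots and parentheses (e.g. "art30.3(e)").

```python
def parse_reference(ref):
    """Split a <source_id>:<part_id> reference into its two components."""
    source_id, _, part_id = ref.partition(":")
    if not part_id:
        raise ValueError(f"expected <source_id>:<part_id>, got {ref!r}")
    return source_id, part_id

def normalise_references(ref_field):
    """Accept either a single reference string or a list of them."""
    refs = [ref_field] if isinstance(ref_field, str) else ref_field
    return [parse_reference(r) for r in refs]

print(normalise_references("dora_2022_2554:art30.3(e)"))
print(normalise_references(["dora_2022_2554:art30.3(e)", "dora_2022_2554:rec71"]))
```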

Importing Reference Sources

Before references can be resolved, the regulatory source documents must be imported into the database:

vibe review import-references sources.json --embedding-provider local

The JSON file should contain the source metadata and its parts:

{
  "source": {
    "id": "dora_2022_2554",
    "language": "en",
    "title": "Regulation (EU) 2022/2554 (DORA)",
    "type": "regulation",
    "reference": "EU 2022/2554"
  },
  "parts": [
    {
      "part_id": "art30.3(e)",
      "part_type": "article",
      "title": "Article 30(3)(e)",
      "text": "The full regulatory text for this article...",
      "hierarchy": ["Chapter IV", "Article 30"]
    },
    {
      "part_id": "rec71",
      "part_type": "recital",
      "title": "Recital 71",
      "text": "The full text of recital 71..."
    }
  ]
}
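A pre-flight shape check for such a file might look like this. validate_import_file is a hypothetical helper, not part of the vibe CLI; the real importer performs its own validation.

```python
import json

def validate_import_file(raw):
    """Basic shape check for a reference-source import file:
    the source needs an id, and every part needs a part_id and text."""
    data = json.loads(raw)
    assert "id" in data["source"], "source.id is required"
    for part in data["parts"]:
        assert part.get("part_id"), "every part needs a part_id"
        assert part.get("text"), "every part needs its regulatory text"
    return data["source"]["id"], [p["part_id"] for p in data["parts"]]

raw = '{"source": {"id": "dora_2022_2554"}, "parts": [{"part_id": "art30.3(e)", "text": "..."}]}'
print(validate_import_file(raw))
```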

How References Are Used

When references are configured and the source documents are imported:

  1. Reference resolution: The system matches <source_id>:<part_id> to database entries
  2. Text population: The reference_text field is automatically populated with the official regulatory text
  3. Enhanced search: Regulatory text is used alongside the requirement description for similarity search
  4. Rich LLM context: The AI classifier receives both the requirement description and the authoritative source text, improving classification accuracy

Few-Shot Examples

VIBE Review uses few-shot learning to improve classification accuracy. Examples are curated document excerpts with known classifications that help the AI understand how to evaluate similar content.

How Examples Work

When classifying a requirement, the system:

  1. Retrieves relevant examples for that requirement from the examples database
  2. Reranks by similarity to the current document excerpt
  3. Includes diverse examples (mixing YES/NO/PARTIAL classifications) in the AI prompt
  4. Records which examples were used for transparency and debugging
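Steps 2-3 can be sketched as a simple selection policy. This is illustrative only and ignores the similarity reranking: it takes the best example per classification first (for diversity), then tops up remaining slots by quality score.

```python
def select_examples(candidates, per_class=1, max_total=4):
    """Pick a diverse few-shot set: one top example per classification
    first, then fill the remaining slots by quality score."""
    ranked = sorted(candidates, key=lambda e: e["quality_score"], reverse=True)
    chosen, seen = [], {}
    for ex in ranked:                      # diversity pass
        cls = ex["classification"]
        if seen.get(cls, 0) < per_class:
            chosen.append(ex)
            seen[cls] = seen.get(cls, 0) + 1
    for ex in ranked:                      # quality top-up pass
        if len(chosen) >= max_total:
            break
        if ex not in chosen:
            chosen.append(ex)
    return chosen[:max_total]

pool = [
    {"classification": "YES", "quality_score": 0.9},
    {"classification": "YES", "quality_score": 0.8},
    {"classification": "NO", "quality_score": 0.7},
    {"classification": "PARTIAL", "quality_score": 0.6},
]
print([(e["classification"], e["quality_score"]) for e in select_examples(pool)])
```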

Creating Examples

There are two ways to create examples:

  1. Promote from matched parts: After AI classification, click the 🎯 icon next to any matched section to open the "Save as Example" form. The form pre-fills with the document excerpt and current classification.

  2. Manual creation: Navigate to the Examples page (via the "Examples" nav link) and create examples directly with custom excerpts.

Each example includes:

| Field | Description |
|-------|-------------|
| requirement_id | The requirement this example applies to |
| document_excerpt | The relevant text from a document |
| classification | YES, NO, PARTIAL, or NOT_APPLICABLE |
| reasoning | Explanation of why this classification applies |
| quality_score | 0.0-1.0 score indicating example quality (higher = better) |

Managing Examples

The Examples page (/<template_id>/examples) provides:

  • Filtering by requirement, classification, or minimum quality score
  • Editing excerpts, reasoning, and quality scores
  • Deleting outdated or poor-quality examples

In development mode, a button appears after AI classification showing which examples were used, helping you understand and improve the example corpus.