Document Review
Extension note: Document Review is an optional extension and requires additional dependencies (install the runtime-review group). Core VIBE runs without it.
VIBE Review extends the document assembly platform to support structured document compliance review. Instead of generating documents through interviews, Review templates evaluate existing documents against a set of requirements.
What is Document Review?
Document Review helps professionals systematically evaluate documents (contracts, policies, assessments) against requirements. Common use cases:
- DORA Compliance: Review ICT contracts against regulatory requirements
- GDPR Compliance: Evaluate data processing agreements against Article 28.3 requirements
- Security Assessments: Check documentation against ISO 27001 or SOC 2 controls
- Vendor Due Diligence: Verify contracts against internal policies
The system combines:
- Human-authored requirements as the source of truth
- AI-assisted retrieval to find relevant document parts
- LLM evaluation for preliminary classification
- Human decision-making for final determinations
Creating a Review Template
Review templates use the standard VIBE template structure with interview_mode: review in the configuration.
Basic Structure
gdpr-dpa-review/
├── config.yml # Requirements and configuration
├── template.md # Report template (uses req() to control relevance)
└── components/ # Optional reusable sections
Defining Requirements
Requirements are organized into groups - logical collections of related requirements. The groups key is top-level, with group IDs underneath, and requirements nested within each group:
template_name: "GDPR DPA Compliance Review"
interview_mode: review
description: >
  Review data processing agreements for compliance with GDPR Article 28.3.

# AI provider configuration (optional, references global providers)
review:
  embedding: local # key from embedding_providers
  reranking: local # key from rerank_providers
  evaluation: claude # key from llm_endpoints

# Requirements organized by group
groups:
  instructions:
    title: "Documented Instructions"
    description: "Requirements related to processing instructions"
    requirements:
      GDPR-28-3a:
        label: "Documented instructions"
        description: |
          The processor must only process personal data on documented
          instructions from the controller.
        reference: "gdpr_2016_679:art28.3(a)"
        help: |
          Check that the agreement:
          - Requires processing only on controller's documented instructions
          - Covers transfers outside EU/EEA
          YES: "documented instructions", "written instructions only"
          NO: "processor's discretion", "as processor deems necessary"
      GDPR-28-3b:
        label: "Confidentiality obligations"
        description: |
          Persons authorized to process personal data have committed
          to confidentiality or are under statutory obligation.
        reference:
          - "gdpr_2016_679:art28.3(b)"
          - "gdpr_2016_679:rec83"
        help: |
          Check for confidentiality commitments from personnel.
          YES: "confidentiality agreement", "bound by confidentiality"
          NO: No mention of personnel confidentiality
Requirement Structure
Each requirement has:
| Field | Required | Description |
|---|---|---|
| label | Yes | Short label shown in requirement lists |
| description | Yes | Full requirement text explaining what to evaluate |
| reference | No | Regulatory source reference(s) - see "Reference Sources" below |
| reference_text | No | Full regulatory text excerpt (auto-populated from reference if imported) |
| help | No | Evaluation guidance with YES/NO indicators |
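The required/optional split in the table can be checked mechanically before a template is used. The following is a minimal sketch, not VIBE's own validation: `validate_requirement` is a hypothetical helper, and it assumes each requirement has already been loaded from config.yml into a plain dict.

```python
REQUIRED_FIELDS = ("label", "description")


def validate_requirement(req_id: str, spec: dict) -> list[str]:
    """Return a list of problems; an empty list means the entry looks valid.

    Hypothetical helper: it only checks the rules stated in the table above.
    """
    problems = []
    for field in REQUIRED_FIELDS:
        if not spec.get(field):
            problems.append(f"{req_id}: missing required field '{field}'")
    # 'reference' may be a single string or a list of strings.
    ref = spec.get("reference")
    if ref is not None and not isinstance(ref, (str, list)):
        problems.append(f"{req_id}: 'reference' must be a string or a list")
    return problems


print(validate_requirement("GDPR-28-3a", {"label": "Documented instructions"}))
```

Running such a check over every group before starting a review session surfaces missing labels or descriptions early.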
Context Questions
Review templates support standard VIBE questions alongside requirements. Questions serve two purposes:
- Capture document context - Facts about the agreement that affect which requirements apply
- AI-assisted answering - The system can analyze the document and suggest answers
Define questions in your config, either at top-level or within groups:
groups:
  transfers:
    title: "International Transfers"
    description: "Requirements for transfers outside EU/EEA"
    questions:
      allows_third_country_transfer:
        type: bool
        label: "Does the agreement permit transfers to third countries?"
        help: |
          Check if the processor is allowed to transfer personal data
          outside the EU/EEA. Look for clauses about subprocessors
          in non-EU countries or data center locations.
      transfer_mechanism:
        type: enum
        label: "What transfer mechanism is specified?"
        options:
          - adequacy: "Adequacy decision (Art. 45)"
          - scc: "Standard Contractual Clauses (Art. 46.2c)"
          - bcr: "Binding Corporate Rules (Art. 47)"
          - derogation: "Derogation (Art. 49)"
          - none: "None specified"
        condition: allows_third_country_transfer
    requirements:
      GDPR-28-transfer-mechanism:
        label: "Valid transfer mechanism"
        description: |
          Transfers to third countries must be based on an adequacy
          decision, appropriate safeguards, or a specific derogation.
        reference: "gdpr_2016_679:art46"
When reviewing, users can click "Suggest Answer" on any question. The AI analyzes the document and proposes an answer with supporting evidence from the text.
Requirement Relevance
Not all requirements apply to every document. Question answers determine which requirements are relevant through the template.
The Mental Model
Requirements become relevant through the template, not through configuration:
- Answer questions that capture document context (manually or via AI suggestion)
- Template probing executes the template with current answers
- Conditional req() calls produce only the relevant requirements
- Re-probing on change - when answers change, requirements update automatically
This is the same "template is truth" principle used for interview questions.
Example: GDPR Third Country Transfers
A data processing agreement review might have requirements that only apply when international transfers are involved:
{# template.md - Controls which requirements are relevant #}
# GDPR Article 28 Compliance Review
## Core Processing Requirements
{{ req("GDPR-28-3a") }} {# Documented instructions - always applies #}
{{ req("GDPR-28-3b") }} {# Confidentiality - always applies #}
{% if allows_third_country_transfer %}
## International Transfer Requirements
The agreement permits transfers to third countries.
{% if transfer_mechanism == "adequacy" %}
{{ req("GDPR-45-adequacy") }}
{% elif transfer_mechanism == "scc" %}
{{ req("GDPR-46-scc") }}
{{ req("GDPR-46-supplementary") }}
{% elif transfer_mechanism == "bcr" %}
{{ req("GDPR-47-bcr") }}
{% else %}
{{ req("GDPR-49-derogation") }}
{% endif %}
{{ req("GDPR-28-transfer-mechanism") }}
{% endif %}
In this example:
- Core requirements (28-3a, 28-3b) always apply
- Transfer requirements only appear if allows_third_country_transfer is true
- The specific transfer mechanism requirements depend on which legal basis is used
- If the reviewer changes an answer, the requirement list updates immediately
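The probing idea can be illustrated without a template engine. The sketch below is an assumption about the general mechanism, not VIBE's implementation: `probe` and `gdpr_template` are hypothetical names, and the template's conditionals are hand-translated into a Python function that receives a recording `req` callback.

```python
def probe(template, answers: dict) -> list[str]:
    """Run the template logic once and collect the IDs passed to req()."""
    seen: list[str] = []

    def req(req_id: str) -> str:
        if req_id not in seen:
            seen.append(req_id)
        return ""  # rendered output does not matter during a probe

    template(answers, req)
    return seen


def gdpr_template(answers: dict, req) -> None:
    """Hand-written stand-in for the conditionals in template.md above."""
    req("GDPR-28-3a")
    req("GDPR-28-3b")
    if answers.get("allows_third_country_transfer"):
        mechanism = answers.get("transfer_mechanism")
        if mechanism == "adequacy":
            req("GDPR-45-adequacy")
        elif mechanism == "scc":
            req("GDPR-46-scc")
            req("GDPR-46-supplementary")
        elif mechanism == "bcr":
            req("GDPR-47-bcr")
        else:
            req("GDPR-49-derogation")
        req("GDPR-28-transfer-mechanism")


print(probe(gdpr_template, {"allows_third_country_transfer": False}))
```

Re-probing after an answer changes is then just another call to `probe` with the updated answers; the relevant requirement list is whatever the call returns.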
How Review Works
1. Document Ingestion
VIBE Review supports Markdown, Word (DOCX), and PDF files.
- DOCX: Rendered to PDF for layout-faithful viewing.
- PDF: Processed using high-fidelity text extraction.
- Scanned PDFs: Automatically processed via OCR (Optical Character Recognition).
The system segments documents into semantically meaningful parts (headings, paragraphs, tables) with stable identifiers.
2. Requirement Matching
For each relevant requirement, the system uses Hybrid Search (combining BM25 keyword matching and semantic vector similarity) to find the most relevant document parts. A Cross-Encoder Reranker then reorders the candidates to improve precision.
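The text does not say how the keyword and vector rankings are combined. One common choice is reciprocal rank fusion (RRF), sketched here purely as an illustration; whether VIBE uses RRF is an assumption, and `reciprocal_rank_fusion` and the part IDs are hypothetical.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of part IDs into one ranking.

    Each part scores 1 / (k + rank) per list it appears in; parts that
    rank well in both the keyword and the vector list rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, part_id in enumerate(ranking, start=1):
            scores[part_id] = scores.get(part_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by part ID for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))


bm25_hits = ["p3", "p1", "p2"]      # keyword ranking (illustrative)
vector_hits = ["p1", "p4", "p3"]    # semantic ranking (illustrative)
print(reciprocal_rank_fusion([bm25_hits, vector_hits]))
```

The fused list would then be passed to the reranker, which scores each candidate against the requirement text directly.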
3. AI Classification
For each requirement, the AI performs a two-stage evaluation:
- Relevance Filtering: Determines if matched parts actually address the requirement.
- Compliance Evaluation: Assesses whether the relevant parts satisfy the requirement (YES/NO/PARTIAL/NOT_APPLICABLE).
Each stage produces a confidence level (High/Medium/Low) and reasoning. The UI displays the overall confidence alongside the classification, helping reviewers prioritize which items need closer attention.
Classifications are tracked with metadata:
| Field | Description |
|---|---|
| is_ai_suggested | Whether the classification came from AI |
| is_human_verified | Whether a human has reviewed/confirmed the classification |
| confidence | Numeric confidence score (0.0-1.0) |
| reasoning | AI explanation for the classification |
| human_notes | Optional notes added by the human reviewer |
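The metadata above is enough to build a simple review queue. The sketch below is illustrative only: the `Classification` dataclass and `review_queue` function are hypothetical, and the `requirement_id` and `result` field names are assumptions added so the records can be identified.

```python
from dataclasses import dataclass


@dataclass
class Classification:
    requirement_id: str          # assumed field, not in the table above
    result: str                  # assumed field: YES / NO / PARTIAL / ...
    confidence: float            # 0.0-1.0
    reasoning: str = ""
    is_ai_suggested: bool = True
    is_human_verified: bool = False
    human_notes: str = ""


def review_queue(items: list[Classification]) -> list[Classification]:
    """Unverified AI suggestions, least confident first."""
    pending = [c for c in items if c.is_ai_suggested and not c.is_human_verified]
    return sorted(pending, key=lambda c: c.confidence)


items = [
    Classification("GDPR-28-3a", "YES", 0.9),
    Classification("GDPR-28-3b", "NO", 0.4),
]
print([c.requirement_id for c in review_queue(items)])
```

Sorting by ascending confidence puts the items most likely to need a human decision at the top.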
4. AI-Assisted Questions
Context questions can also be answered by the AI. When you click "Suggest Answer", the system analyzes the document and proposes a value for the question, speeding up the initial assessment phase. The reviewer can accept or change the suggestion.
5. Matched Parts Curation
After AI classification, users can refine the matched document parts:
- Remove irrelevant parts: Click the trash icon next to any matched section to remove it from the assessment.
- Add missing parts: Hover over any document part in the viewer and click the "+" button to add it to the current assessment item.
- Curated indicator: When parts have been manually modified, a "(curated)" badge appears next to "Matched Sections".
When you re-run AI classification on an item with curated parts, the system skips the search and reranking stages and uses your curated parts directly. This lets you correct AI mistakes and ensure the classification uses exactly the document sections you've identified.
To reset to automatic part selection, remove all parts and re-run AI classification.
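The curated-versus-automatic behavior described above amounts to a small branch in the pipeline. This is a sketch under that reading; `select_parts` and the `search` callback are hypothetical names, not part of VIBE's API.

```python
from typing import Callable


def select_parts(
    curated: list[str],
    search: Callable[[], list[str]],
) -> tuple[list[str], bool]:
    """Return (parts, used_curated).

    With curated parts present, search and reranking are skipped.
    An empty curated list falls back to automatic selection, which
    matches the "remove all parts and re-run" reset described above.
    """
    if curated:
        return list(curated), True
    return search(), False


parts, used_curated = select_parts(["part-12"], search=lambda: ["part-3", "part-7"])
print(parts, used_curated)
```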
Human-in-the-Loop Workbench
The Review Workbench provides a multi-pane interface for efficient review:
┌──────────────────────────────────────────────────────────┐
│ ┌───────────┐ ┌────────────────────────────────────────┐ │
│ │ Sidebar │ │ Main Container │ │
│ │ │ ├──────────────────┬─────────────────────┤ │
│ │ Questions │ │ Detail Panel │ Document Viewer │ │
│ │ + │ │ │ │ │
│ │ Req List │ │ Selected item, │ Tabs for multiple │ │
│ │ │ │ AI reasoning, │ documents, matched │ │
│ │ │ │ classification │ parts highlighted │ │
│ └───────────┘ └──────────────────┴─────────────────────┘ │
└──────────────────────────────────────────────────────────┘
- Left Sidebar: A unified stream of context questions and compliance requirements. Click any item to select it.
- Detail Panel: Shows the selected question or requirement with AI reasoning, matched document sections, and classification controls.
- Document Viewer: Displays the document with high-fidelity rendering. Supports multiple documents via tabs. Matched parts are highlighted and clickable to scroll into view.
A draggable divider between the detail panel and document viewer allows you to adjust the column widths.
Template Output
The template.md (or template.docx) defines the final report. The following variables are available in the context:
| Variable | Description |
|---|---|
| review_session | Metadata about the session (ID, status, created_at, updated_at as ISO timestamps) |
| req(id) | Returns a RequirementProxy with boolean logic and property access (see below) |
| group(id) | Returns a RequirementGroupProxy for group-level compliance checks (see below) |
| YES, NO, PARTIAL, NOT_APPLICABLE, PENDING | Classification constants for comparison |
| requirements | Dictionary of requirement IDs to classification results (legacy, prefer req()) |
| satisfied(id) | Function returning true if requirement ID is YES or PARTIAL (legacy, prefer req()) |
| classification(id) | Function returning the classification string (legacy, prefer req().result) |
| documents | List of uploaded documents with their metadata |
| [question_id] | Answers to context questions are available at the top level |
Requirement Result Fields
Each entry in the requirements dictionary contains:
| Field | Description |
|---|---|
| classification | Uppercase classification: YES, NO, PARTIAL, NOT_APPLICABLE, or PENDING |
| confidence | Numeric confidence score (0.0-1.0) |
| reasoning | AI or human explanation for the classification |
| is_ai_suggested | Boolean indicating if this came from AI |
| is_human_verified | Boolean indicating if a human confirmed the result |
| human_notes | Optional notes added by the reviewer |
Example Report Snippet
## Detailed Findings
{% if satisfied("GDPR-28-3a") %}
- **Documented Instructions**: Found. {{ requirements["GDPR-28-3a"].reasoning }}
{% else %}
- **Documented Instructions**: MISSING. The agreement lacks instructions.
{% endif %}
Requirement and Group Proxies
For more expressive templates, use req() and group() functions that return rich proxy objects with boolean logic and property access.
The req() Function
req('id') returns a RequirementProxy object that:
- Is truthy when compliant - {% if req('D2-1') %} is true only when result == YES
- Exposes all classification fields as properties
- Renders as a formatted string in output context
| Property | Description |
|---|---|
| id | Requirement ID (e.g., "GDPR-28-3a") |
| label | Short label from config |
| result | Classification result (YES, NO, PARTIAL, NOT_APPLICABLE, PENDING) |
| confidence | Numeric score (0.0-1.0) |
| reasoning | AI or human explanation |
| human_notes | Reviewer's notes |
| is_ai_suggested | True if classification came from AI |
| is_human_verified | True if human confirmed |
Boolean semantics example:
{# Only true when result == YES #}
{% if req('GDPR-28-3a') %}
✓ Documented instructions requirement is fully satisfied.
{% endif %}
{# Use 'not' to check for non-compliance #}
{% if not req('GDPR-28-3a') %}
⚠ Missing or incomplete: Add clause for documented instructions.
{% endif %}
Property access example:
{% if req('GDPR-28-3a').result == PARTIAL %}
**Partial compliance**: {{ req('GDPR-28-3a').reasoning }}
{% if req('GDPR-28-3a').human_notes %}
Reviewer notes: {{ req('GDPR-28-3a').human_notes }}
{% endif %}
{% endif %}
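The truthiness rule is easy to get wrong, because PARTIAL is falsy for req() even though the legacy satisfied() treats it as satisfied. A minimal Python model of the semantics follows; it is a sketch of the documented behavior, not VIBE's actual class, and representing the constants as plain strings and the exact `__str__` format are assumptions.

```python
from dataclasses import dataclass

# Assumption: the constants compare equal to the uppercase strings.
YES, NO, PARTIAL, NOT_APPLICABLE, PENDING = (
    "YES", "NO", "PARTIAL", "NOT_APPLICABLE", "PENDING",
)


@dataclass
class RequirementProxy:
    id: str
    label: str
    result: str = PENDING
    reasoning: str = ""
    human_notes: str = ""

    def __bool__(self) -> bool:
        # Truthy only for full compliance; PARTIAL and PENDING are falsy.
        return self.result == YES

    def __str__(self) -> str:
        # Illustrative rendering; the real output format is not specified here.
        return f"{self.label}: {self.result}"


partial = RequirementProxy("GDPR-28-3a", "Documented instructions", PARTIAL)
print(bool(partial), partial.result == PARTIAL)
```

Because `not req(...)` is true for NO, PARTIAL, NOT_APPLICABLE, and PENDING alike, templates that need to distinguish these cases should compare `.result` against the constants.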
The group() Function
group('id') returns a RequirementGroupProxy for group-level compliance checks:
- Is truthy when all requirements are compliant (all have result == YES)
- Provides aggregate properties for compliance counts
- Supports filtering by classification result
| Property | Description |
|---|---|
| id | Group ID from config |
| title | Group title |
| requirements | List of RequirementProxy objects in the group |
| all_compliant | True if all requirements are YES |
| none_compliant | True if no requirements are YES |
| compliant_count | Number of YES requirements |
| incompliant_count | Number of non-YES requirements |
Method:
| Method | Description |
|---|---|
| filter(result) | Returns requirements matching the given result |
Group-level checks example:
{% if group('audit_rights') %}
## Audit Rights ✓
All audit requirements are satisfied.
{% else %}
## Audit Rights ({{ group('audit_rights').incompliant_count }} issues)
{% for r in group('audit_rights').filter(NO) %}
- ❌ **{{ r.label }}**: {{ r.reasoning }}
{% endfor %}
{% for r in group('audit_rights').filter(PARTIAL) %}
- ⚠ **{{ r.label }}**: {{ r.human_notes or r.reasoning }}
{% endfor %}
{% endif %}
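The aggregate properties can be modeled in a few lines. As before, this is a sketch of the documented semantics rather than VIBE's code: `Req` is a hypothetical stand-in for RequirementProxy, and treating an empty group as compliant is an assumption the documentation does not settle.

```python
from dataclasses import dataclass, field


@dataclass
class Req:
    label: str
    result: str  # "YES", "NO", "PARTIAL", "NOT_APPLICABLE", or "PENDING"


@dataclass
class RequirementGroupProxy:
    id: str
    title: str
    requirements: list = field(default_factory=list)

    @property
    def compliant_count(self) -> int:
        return sum(r.result == "YES" for r in self.requirements)

    @property
    def incompliant_count(self) -> int:
        return len(self.requirements) - self.compliant_count

    @property
    def all_compliant(self) -> bool:
        return self.incompliant_count == 0  # assumption: empty group is True

    @property
    def none_compliant(self) -> bool:
        return self.compliant_count == 0

    def filter(self, result: str) -> list:
        return [r for r in self.requirements if r.result == result]

    def __bool__(self) -> bool:
        return self.all_compliant


audit = RequirementGroupProxy(
    "audit_rights", "Audit Rights",
    [Req("On-site audits", "YES"), Req("Notice period", "NO")],
)
print(bool(audit), audit.incompliant_count, [r.label for r in audit.filter("NO")])
```

Note that incompliant_count counts every non-YES result, so NOT_APPLICABLE and PENDING items also make a group falsy under this reading.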
Classification Constants
The following constants are available for comparison:
| Constant | Description |
|---|---|
| YES | Requirement is satisfied |
| NO | Requirement is not satisfied |
| PARTIAL | Requirement is partially satisfied |
| NOT_APPLICABLE | Requirement does not apply |
| PENDING | Not yet classified |
Using constants in conditions:
{% if req('DORA-30-3e').result == NOT_APPLICABLE %}
_This requirement does not apply to this agreement._
{% elif req('DORA-30-3e').result == PARTIAL %}
**Partial**: {{ req('DORA-30-3e').reasoning }}
{% endif %}
Complete Report Example
# Compliance Review Report
## Summary
{% if group('core_requirements') %}
**Core Requirements**: All {{ group('core_requirements').compliant_count }} requirements met ✓
{% else %}
**Core Requirements**: {{ group('core_requirements').incompliant_count }} of {{ group('core_requirements').requirements|length }} requirements have issues
{% endif %}
## Detailed Findings
### Core Processing Requirements
{% for r in group('core_requirements').requirements %}
#### {{ r.label }}
{% if r.result == YES %}
✓ **Compliant** ({{ "%.0f"|format(r.confidence * 100) }}% confidence)
{{ r.reasoning }}
{% elif r.result == PARTIAL %}
⚠ **Partial** - {{ r.reasoning }}
{% if r.human_notes %}
_Reviewer: {{ r.human_notes }}_
{% endif %}
{% elif r.result == NO %}
❌ **Non-compliant** - {{ r.reasoning }}
{% endif %}
{% endfor %}
{% if allows_third_country_transfer %}
### International Transfer Requirements
{% if not group('transfers') %}
**Action Required**: {{ group('transfers').incompliant_count }} transfer requirement(s) need attention.
{% endif %}
{% endif %}
Reference Sources
Requirements can link to regulatory source texts via the reference field. This enables the system to retrieve authoritative regulatory text to enrich search queries and improve AI classification accuracy.
Reference Format
The reference field uses the format <source_id>:<part_id>:
groups:
  audit:
    title: "Audit Rights"
    requirements:
      DORA-30-3e:
        label: "Audit rights"
        description: "The contract must grant audit and inspection rights."
        reference: "dora_2022_2554:art30.3(e)"
For multiple references, use a list:
reference:
  - "dora_2022_2554:art30.3(e)"
  - "dora_2022_2554:rec71"
  - "eba_guidelines:gl29"
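Since the field accepts either a single string or a list, code that consumes it has to normalize both forms. A minimal sketch, assuming the separator is the first colon in the string; `parse_references` is a hypothetical helper, not a VIBE function.

```python
def parse_references(reference) -> list[tuple[str, str]]:
    """Normalize a reference field into (source_id, part_id) pairs."""
    refs = [reference] if isinstance(reference, str) else list(reference)
    parsed = []
    for ref in refs:
        # Assumption: split on the first colon only.
        source_id, sep, part_id = ref.partition(":")
        if not sep or not source_id or not part_id:
            raise ValueError(f"expected '<source_id>:<part_id>', got {ref!r}")
        parsed.append((source_id, part_id))
    return parsed


print(parse_references(["dora_2022_2554:art30.3(e)", "dora_2022_2554:rec71"]))
```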
Importing Reference Sources
Before references can be resolved, the regulatory source documents must be imported into the database:
vibe review import-references sources.json --embedding-provider local
The JSON file should contain the source metadata and its parts:
{
  "source": {
    "id": "dora_2022_2554",
    "language": "en",
    "title": "Regulation (EU) 2022/2554 (DORA)",
    "type": "regulation",
    "reference": "EU 2022/2554"
  },
  "parts": [
    {
      "part_id": "art30.3(e)",
      "part_type": "article",
      "title": "Article 30(3)(e)",
      "text": "The full regulatory text for this article...",
      "hierarchy": ["Chapter IV", "Article 30"]
    },
    {
      "part_id": "rec71",
      "part_type": "recital",
      "title": "Recital 71",
      "text": "The full text of recital 71..."
    }
  ]
}
How References Are Used
When references are configured and the source documents are imported:
- Reference resolution: The system matches <source_id>:<part_id> to database entries
- Text population: The reference_text field is automatically populated with the official regulatory text
- Enhanced search: Regulatory text is used alongside the requirement description for similarity search
- Rich LLM context: The AI classifier receives both the requirement description and the authoritative source text, improving classification accuracy
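The resolution and text-population steps can be pictured as a lookup keyed on the source and part IDs. The sketch below works directly on the import JSON rather than a database, which is a simplification; `build_index` and `resolve` are hypothetical names.

```python
import json
from typing import Optional


def build_index(import_json: str) -> dict:
    """Map (source_id, part_id) to the part's text, from an import file."""
    data = json.loads(import_json)
    source_id = data["source"]["id"]
    return {(source_id, part["part_id"]): part["text"] for part in data["parts"]}


def resolve(reference: str, index: dict) -> Optional[str]:
    """Return the text for '<source_id>:<part_id>', or None if not imported."""
    source_id, _, part_id = reference.partition(":")
    return index.get((source_id, part_id))


sample = json.dumps({
    "source": {"id": "dora_2022_2554"},
    "parts": [{"part_id": "rec71", "text": "The full text of recital 71..."}],
})
index = build_index(sample)
print(resolve("dora_2022_2554:rec71", index))
```

A reference that resolves to None corresponds to a source or part that has not been imported yet, in which case only the requirement description is available to search and classification.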
Few-Shot Examples
VIBE Review uses few-shot learning to improve classification accuracy. Examples are curated document excerpts with known classifications that help the AI understand how to evaluate similar content.
How Examples Work
When classifying a requirement, the system:
- Retrieves relevant examples for that requirement from the examples database
- Reranks by similarity to the current document excerpt
- Includes diverse examples (mixing YES/NO/PARTIAL classifications) in the AI prompt
- Records which examples were used for transparency and debugging
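How the mix of classifications is chosen is not specified. One plausible approach is a round-robin over classification buckets, sketched below as an assumption; `pick_diverse` is a hypothetical function and the input is assumed to be already sorted by similarity, best first.

```python
def pick_diverse(examples: list[dict], limit: int = 3) -> list[dict]:
    """Pick up to `limit` examples, cycling through classifications.

    Within each classification, the most similar example is taken first.
    """
    buckets: dict[str, list[dict]] = {}
    for example in examples:
        buckets.setdefault(example["classification"], []).append(example)

    picked: list[dict] = []
    while len(picked) < limit and any(buckets.values()):
        for label in list(buckets):
            if buckets[label] and len(picked) < limit:
                picked.append(buckets[label].pop(0))
    return picked


ranked = [
    {"id": 1, "classification": "YES"},
    {"id": 2, "classification": "YES"},
    {"id": 3, "classification": "NO"},
    {"id": 4, "classification": "PARTIAL"},
]
print([e["id"] for e in pick_diverse(ranked)])
```

The IDs of the returned examples are what would be recorded alongside the classification for later debugging.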
Creating Examples
There are two ways to create examples:
- Promote from matched parts: After AI classification, click the 🎯 icon next to any matched section to open the "Save as Example" form. The form pre-fills with the document excerpt and current classification.
- Manual creation: Navigate to the Examples page (via the "Examples" nav link) and create examples directly with custom excerpts.
Each example includes:
| Field | Description |
|---|---|
| requirement_id | The requirement this example applies to |
| document_excerpt | The relevant text from a document |
| classification | YES, NO, PARTIAL, or NOT_APPLICABLE |
| reasoning | Explanation of why this classification applies |
| quality_score | 0.0-1.0 score indicating example quality (higher = better) |
Managing Examples
The Examples page (/<template_id>/examples) provides:
- Filtering by requirement, classification, or minimum quality score
- Editing excerpts, reasoning, and quality scores
- Deleting outdated or poor-quality examples
In development mode, a button appears after AI classification showing which examples were used, helping you understand and improve the example corpus.