Review Module Architecture

Architectural reference for VIBE's document compliance review system. Optimized for LLM consumption.

See also:

  • core.md - Core VIBE engine
  • components.md - Component system details
  • assistant.md - AI-assisted interview system
  • parsing-pipeline.md - Document parsing pipeline (4-layer architecture)

1. SYSTEM OVERVIEW

The Review module enables AI-assisted document compliance review against structured requirement frameworks. Unlike VIBE's core interview mode (which generates documents), Review mode analyzes uploaded documents against predefined requirements (e.g., DORA ICT contract compliance).

1.1 Core Philosophy

  • A review template consists of requirements plus a reporting template (template.md/template.docx)
  • Requirements use VIBE Core patterns: only relevant requirements need assessment (determined by template probing)
  • Review is a template capability (interview_mode: review), not a standalone module
  • Entry point: /interview/<template_id>/ redirects to /review/<template_id>/sessions

1.2 Key Capabilities

  • Document Processing: Multi-document upload (Markdown, DOCX, PDF); documents segmented into parts with hierarchy and stable IDs (content-hash based)
  • Hybrid Search: BM25 keyword + semantic embeddings retrieve candidate parts per requirement
  • Two-Stage LLM Classification: Relevance filtering, then compliance evaluation (YES/NO/PARTIAL)
  • Human-in-the-Loop: AI suggests classifications; humans verify/override; verified reviews promote to few-shot examples
  • Output: Excel export (any state) and template-based report generation (finished reviews)

2. DATA MODEL

2.1 Database Models

Location: vibe/review/models.py

Model                   Purpose
ReviewSessionModel      Session linking template + documents (status: pending → processing → ready)
DocumentModel           Uploaded document metadata and status
DocumentPartModel       Document segments with embeddings, stable IDs, hierarchy metadata
RequirementReviewModel  Classification result per requirement (result, confidence, reasoning, supporting parts)
QuestionReviewModel     AI/human answers for template questions
ExampleModel            Few-shot examples for LLM classification
RequirementCacheModel   Cached requirement definitions with embeddings
ReferenceSourceModel    Regulatory source documents (DORA, EBA, etc.)
ReferencePartModel      Segments of regulatory sources

Classification Results: YES, NO, PARTIAL, NOT_APPLICABLE, PENDING

2.2 Requirements Definition

Location: vibe/review/requirements.py

Requirements are defined in the template's config.yml under a unified groups format, with context questions and requirements co-located:

groups:
  audit_rights:
    title: Audit Rights
    questions:
      critical_service: { type: bool, label: Critical or Important Function }
    requirements:
      D5-1:
        label: Unlimited audit rights
        description: Full access to inspect ICT service provider
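
Once the YAML is parsed into a mapping, flattening the groups format into requirement records is straightforward. The sketch below is illustrative only: RequirementRecord and load_requirements are hypothetical, simplified stand-ins (not the actual Requirement and RequirementLoader classes), and the config is passed in as an already-parsed dict.

```python
from dataclasses import dataclass
from typing import Any, List, Mapping


@dataclass(frozen=True)
class RequirementRecord:
    """Hypothetical, simplified stand-in for a loaded requirement."""
    id: str
    group: str
    label: str
    description: str = ""


def load_requirements(config: Mapping[str, Any]) -> List[RequirementRecord]:
    """Flatten the `groups` mapping into a list of requirement records."""
    records: List[RequirementRecord] = []
    for group_id, group in (config.get("groups") or {}).items():
        for req_id, spec in (group.get("requirements") or {}).items():
            records.append(
                RequirementRecord(
                    id=req_id,
                    group=group_id,
                    label=spec.get("label", ""),
                    description=spec.get("description", ""),
                )
            )
    return records


# Parsed equivalent of the config.yml excerpt above.
config = {
    "groups": {
        "audit_rights": {
            "title": "Audit Rights",
            "questions": {
                "critical_service": {"type": "bool", "label": "Critical or Important Function"}
            },
            "requirements": {
                "D5-1": {
                    "label": "Unlimited audit rights",
                    "description": "Full access to inspect ICT service provider",
                }
            },
        }
    }
}
requirements = load_requirements(config)
```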

Template-Driven Applicability: {% if critical_service %}{{ req('D5-1') }}{% endif %}

Key Classes and Functions: RequirementLoader, RequirementSet, Requirement; probe_template_for_requirements() (located in vibe/review/template_functions.py)
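
Template probing can be pictured as rendering the report template with a req() that records which IDs it was called with. The following is a minimal sketch of that idea, not the actual probe_template_for_requirements() implementation. To stay dependency-free, a plain Python function stands in for the Jinja template; probe and audit_template are hypothetical names.

```python
from typing import Callable, List, Mapping


def probe(
    template: Callable[[Mapping[str, object], Callable[[str], str]], str],
    context: Mapping[str, object],
) -> List[str]:
    """Render `template` once and return the requirement IDs it referenced."""
    seen: List[str] = []

    def req(requirement_id: str) -> str:
        # Record the ID; a real req() would presumably also emit report text.
        if requirement_id not in seen:
            seen.append(requirement_id)
        return ""

    template(context, req)
    return seen


def audit_template(context: Mapping[str, object], req: Callable[[str], str]) -> str:
    # Stand-in for: {% if critical_service %}{{ req('D5-1') }}{% endif %}
    return req("D5-1") if context.get("critical_service") else ""
```

With critical_service set to True the probe reports D5-1 as applicable; with it unset or False, no requirement is reported and D5-1 needs no assessment.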

3. DOCUMENT INGESTION

Location: vibe/review/ingestion.py, vibe/review/document_sources.py

Two-Phase Process:

  1. Upload (create_session) -- Store documents in filestore, create session with status=pending
  2. Ingest (stream_session_ingestion) -- Parse documents, compute embeddings via SSE stream

Document Sources: The DocumentSource protocol (document_sources.py) provides per-format abstraction with MarkdownSource, PdfSource, DocxSource implementations. Factory: create_document_source().

Ingestion Pipeline:

1. Detect content type → create DocumentSource
2. Route to Parsing Pipeline (see parsing-pipeline.md)
3. Convert SemanticUnits to DocumentPartModels
4. Detect language (for multilingual embedding/prompts)
5. Generate embeddings (batch)
6. Store parts with metadata (hierarchy, bounding boxes)
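
A content-hash based stable ID means that re-ingesting the same text yields the same part ID. The sketch below shows one way this could work; which fields are hashed, the whitespace normalization, and the 16-character truncation are all assumptions, and stable_part_id is a hypothetical name.

```python
import hashlib
from typing import Tuple


def stable_part_id(text: str, hierarchy: Tuple[str, ...] = ()) -> str:
    """Derive a deterministic ID from a part's text and its heading path."""
    # Collapse whitespace so trivial layout differences do not change the ID.
    normalized = " ".join(text.split())
    # Join fields with a separator that is unlikely to occur in document text.
    payload = "\x1f".join((*hierarchy, normalized)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:16]
```

Because the ID depends only on content and position in the hierarchy, references to a part (for example from a stored classification) could survive a re-parse of an unchanged document.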

Filestore: (vibe/review/filestore.py) Content-addressed binary storage. store_bytes(data, suffix) → (sha256, path). Documents reference binaries via doc_metadata.filestore.sha256.
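
Content-addressed storage names each file after the hash of its bytes, so identical uploads are stored once. The following is a rough sketch of that pattern rather than the code in filestore.py: the explicit root parameter and the two-character directory fan-out are assumptions added for illustration.

```python
import hashlib
from pathlib import Path
from typing import Tuple


def store_bytes(data: bytes, suffix: str, root: Path) -> Tuple[str, Path]:
    """Write `data` under a path derived from its SHA-256 digest."""
    digest = hashlib.sha256(data).hexdigest()
    # Assumed layout: <root>/<first two hex chars>/<digest><suffix>
    path = root / digest[:2] / f"{digest}{suffix}"
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
    return digest, path
```

Storing the same bytes twice returns the same (sha256, path) pair, which is what lets document metadata refer to a binary by digest alone.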

4. CONFIGURATION

The Review module resolves configuration from the app_config mapping passed to ReviewService and ReviewProviderFactory:

  • Database: DATABASE_URL from app config or environment
  • OCR: review_backends.ocr in app config (backend, dpi, text_layer_min_chars)
  • DOCX Conversion: review_backends.docx in app config (backend)
  • LLM/Embedding/Rerank: Resolved via ReviewProviderFactory from template-level config
  • Caching: OCR, PDF, and DOCX render caches via helper functions
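
Resolution from an app_config mapping might look like the sketch below. The helper names are hypothetical, and the precedence (app config first, then environment) is an assumption; the text above only states that both sources are consulted.

```python
import os
from typing import Any, Dict, Mapping, Optional


def resolve_database_url(
    app_config: Optional[Mapping[str, Any]],
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Prefer DATABASE_URL from app config, then fall back to the environment."""
    if app_config and app_config.get("DATABASE_URL"):
        return str(app_config["DATABASE_URL"])
    return environ.get("DATABASE_URL")


def resolve_ocr_settings(app_config: Optional[Mapping[str, Any]]) -> Dict[str, Any]:
    """Read review_backends.ocr, tolerating missing sections."""
    backends = (app_config or {}).get("review_backends") or {}
    ocr = backends.get("ocr") or {}
    # Only the three keys named in this section are surfaced.
    return {key: ocr.get(key) for key in ("backend", "dpi", "text_layer_min_chars")}
```

Passing the mapping explicitly, rather than reading a framework global, keeps this kind of lookup testable without an application context.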

5. CLASSIFICATION PIPELINE

5.1 Retrieval

Location: vibe/review/retrieval/, vibe/review/hybrid_search.py

1. Generate query from requirement (label + description + help text)
2. BM25 + Embedding similarity search (HybridSearcher with RRF fusion)
3. Cross-encoder reranking (Reranker) for precision
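
Reciprocal Rank Fusion (RRF) merges ranked lists by summing 1 / (k + rank) for each item across lists. The sketch below illustrates the general technique only; rrf_fuse is a hypothetical name, and k = 60 is the conventional default from the RRF literature, not a value confirmed for HybridSearcher.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence


def rrf_fuse(rankings: Iterable[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of part IDs with Reciprocal Rank Fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, part_id in enumerate(ranking, start=1):
            scores[part_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda part_id: (-scores[part_id], part_id))


bm25_ranking = ["a", "b", "c"]      # keyword order
semantic_ranking = ["b", "c", "a"]  # embedding order
fused = rrf_fuse([bm25_ranking, semantic_ranking])
```

RRF uses only ranks, so BM25 scores and cosine similarities never need to be put on a common scale before fusion.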

5.2 Two-Stage LLM Classification

Location: vibe/review/classifier.py::RequirementClassifier

  • Stage 1 -- Relevance: For each retrieved part, the LLM determines R (relevant) or N (not relevant).
  • Stage 2 -- Compliance: For each relevant part, the LLM evaluates YES/NO/PARTIAL with confidence and reasoning.

Per-part results are aggregated into the final classification.
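
The two stages can be sketched with stub callables in place of the LLM calls. The aggregation rule used here (any YES wins, then any PARTIAL, otherwise NO) is purely an assumption for illustration; the actual rule in RequirementClassifier is not described in this document, and confidence and reasoning are omitted for brevity.

```python
from enum import Enum
from typing import Callable, Sequence


class Verdict(str, Enum):
    YES = "YES"
    NO = "NO"
    PARTIAL = "PARTIAL"


def aggregate(verdicts: Sequence[Verdict]) -> Verdict:
    """Assumed rule: any YES wins, then any PARTIAL, otherwise NO."""
    if Verdict.YES in verdicts:
        return Verdict.YES
    if Verdict.PARTIAL in verdicts:
        return Verdict.PARTIAL
    return Verdict.NO  # also covers the case of no relevant parts


def classify(
    parts: Sequence[str],
    relevance: Callable[[str], str],
    compliance: Callable[[str], Verdict],
) -> Verdict:
    # Stage 1: keep only parts the relevance check marks "R".
    relevant = [part for part in parts if relevance(part) == "R"]
    # Stage 2: evaluate compliance for the surviving parts only.
    verdicts = [compliance(part) for part in relevant]
    return aggregate(verdicts)
```

Filtering first means the more expensive compliance prompt runs only on parts that passed the relevance check.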

5.3 Batch Classification

Location: vibe/review/batch.py::BatchClassifier

Orchestrates classification of all applicable requirements via SSE stream: load applicable requirements (template probe) → for each: retrieve → classify → store → emit progress.
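
The orchestration loop maps naturally onto a Python generator that yields one progress event per requirement. This is a simplified sketch, not BatchClassifier itself: stream_batch, the callables, and the event field names are hypothetical, and retrieval plus classification are collapsed into a single classify_one callable.

```python
from typing import Callable, Dict, Iterable, Iterator


def stream_batch(
    requirement_ids: Iterable[str],
    classify_one: Callable[[str], str],
    store: Callable[[str, str], None],
) -> Iterator[Dict[str, object]]:
    """Classify each requirement in turn, yielding a progress event after each."""
    ids = list(requirement_ids)
    for done, requirement_id in enumerate(ids, start=1):
        result = classify_one(requirement_id)  # retrieve + classify
        store(requirement_id, result)          # persist before reporting
        yield {
            "event": "progress",
            "requirement_id": requirement_id,
            "result": result,
            "done": done,
            "total": len(ids),
        }
    yield {"event": "complete", "total": len(ids)}
```

A web route could serialize each yielded dict as one SSE frame, so the client sees results as they are stored rather than after the whole batch finishes.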

5.4 Unified Assessment

Location: vibe/review/assessment.py

AssessmentClassifier unifies question-answering and requirement-classification pipelines. AssessmentItem and AssessmentResult provide a common interface for both assessment types.

5.5 Few-Shot Examples

Location: vibe/review/retrieval/example_retriever.py

ExampleRetriever finds similar examples in the database. Users can promote matched document parts to examples from the workbench, creating a feedback loop in which human-verified classifications become few-shot examples for later prompts.

6. PROMPT SYSTEM

Location: vibe/review/templates/prompts/

Naming Convention: {type}_{role}_{lang}.jinja2

  • Types: relevance, compliance, batch, question
  • Roles: system, user
  • Languages: en, sv, de, es, fr
  • Partials prefixed with _ (e.g., _requirement_info_en.jinja2)
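
The naming convention can be expressed as a small helper. This is an illustrative sketch: prompt_template_name is a hypothetical function, and falling back to English for an unsupported language is an assumption, not documented behaviour.

```python
PROMPT_TYPES = ("relevance", "compliance", "batch", "question")
ROLES = ("system", "user")
LANGUAGES = ("en", "sv", "de", "es", "fr")


def prompt_template_name(prompt_type: str, role: str, lang: str, fallback: str = "en") -> str:
    """Build a template filename of the form {type}_{role}_{lang}.jinja2."""
    if prompt_type not in PROMPT_TYPES:
        raise ValueError(f"unknown prompt type: {prompt_type!r}")
    if role not in ROLES:
        raise ValueError(f"unknown role: {role!r}")
    if lang not in LANGUAGES:
        lang = fallback  # assumed fallback for unsupported languages
    return f"{prompt_type}_{role}_{lang}.jinja2"
```

For example, prompt_template_name("relevance", "system", "sv") yields relevance_system_sv.jinja2.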

Prompt Building: vibe/review/prompts.py::PromptBuilder constructs prompts with requirement details, document context, few-shot examples, and output format instructions.

LLM Client: vibe/review/llm.py::BaseLLMClient -- Separate from vibe.providers.llm by design (review needs structured JSON output, not streaming).

7. SERVICE LAYER

7.1 ReviewService

Location: vibe/review/services/review_service.py::ReviewService

Facade providing a stable interface for web routes. Instantiated per request with db_session, template_provider, and app_config: Mapping[str, Any] | None, which keeps it decoupled from Flask globals.

Responsibility Areas:

  • Session lifecycle: list_sessions(limit), get_session, create_session, delete_session, stream_session_ingestion
  • Requirements: get_template_requirement_set, load_session_requirements, get_requirement_groups, get_classification_stats
  • Classification: classify_single_requirement, stream_ai_assessment, stream_batch_classification, save_human_classification
  • Questions: get_template_questions, suggest_question_answer, save_human_question_answer
  • Documents: get_matched_parts, render_document_html, open_document_binary_for_download, export_results_xlsx
  • Examples: list_examples, get_example, update_example, delete_example
  • Reports: render_report, build_template_context

Composed Services: ReviewAnalyticsService (vibe/review/services/analytics_service.py) -- accuracy calculation from human override patterns.

7.2 Provider Factory

Location: vibe/review/services/provider_factory.py::ReviewProviderFactory

Creates embedding, reranking, and LLM providers based on template config. OCR backend creation happens in ReviewService._create_cached_ocr_backend().

7.3 Progress Reporting

Location: vibe/review/progress.py

BaseProgress is the base class for all SSE progress updates emitted by long-running operations (ingestion, classification).
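
A progress update ultimately has to be serialized as a Server-Sent Events frame: an event line, a data line, and a blank line. The sketch below shows that wire format with a hypothetical Progress dataclass; its fields are assumptions and do not reflect the actual BaseProgress attributes.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Progress:
    """Hypothetical progress payload; field names are illustrative."""
    stage: str
    current: int
    total: int
    message: str = ""

    def to_sse(self, event: str = "progress") -> str:
        # One SSE frame: "event:" line, "data:" line, terminated by a blank line.
        return f"event: {event}\ndata: {json.dumps(asdict(self))}\n\n"


frame = Progress(stage="ingestion", current=1, total=4).to_sse()
```

Keeping the payload as single-line JSON avoids having to split it across multiple data: lines.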

8. WEB INTERFACE

Location: vibe/review/web/routes.py (Blueprint under /review/)

Workbench Layout: Three-pane UI:

  1. Left sidebar: Assessment items (questions + requirements) with status badges
  2. Center: Document viewer with part highlighting and navigation
  3. Right panel: Item detail with classification form or question input

htmx Interactions: Click item → loads detail panel; "Run AI" → classification via OOB swap; save → updates sidebar row; document part links → scroll viewer.

Templates: 12 top-level templates + 20 partials in vibe/review/templates/review/.

9. ENTRY FLOW & URL STRUCTURE

Interview Mode Extension: vibe/review/interview_mode.py::ReviewModeExtension intercepts interview_mode: review templates and redirects to review UI.

URL Pattern: All routes include <template_id> for template-scoped review:

/review/                                            # Global index
/review/<template_id>/sessions                      # Sessions list
/review/<template_id>/new                           # Upload form (GET/POST)
/review/<template_id>/<session_id>/ingest           # Ingestion progress (SSE)
/review/<template_id>/<session_id>                  # Workbench
/review/<template_id>/<session_id>/item/...         # Assessment item partials
/review/<template_id>/<session_id>/requirement/...  # Classification endpoints
/review/<template_id>/<session_id>/export           # Excel export

10. DATA FLOW DIAGRAMS

10.1 New Review Session

User uploads document(s) via /review/<template_id>/new
    → Store binaries in filestore
    → Create ReviewSessionModel (status=pending) + DocumentModels
    → Redirect to ingest route

SSE ingestion stream:
    → For each document:
        → create_document_source() → DocumentSource
        → Parse via pipeline, segment into parts with stable IDs
        → Detect language, compute embeddings (batch)
        → SSE progress events
    → Update session status to ready
    → SSE: complete (redirect to workbench)

10.2 Single Requirement Classification

User clicks "Run AI" for requirement D5-1
    → ReviewService.classify_single_requirement
        → HybridSearcher.search (BM25 + semantic) → Reranker.rerank
        → RequirementClassifier.classify:
            → Relevance check (LLM) per candidate part
            → Compliance check (LLM) for relevant parts
            → Aggregate → final classification
        → Store RequirementReviewModel
    → htmx: updated detail panel + sidebar row (OOB)

10.3 Context Question AI Suggestion

User clicks "Suggest" for question
    → HybridSearcher.search (query from question label/help)
    → QuestionAnswerer.answer (LLM structured output)
    → Store suggestion in session.suggestions
User clicks "Accept"
    → Copy to session.context, clear suggestion
    → Re-probe template (new context may change applicable requirements)
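
The accept step in the flow above amounts to a small state transition followed by a re-probe. A minimal sketch, assuming the session exposes context and suggestions as plain dicts; accept_suggestion and the reprobe callable are hypothetical names.

```python
from typing import Any, Callable, Dict, List


def accept_suggestion(
    session: Dict[str, Dict[str, Any]],
    question_id: str,
    reprobe: Callable[[Dict[str, Any]], List[str]],
) -> List[str]:
    """Promote a suggested answer to confirmed context and re-probe."""
    # Move the suggestion into the context; pop() also clears it.
    session["context"][question_id] = session["suggestions"].pop(question_id)
    # New context may change which requirements apply, so probe again.
    return reprobe(session["context"])
```

The returned list is the refreshed set of applicable requirement IDs, which the UI would use to update the assessment sidebar.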

11. FILE LOCATION INDEX

What                    Where
Database models         vibe/review/models.py
Requirements loading    vibe/review/requirements.py::RequirementLoader
Template probing        vibe/review/template_functions.py::probe_template_for_requirements
Document ingestion      vibe/review/ingestion.py::DocumentIngester
Document sources        vibe/review/document_sources.py::DocumentSource, create_document_source
Parsing pipeline        vibe/review/parsing/ (see parsing-pipeline.md)
Filestore               vibe/review/filestore.py
OCR backends            vibe/review/parsing/extraction/ocr/ (package: backend, extractor, analysis, health)
DOCX→PDF rendering      vibe/review/docx_converter.py
Hybrid search           vibe/review/hybrid_search.py::HybridSearcher
Retrieval               vibe/review/retrieval/search.py::ReferenceSearcher
Reranking               vibe/review/retrieval/reranker.py::PartReranker
Example retrieval       vibe/review/retrieval/example_retriever.py::ExampleRetriever
Requirement classifier  vibe/review/classifier.py::RequirementClassifier
Batch orchestration     vibe/review/batch.py::BatchClassifier
Assessment              vibe/review/assessment.py::AssessmentClassifier
Question answering      vibe/review/question_answerer.py::QuestionAnswerer
Progress base           vibe/review/progress.py::BaseProgress
Prompt building         vibe/review/prompts.py::PromptBuilder
Prompt templates        vibe/review/templates/prompts/*.jinja2
LLM client              vibe/review/llm.py::BaseLLMClient
Embedding providers     vibe/providers/embedding/ (shared)
Rerank providers        vibe/providers/reranking/ (shared)
ReviewService           vibe/review/services/review_service.py
Analytics service       vibe/review/services/analytics_service.py
Provider factory        vibe/review/services/provider_factory.py
Database utilities      vibe/review/database.py
Routes                  vibe/review/web/routes.py
Interview mode ext.     vibe/review/interview_mode.py::ReviewModeExtension
HTML templates          vibe/review/templates/review/
Reference linking       vibe/review/reference_linker.py::ReferenceLinker
CLI commands            vibe/review/cli.py

Document Version: 2.0
Last Updated: 2026-02-02
Notes: Fixed config, method signatures, provider paths, prompt naming; condensed method tables, route tables, data flows; added document_sources, assessment, progress