Review Module Architecture

Architectural reference for VIBE's document compliance review system. Optimized for LLM consumption.

See also:

core.md - Core VIBE engine
components.md - Component system details
assistant.md - AI-assisted interview system
parsing-pipeline.md - Document parsing pipeline (4-layer architecture)

1. SYSTEM OVERVIEW

The Review module enables AI-assisted document compliance review against structured requirement frameworks. Unlike VIBE's core interview mode (which generates documents), Review mode analyzes uploaded documents against predefined requirements (e.g., DORA ICT contract compliance).

1.1 Core Philosophy

A review template consists of requirements plus a reporting template (template.md/template.docx)
Requirements use VIBE Core patterns: only relevant requirements need assessment (determined by template probing)
Review is a template capability (interview_mode: review), not a standalone module
Entry point: /interview/<template_id>/ redirects to /review/<template_id>/sessions

1.2 Key Capabilities

Document Processing: Multi-document upload (Markdown, DOCX, PDF); documents segmented into parts with hierarchy and stable IDs (content-hash based)
Hybrid Search: BM25 keyword + semantic embeddings retrieve candidate parts per requirement
Two-Stage LLM Classification: Relevance filtering, then compliance evaluation (YES/NO/PARTIAL)
Human-in-the-Loop: AI suggests classifications; humans verify or override them; verified reviews can be promoted to few-shot examples
Output: Excel export (any state) and template-based report generation (finished reviews)

2. DATA MODEL

2.1 Database Models

Location: vibe/review/models.py

Model	Purpose
`ReviewSessionModel`	Session linking template + documents (status: pending → processing → ready)
`DocumentModel`	Uploaded document metadata and status
`DocumentPartModel`	Document segments with embeddings, stable IDs, hierarchy metadata
`RequirementReviewModel`	Classification result per requirement (result, confidence, reasoning, supporting parts)
`QuestionReviewModel`	AI/human answers for template questions
`ExampleModel`	Few-shot examples for LLM classification
`RequirementCacheModel`	Cached requirement definitions with embeddings
`ReferenceSourceModel`	Regulatory source documents (DORA, EBA, etc.)
`ReferencePartModel`	Segments of regulatory sources

Classification Results: YES, NO, PARTIAL, NOT_APPLICABLE, PENDING
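The status and result values above map naturally onto enumerations. The following is a minimal sketch, not the contents of models.py: the class names `SessionStatus` and `ClassificationResult` and the helper `advance` are hypothetical, and the strictly forward-only lifecycle is an assumption read from the `pending → processing → ready` notation.

```python
from enum import Enum


class ClassificationResult(str, Enum):
    """Outcome values listed under Classification Results."""
    YES = "YES"
    NO = "NO"
    PARTIAL = "PARTIAL"
    NOT_APPLICABLE = "NOT_APPLICABLE"
    PENDING = "PENDING"


class SessionStatus(str, Enum):
    """Session lifecycle states from the ReviewSessionModel row."""
    PENDING = "pending"
    PROCESSING = "processing"
    READY = "ready"


# Assumption: the lifecycle only moves forward, one step at a time.
_NEXT_STATUS = {
    SessionStatus.PENDING: SessionStatus.PROCESSING,
    SessionStatus.PROCESSING: SessionStatus.READY,
}


def advance(status: SessionStatus) -> SessionStatus:
    """Return the next lifecycle status, or raise if already terminal."""
    try:
        return _NEXT_STATUS[status]
    except KeyError:
        raise ValueError(f"{status.value} is a terminal status") from None
```

Subclassing `str` keeps the values directly comparable with the strings stored in the database or sent in JSON.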

2.2 Requirements Definition

Location: vibe/review/requirements.py

Requirements defined in template config.yml under unified groups format with context questions and requirements co-located:

groups:
  audit_rights:
    title: Audit Rights
    questions:
      critical_service: { type: bool, label: Critical or Important Function }
    requirements:
      D5-1:
        label: Unlimited audit rights
        description: Full access to inspect ICT service provider

Template-Driven Applicability: {% if critical_service %}{{ req('D5-1') }}{% endif %}

Key Classes and Functions: RequirementLoader, RequirementSet, Requirement, probe_template_for_requirements() (listed under vibe/review/template_functions.py in the file location index)
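To make the probing idea concrete, here is a minimal sketch of how a parsed `groups` mapping could be flattened into requirements and how a recording `req()` function can discover which of them a template touches. It is not the real RequirementLoader or probe_template_for_requirements(): `SketchRequirement`, `load_requirements`, `probe`, and `audit_template` are hypothetical names, and a plain Python callable stands in for the Jinja template so the example has no third-party dependencies.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable, Mapping


@dataclass(frozen=True)
class SketchRequirement:
    """Hypothetical stand-in for the module's Requirement class."""
    id: str
    group: str
    label: str
    description: str = ""


def load_requirements(groups: Mapping[str, Mapping[str, Any]]) -> dict[str, SketchRequirement]:
    """Flatten the `groups` mapping into requirements keyed by ID."""
    found: dict[str, SketchRequirement] = {}
    for group_id, group in groups.items():
        for req_id, spec in (group.get("requirements") or {}).items():
            if req_id in found:
                raise ValueError(f"duplicate requirement id: {req_id}")
            found[req_id] = SketchRequirement(
                id=req_id,
                group=group_id,
                label=spec["label"],
                description=spec.get("description", ""),
            )
    return found


def probe(template: Callable[[Mapping[str, Any], Callable[[str], str]], None],
          context: Mapping[str, Any]) -> list[str]:
    """Run the template with a recording req() and return the IDs it touched."""
    seen: list[str] = []

    def req(req_id: str) -> str:
        if req_id not in seen:
            seen.append(req_id)
        return ""  # a probe only records; it renders nothing

    template(context, req)
    return seen


def audit_template(context, req):
    # Python stand-in for: {% if critical_service %}{{ req('D5-1') }}{% endif %}
    if context.get("critical_service"):
        req("D5-1")


# The config.yml fragment above, as it would look once parsed into a dict.
GROUPS = {
    "audit_rights": {
        "title": "Audit Rights",
        "questions": {
            "critical_service": {"type": "bool", "label": "Critical or Important Function"},
        },
        "requirements": {
            "D5-1": {
                "label": "Unlimited audit rights",
                "description": "Full access to inspect ICT service provider",
            },
        },
    },
}

requirements = load_requirements(GROUPS)
applicable = [requirements[i] for i in probe(audit_template, {"critical_service": True})]
```

With `critical_service` set to false the probe returns an empty list, so D5-1 would not need assessment.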

3. DOCUMENT INGESTION

Location: vibe/review/ingestion.py, vibe/review/document_sources.py

Two-Phase Process:

Upload (create_session) -- Store documents in filestore, create session with status=pending
Ingest (stream_session_ingestion) -- Parse documents, compute embeddings via SSE stream

Document Sources: The DocumentSource protocol (document_sources.py) provides per-format abstraction with MarkdownSource, PdfSource, DocxSource implementations. Factory: create_document_source().

Ingestion Pipeline:

1. Detect content type → create DocumentSource
2. Route to Parsing Pipeline (see parsing-pipeline.md)
3. Convert SemanticUnits to DocumentPartModels
4. Detect language (for multilingual embedding/prompts)
5. Generate embeddings (batch)
6. Store parts with metadata (hierarchy, bounding boxes)
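Parts carry stable, content-hash based IDs so that re-ingesting the same document yields the same identifiers. One way this could work is sketched below. The function name `stable_part_id`, the whitespace and Unicode normalization, the inclusion of the document hash, and the 16-character truncation are all assumptions for illustration; the actual derivation is not specified in this document.

```python
import hashlib
import unicodedata


def stable_part_id(document_sha256: str, text: str, length: int = 16) -> str:
    """Derive a deterministic part ID from the document hash and part text.

    Normalizing Unicode form and collapsing whitespace means that
    cosmetic re-extraction differences do not change the ID.
    """
    normalized = " ".join(unicodedata.normalize("NFC", text).split())
    payload = f"{document_sha256}\n{normalized}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:length]
```

Because the ID depends only on content, human classifications and promoted examples that reference a part would survive a re-parse of the same file under this scheme.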

Filestore (vibe/review/filestore.py): Content-addressed binary storage. store_bytes(data, suffix) → (sha256, path). Documents reference binaries via doc_metadata.filestore.sha256.
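A content-addressed store names each file after the hash of its bytes, so identical uploads are stored once. The sketch below follows the documented `store_bytes(data, suffix) → (sha256, path)` shape, but the explicit `root` parameter, the two-character shard directory, and the write-then-rename step are assumptions, not the actual filestore.py implementation.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def store_bytes(data: bytes, suffix: str, root: Path) -> tuple[str, Path]:
    """Store `data` under its SHA-256 digest and return (sha256, path).

    `root` is a hypothetical parameter; the real function takes only
    (data, suffix) and presumably resolves its root directory elsewhere.
    """
    sha256 = hashlib.sha256(data).hexdigest()
    # Assumed layout: shard by the first two hex characters.
    path = root / sha256[:2] / f"{sha256}{suffix}"
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temporary name first so readers never see a partial file.
        tmp = path.with_name(path.name + ".tmp")
        tmp.write_bytes(data)
        tmp.replace(path)
    return sha256, path
```

Storing the same bytes twice returns the same digest and path without rewriting the file.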

4. CONFIGURATION

The Review module resolves configuration from the app_config mapping passed to ReviewService and ReviewProviderFactory:

Database: DATABASE_URL from app config or environment
OCR: review_backends.ocr in app config (backend, dpi, text_layer_min_chars)
DOCX Conversion: review_backends.docx in app config (backend)
LLM/Embedding/Rerank: Resolved via ReviewProviderFactory from template-level config
Caching: OCR, PDF, and DOCX render caches via helper functions
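The lookups above can be sketched as small resolver functions over the `app_config` mapping. This is an illustration only: the function names and the default values in `OCR_DEFAULTS` are hypothetical, and the precedence of app config over the environment is an assumption based on the order in which the two are listed.

```python
from __future__ import annotations

import os
from typing import Any, Mapping

# Hypothetical defaults, chosen only for illustration.
OCR_DEFAULTS: dict[str, Any] = {"backend": "none", "dpi": 300, "text_layer_min_chars": 50}


def resolve_database_url(app_config: Mapping[str, Any] | None,
                         environ: Mapping[str, str] | None = None) -> str | None:
    """Prefer DATABASE_URL from app config, then fall back to the environment."""
    env = os.environ if environ is None else environ
    return (app_config or {}).get("DATABASE_URL") or env.get("DATABASE_URL")


def resolve_ocr_config(app_config: Mapping[str, Any] | None) -> dict[str, Any]:
    """Merge review_backends.ocr over the defaults, rejecting unknown keys."""
    backends = (app_config or {}).get("review_backends") or {}
    ocr = dict(backends.get("ocr") or {})
    unknown = sorted(set(ocr) - set(OCR_DEFAULTS))
    if unknown:
        raise ValueError(f"unknown OCR options: {unknown}")
    return {**OCR_DEFAULTS, **ocr}
```

Passing the mapping explicitly, rather than reading framework globals, keeps these functions usable from both web requests and CLI commands.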

5. CLASSIFICATION PIPELINE

5.1 Retrieval

Location: vibe/review/retrieval/, vibe/review/hybrid_search.py

1. Generate query from requirement (label + description + help text)
2. BM25 + embedding similarity search (HybridSearcher, fused with Reciprocal Rank Fusion, RRF)
3. Cross-encoder reranking (Reranker) for precision
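Reciprocal Rank Fusion combines several ranked lists by summing `1 / (k + rank)` for each item across the lists, which avoids having to calibrate BM25 scores against cosine similarities. The sketch below shows the standard formula. The function name `rrf_fuse` and the constant `k = 60` (a commonly used value) are assumptions, not taken from HybridSearcher.

```python
from __future__ import annotations

from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists of part IDs with Reciprocal Rank Fusion.

    Each ranking is ordered best-first. A part absent from a ranking
    simply contributes nothing for that ranking.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, part_id in enumerate(ranking, start=1):
            scores[part_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


bm25_ranking = ["a", "b", "c"]
semantic_ranking = ["b", "c", "a"]
fused = rrf_fuse([bm25_ranking, semantic_ranking])
```

Here part `b` ranks first after fusion because it sits near the top of both lists, even though it leads only one of them.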

5.2 Two-Stage LLM Classification

Location: vibe/review/classifier.py::RequirementClassifier

Stage 1 -- Relevance: For each retrieved part, LLM determines R (relevant) or N (not relevant).
Stage 2 -- Compliance: For each relevant part, LLM evaluates YES/NO/PARTIAL with confidence and reasoning.
Results aggregated for final classification.
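The two stages can be sketched as a filter followed by a per-part evaluation and an aggregation step. This is not RequirementClassifier itself: `PartVerdict`, `classify`, and `aggregate` are hypothetical names, the LLM calls are replaced by plain callables, and the aggregation rule (any YES wins, then any PARTIAL, otherwise NO) is an assumption, since the actual rule is not described here.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class PartVerdict:
    part_id: str
    result: str        # "YES", "NO", or "PARTIAL"
    confidence: float
    reasoning: str


def aggregate(verdicts: Sequence[PartVerdict]) -> str:
    """Assumed rule: any YES wins, then any PARTIAL, otherwise NO."""
    results = {v.result for v in verdicts}
    if "YES" in results:
        return "YES"
    if "PARTIAL" in results:
        return "PARTIAL"
    return "NO"  # also covers the case where no part was relevant


def classify(
    part_ids: Sequence[str],
    is_relevant: Callable[[str], str],           # stage 1: returns "R" or "N"
    evaluate: Callable[[str], PartVerdict],      # stage 2: compliance verdict
) -> tuple[str, list[PartVerdict]]:
    """Run relevance filtering, then compliance evaluation, then aggregate."""
    relevant = [pid for pid in part_ids if is_relevant(pid) == "R"]
    verdicts = [evaluate(pid) for pid in relevant]
    return aggregate(verdicts), verdicts
```

Filtering first means the more expensive compliance prompt only runs on parts that passed the relevance check.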

5.3 Batch Classification

Location: vibe/review/batch.py::BatchClassifier

Orchestrates classification of all applicable requirements via SSE stream: load applicable requirements (template probe) → for each: retrieve → classify → store → emit progress.
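The loop above fits a generator that yields one Server-Sent Events frame per requirement. The sketch below uses the standard SSE wire format (`event:` and `data:` lines ending in a blank line). The event names `progress` and `complete`, the payload fields, and the function names are assumptions rather than the actual BatchClassifier protocol, and retrieval, classification, and storage are collapsed into a single `classify_one` callable.

```python
from __future__ import annotations

import json
from typing import Callable, Iterable, Iterator


def sse_event(event: str, data: dict) -> str:
    """Format one SSE frame; json.dumps emits a single line by default."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_batch(
    requirement_ids: Iterable[str],
    classify_one: Callable[[str], str],
) -> Iterator[str]:
    """Classify each requirement in turn and yield progress frames."""
    ids = list(requirement_ids)
    for done, req_id in enumerate(ids, start=1):
        result = classify_one(req_id)  # stands in for retrieve, classify, store
        yield sse_event(
            "progress",
            {"requirement": req_id, "result": result, "done": done, "total": len(ids)},
        )
    yield sse_event("complete", {"total": len(ids)})
```

A web framework can return such a generator as a streaming response with content type `text/event-stream`.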

5.4 Unified Assessment

Location: vibe/review/assessment.py

AssessmentClassifier unifies question-answering and requirement-classification pipelines. AssessmentItem and AssessmentResult provide a common interface for both assessment types.

5.5 Few-Shot Examples

Location: vibe/review/retrieval/example_retriever.py

ExampleRetriever finds similar examples from database. Users can promote matched document parts to examples from the workbench, creating a feedback loop where human-verified classifications become training data.

6. PROMPT SYSTEM

Location: vibe/review/templates/prompts/

Naming Convention: {type}_{role}_{lang}.jinja2

Types: relevance, compliance, batch, question
Roles: system, user
Languages: en, sv, de, es, fr
Partials prefixed with _ (e.g., _requirement_info_en.jinja2)
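The naming convention can be captured in a small helper that validates each component before building the file name. The helper name `prompt_template_name` is hypothetical, and rejecting unknown values outright (rather than, say, falling back to English) is an assumption about the desired behavior.

```python
PROMPT_TYPES = ("relevance", "compliance", "batch", "question")
PROMPT_ROLES = ("system", "user")
PROMPT_LANGS = ("en", "sv", "de", "es", "fr")


def prompt_template_name(prompt_type: str, role: str, lang: str) -> str:
    """Build a file name following {type}_{role}_{lang}.jinja2."""
    if prompt_type not in PROMPT_TYPES:
        raise ValueError(f"unknown prompt type: {prompt_type!r}")
    if role not in PROMPT_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    if lang not in PROMPT_LANGS:
        raise ValueError(f"unsupported language: {lang!r}")
    return f"{prompt_type}_{role}_{lang}.jinja2"
```

For example, the Swedish system prompt for the relevance stage resolves to `relevance_system_sv.jinja2`.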

Prompt Building: vibe/review/prompts.py::PromptBuilder constructs prompts with requirement details, document context, few-shot examples, and output format instructions.

LLM Client: vibe/review/llm.py::BaseLLMClient -- Separate from vibe.providers.llm by design (review needs structured JSON output, not streaming).

7. SERVICE LAYER

7.1 ReviewService

Location: vibe/review/services/review_service.py::ReviewService

Facade providing stable interface for web routes. Instantiated per-request with db_session, template_provider, and app_config: Mapping[str, Any] | None (decoupled from Flask globals).

Responsibility Areas:

Session lifecycle: list_sessions(limit), get_session, create_session, delete_session, stream_session_ingestion
Requirements: get_template_requirement_set, load_session_requirements, get_requirement_groups, get_classification_stats
Classification: classify_single_requirement, stream_ai_assessment, stream_batch_classification, save_human_classification
Questions: get_template_questions, suggest_question_answer, save_human_question_answer
Documents: get_matched_parts, render_document_html, open_document_binary_for_download, export_results_xlsx
Examples: list_examples, get_example, update_example, delete_example
Reports: render_report, build_template_context

Composed Services: ReviewAnalyticsService (vibe/review/services/analytics_service.py) -- accuracy calculation from human override patterns.

7.2 Provider Factory

Location: vibe/review/services/provider_factory.py::ReviewProviderFactory

Creates embedding, reranking, and LLM providers based on template config. OCR backend creation happens in ReviewService._create_cached_ocr_backend().

7.3 Progress Reporting

Location: vibe/review/progress.py

BaseProgress base class for all SSE progress updates in long-running operations (ingestion, classification).

8. WEB INTERFACE

Location: vibe/review/web/routes.py (Blueprint under /review/)

Workbench Layout: Three-pane UI:

Left sidebar: Assessment items (questions + requirements) with status badges
Center: Document viewer with part highlighting and navigation
Right panel: Item detail with classification form or question input

htmx Interactions: Click item → loads detail panel; "Run AI" → classification via OOB swap; save → updates sidebar row; document part links → scroll viewer.

Templates: 12 top-level templates + 20 partials in vibe/review/templates/review/.

9. ENTRY FLOW & URL STRUCTURE

Interview Mode Extension: vibe/review/interview_mode.py::ReviewModeExtension intercepts interview_mode: review templates and redirects to review UI.

URL Pattern: All routes include <template_id> for template-scoped review:

/review/                                          # Global index
/review/<template_id>/sessions                    # Sessions list
/review/<template_id>/new                         # Upload form (GET/POST)
/review/<template_id>/<session_id>/ingest         # Ingestion progress (SSE)
/review/<template_id>/<session_id>                # Workbench
/review/<template_id>/<session_id>/item/...       # Assessment item partials
/review/<template_id>/<session_id>/requirement/... # Classification endpoints
/review/<template_id>/<session_id>/export         # Excel export

10. DATA FLOW DIAGRAMS

10.1 New Review Session

User uploads document(s) via /review/<template_id>/new
    → Store binaries in filestore
    → Create ReviewSessionModel (status=pending) + DocumentModels
    → Redirect to ingest route

SSE ingestion stream:
    → For each document:
        → create_document_source() → DocumentSource
        → Parse via pipeline, segment into parts with stable IDs
        → Detect language, compute embeddings (batch)
        → SSE progress events
    → Update session status to ready
    → SSE: complete (redirect to workbench)

10.2 Single Requirement Classification

User clicks "Run AI" for requirement D5-1
    → ReviewService.classify_single_requirement
        → HybridSearcher.search (BM25 + semantic) → Reranker.rerank
        → RequirementClassifier.classify:
            → Relevance check (LLM) per candidate part
            → Compliance check (LLM) for relevant parts
            → Aggregate → final classification
        → Store RequirementReviewModel
    → htmx: updated detail panel + sidebar row (OOB)

10.3 Context Question AI Suggestion

User clicks "Suggest" for question
    → HybridSearcher.search (query from question label/help)
    → QuestionAnswerer.answer (LLM structured output)
    → Store suggestion in session.suggestions
User clicks "Accept"
    → Copy to session.context, clear suggestion
    → Re-probe template (new context may change applicable requirements)
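The accept step in this flow can be sketched as moving a value between two dictionaries and then re-running the probe. `SessionState` and `accept_suggestion` are hypothetical names, and `reprobe` is a stand-in callable for template probing. Only the `suggestions` and `context` fields are taken from the flow above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable, Mapping


@dataclass
class SessionState:
    """Hypothetical in-memory view of a session's question state."""
    context: dict[str, Any] = field(default_factory=dict)
    suggestions: dict[str, Any] = field(default_factory=dict)


def accept_suggestion(
    session: SessionState,
    question_id: str,
    reprobe: Callable[[Mapping[str, Any]], list[str]],
) -> list[str]:
    """Promote an AI suggestion to confirmed context, then re-probe.

    Returns the requirement IDs that are applicable under the new context.
    Raises KeyError if there is no pending suggestion for the question.
    """
    value = session.suggestions.pop(question_id)  # clear the suggestion
    session.context[question_id] = value          # copy to confirmed context
    return reprobe(session.context)
```

Re-probing after every accepted answer matters because a single context change, such as marking a service as critical, can bring new requirements into scope.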

11. FILE LOCATION INDEX

What	Where
Database models	`vibe/review/models.py`
Requirements loading	`vibe/review/requirements.py::RequirementLoader`
Template probing	`vibe/review/template_functions.py::probe_template_for_requirements`
Document ingestion	`vibe/review/ingestion.py::DocumentIngester`
Document sources	`vibe/review/document_sources.py::DocumentSource, create_document_source`
Parsing pipeline	`vibe/review/parsing/` (see `parsing-pipeline.md`)
Filestore	`vibe/review/filestore.py`
OCR backends	`vibe/review/parsing/extraction/ocr/` (package: backend, extractor, analysis, health)
DOCX→PDF rendering	`vibe/review/docx_converter.py`
Hybrid search	`vibe/review/hybrid_search.py::HybridSearcher`
Retrieval	`vibe/review/retrieval/search.py::ReferenceSearcher`
Reranking	`vibe/review/retrieval/reranker.py::PartReranker`
Example retrieval	`vibe/review/retrieval/example_retriever.py::ExampleRetriever`
Requirement classifier	`vibe/review/classifier.py::RequirementClassifier`
Batch orchestration	`vibe/review/batch.py::BatchClassifier`
Assessment	`vibe/review/assessment.py::AssessmentClassifier`
Question answering	`vibe/review/question_answerer.py::QuestionAnswerer`
Progress base	`vibe/review/progress.py::BaseProgress`
Prompt building	`vibe/review/prompts.py::PromptBuilder`
Prompt templates	`vibe/review/templates/prompts/*.jinja2`
LLM client	`vibe/review/llm.py::BaseLLMClient`
Embedding providers	`vibe/providers/embedding/` (shared)
Rerank providers	`vibe/providers/reranking/` (shared)
ReviewService	`vibe/review/services/review_service.py`
Analytics service	`vibe/review/services/analytics_service.py`
Provider factory	`vibe/review/services/provider_factory.py`
Database utilities	`vibe/review/database.py`
Routes	`vibe/review/web/routes.py`
Interview mode ext.	`vibe/review/interview_mode.py::ReviewModeExtension`
HTML templates	`vibe/review/templates/review/`
Reference linking	`vibe/review/reference_linker.py::ReferenceLinker`
CLI commands	`vibe/review/cli.py`

Document Version: 2.0
Last Updated: 2026-02-02
Notes: Fixed config, method signatures, provider paths, prompt naming; condensed method tables, route tables, data flows; added document_sources, assessment, progress