Review Module Architecture
Architectural reference for VIBE's document compliance review system. Optimized for LLM consumption.
See also:
- core.md - Core VIBE engine
- components.md - Component system details
- assistant.md - AI-assisted interview system
- parsing-pipeline.md - Document parsing pipeline (4-layer architecture)
1. SYSTEM OVERVIEW
The Review module enables AI-assisted document compliance review against structured requirement frameworks. Unlike VIBE's core interview mode (which generates documents), Review mode analyzes uploaded documents against predefined requirements (e.g., DORA ICT contract compliance).
1.1 Core Philosophy
- A review template consists of requirements plus a reporting template (template.md/template.docx)
- Requirements use VIBE Core patterns: only relevant requirements need assessment (determined by template probing)
- Review is a template capability (interview_mode: review), not a standalone module
- Entry point: /interview/<template_id>/ redirects to /review/<template_id>/sessions
1.2 Key Capabilities
- Document Processing: Multi-document upload (Markdown, DOCX, PDF); documents segmented into parts with hierarchy and stable IDs (content-hash based)
- Hybrid Search: BM25 keyword + semantic embeddings retrieve candidate parts per requirement
- Two-Stage LLM Classification: Relevance filtering, then compliance evaluation (YES/NO/PARTIAL)
- Human-in-the-Loop: AI suggests classifications; humans verify/override; verified reviews promote to few-shot examples
- Output: Excel export (any state) and template-based report generation (finished reviews)
2. DATA MODEL
2.1 Database Models
Location: vibe/review/models.py
| Model | Purpose |
|---|---|
| ReviewSessionModel | Session linking template + documents (status: pending → processing → ready) |
| DocumentModel | Uploaded document metadata and status |
| DocumentPartModel | Document segments with embeddings, stable IDs, hierarchy metadata |
| RequirementReviewModel | Classification result per requirement (result, confidence, reasoning, supporting parts) |
| QuestionReviewModel | AI/human answers for template questions |
| ExampleModel | Few-shot examples for LLM classification |
| RequirementCacheModel | Cached requirement definitions with embeddings |
| ReferenceSourceModel | Regulatory source documents (DORA, EBA, etc.) |
| ReferencePartModel | Segments of regulatory sources |
Classification Results: YES, NO, PARTIAL, NOT_APPLICABLE, PENDING
2.2 Requirements Definition
Location: vibe/review/requirements.py
Requirements are defined in the template's config.yml under a unified groups format, with context questions and requirements co-located:
groups:
  audit_rights:
    title: Audit Rights
    questions:
      critical_service: { type: bool, label: Critical or Important Function }
    requirements:
      D5-1:
        label: Unlimited audit rights
        description: Full access to inspect ICT service provider
Template-Driven Applicability: {% if critical_service %}{{ req('D5-1') }}{% endif %}
Key Classes and Functions: RequirementLoader, RequirementSet, Requirement, probe_template_for_requirements()
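To make template-driven applicability concrete, here is a minimal sketch of probing. It assumes that probing means rendering the template with a req() function that records which requirement IDs are reached. The real templates are Jinja; the plain Python function below stands in for one, and probe and report_template are hypothetical names, not the module's API.

```python
from typing import Any, Callable, Mapping


def report_template(context: Mapping[str, Any], req: Callable[[str], str]) -> str:
    """Stand-in for a template containing
    {% if critical_service %}{{ req('D5-1') }}{% endif %}."""
    parts = []
    if context.get("critical_service"):
        parts.append(req("D5-1"))
    return "\n".join(parts)


def probe(template: Callable[..., str], context: Mapping[str, Any]) -> list[str]:
    """Render once with a recording req() and return the requirement IDs it reached."""
    seen: list[str] = []

    def req(requirement_id: str) -> str:
        if requirement_id not in seen:
            seen.append(requirement_id)
        return ""  # rendered output is irrelevant while probing

    template(context, req)
    return seen


applicable = probe(report_template, {"critical_service": True})
skipped = probe(report_template, {"critical_service": False})
```

With critical_service set, D5-1 is reported as applicable. Without it, the requirement is never reached and needs no assessment.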
3. DOCUMENT INGESTION
Location: vibe/review/ingestion.py, vibe/review/document_sources.py
Two-Phase Process:
- Upload (create_session) -- Store documents in the filestore and create a session with status=pending
- Ingest (stream_session_ingestion) -- Parse documents and compute embeddings, reporting progress over an SSE stream
Document Sources: The DocumentSource protocol (document_sources.py) provides per-format abstraction with MarkdownSource, PdfSource, DocxSource implementations. Factory: create_document_source().
Ingestion Pipeline:
1. Detect content type → create DocumentSource
2. Route to Parsing Pipeline (see parsing-pipeline.md)
3. Convert SemanticUnits to DocumentPartModels
4. Detect language (for multilingual embedding/prompts)
5. Generate embeddings (batch)
6. Store parts with metadata (hierarchy, bounding boxes)
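The content-hash based stable IDs mentioned in section 1.2 could be derived along the following lines. This is a sketch only: the whitespace normalization, the 16-character truncation, and the name stable_part_id are assumptions for illustration, not the module's actual scheme.

```python
import hashlib


def stable_part_id(text: str, length: int = 16) -> str:
    """Derive an ID from part content so re-ingesting unchanged text yields the same ID."""
    # Collapse whitespace so cosmetic re-wrapping does not change the ID (assumed rule).
    normalized = " ".join(text.split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return digest[:length]


id_a = stable_part_id("The provider grants  full audit rights.")
id_b = stable_part_id("The provider grants full\naudit rights.")
id_c = stable_part_id("The provider grants limited audit rights.")
```

Here id_a and id_b match because the two texts differ only in whitespace, while id_c differs because the wording changed.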
Filestore: (vibe/review/filestore.py) Content-addressed binary storage. store_bytes(data, suffix) → (sha256, path). Documents reference binaries via doc_metadata.filestore.sha256.
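A content-addressed store with the store_bytes(data, suffix) → (sha256, path) shape can be sketched as follows. The explicit root parameter and the sharded directory layout are assumptions made to keep the example self-contained; only the return shape comes from this document.

```python
import hashlib
import tempfile
from pathlib import Path


def store_bytes(root: Path, data: bytes, suffix: str) -> tuple[str, Path]:
    """Write data under a path derived from its SHA-256; identical content is stored once."""
    sha256 = hashlib.sha256(data).hexdigest()
    # Shard by the first two hex characters to keep directories small (assumed layout).
    path = root / sha256[:2] / f"{sha256}{suffix}"
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
    return sha256, path


root = Path(tempfile.mkdtemp())
first = store_bytes(root, b"contract text", ".pdf")
second = store_bytes(root, b"contract text", ".pdf")
```

Uploading the same bytes twice returns the same hash and path, which is what lets document metadata reference a binary by sha256 alone.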
4. CONFIGURATION
The Review module resolves configuration from the app_config mapping passed to ReviewService and ReviewProviderFactory:
- Database: DATABASE_URL from app config or environment
- OCR: review_backends.ocr in app config (backend, dpi, text_layer_min_chars)
- DOCX Conversion: review_backends.docx in app config (backend)
- LLM/Embedding/Rerank: Resolved via ReviewProviderFactory from template-level config
- Caching: OCR, PDF, and DOCX render caches via helper functions
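The resolution rules above could be sketched like this. The precedence (app config first, then environment), the default values, the helper names, and the example backend name are all assumptions; only the keys DATABASE_URL, review_backends.ocr, backend, dpi, and text_layer_min_chars come from this document.

```python
import os
from typing import Any, Mapping, Optional


def resolve_database_url(
    app_config: Optional[Mapping[str, Any]],
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Prefer the app config value, then fall back to the environment (assumed order)."""
    config = app_config or {}
    return config.get("DATABASE_URL") or environ.get("DATABASE_URL")


def resolve_ocr_settings(app_config: Optional[Mapping[str, Any]]) -> dict[str, Any]:
    """Read review_backends.ocr, filling gaps with placeholder defaults."""
    # Hypothetical defaults, chosen only so the example runs.
    defaults = {"backend": "none", "dpi": 300, "text_layer_min_chars": 50}
    backends = (app_config or {}).get("review_backends", {})
    return {**defaults, **backends.get("ocr", {})}


# "tesseract" is an illustrative backend name, not a documented option.
app_config = {"review_backends": {"ocr": {"backend": "tesseract", "dpi": 200}}}
url = resolve_database_url(app_config, environ={"DATABASE_URL": "sqlite:///review.db"})
ocr = resolve_ocr_settings(app_config)
```

Passing the mapping explicitly, rather than reading framework globals, is what allows the same resolution to run outside a web request.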
5. CLASSIFICATION PIPELINE
5.1 Retrieval
Location: vibe/review/retrieval/, vibe/review/hybrid_search.py
1. Generate query from requirement (label + description + help text)
2. BM25 + Embedding similarity search (HybridSearcher with RRF fusion)
3. Cross-encoder reranking (Reranker) for precision
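Reciprocal rank fusion (RRF), named in step 2, merges several ranked lists by summing 1 / (k + rank) for each item. The sketch below is a generic illustration of that formula, not HybridSearcher's code; k = 60 is a commonly used constant, and the function name and part IDs are made up.

```python
from collections import defaultdict
from typing import Iterable, Sequence


def rrf_fuse(rankings: Iterable[Sequence[str]], k: int = 60) -> list[str]:
    """Fuse ranked ID lists: each list contributes 1 / (k + rank) to an item's score."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, part_id in enumerate(ranking, start=1):
            scores[part_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda part_id: (-scores[part_id], part_id))


bm25_ranking = ["p3", "p1", "p7"]
embedding_ranking = ["p1", "p9", "p3"]
fused = rrf_fuse([bm25_ranking, embedding_ranking])
```

Parts that appear in both rankings (p1 and p3) rise above parts found by only one retriever, without any need to put BM25 and embedding scores on a common scale.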
5.2 Two-Stage LLM Classification
Location: vibe/review/classifier.py::RequirementClassifier
- Stage 1 -- Relevance: For each retrieved part, the LLM determines R (relevant) or N (not relevant).
- Stage 2 -- Compliance: For each relevant part, the LLM evaluates YES/NO/PARTIAL with confidence and reasoning.
Results are aggregated into the final classification.
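The two stages can be sketched with the LLM calls abstracted as plain callables. Everything here is illustrative: the callables, the Verdict type, and especially the aggregation rule (the highest-confidence verdict wins, and no verdict is returned when nothing is relevant) are assumptions, since this document does not specify how RequirementClassifier aggregates.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Verdict:
    result: str        # "YES", "NO", or "PARTIAL"
    confidence: float  # 0.0 to 1.0
    reasoning: str


def classify(
    parts: Sequence[str],
    is_relevant: Callable[[str], bool],          # stage 1: stands in for the R/N LLM call
    check_compliance: Callable[[str], Verdict],  # stage 2: stands in for the compliance LLM call
) -> Optional[Verdict]:
    """Filter parts by relevance, evaluate the survivors, then aggregate."""
    relevant = [part for part in parts if is_relevant(part)]
    verdicts = [check_compliance(part) for part in relevant]
    if not verdicts:
        return None  # no relevant evidence; the real handling of this case is not stated
    # Assumed aggregation rule: keep the verdict with the highest confidence.
    return max(verdicts, key=lambda verdict: verdict.confidence)


def fake_relevance(part: str) -> bool:
    return "audit" in part


def fake_compliance(part: str) -> Verdict:
    if "unlimited" in part:
        return Verdict("YES", 0.9, "grants unlimited audit rights")
    return Verdict("PARTIAL", 0.6, "audit rights are restricted")


final = classify(
    ["Fees are due monthly.", "Customer has unlimited audit rights.", "One audit per year."],
    fake_relevance,
    fake_compliance,
)
```

Running the cheap relevance check first means the more detailed compliance prompt is only sent for parts that plausibly bear on the requirement.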
5.3 Batch Classification
Location: vibe/review/batch.py::BatchClassifier
Orchestrates classification of all applicable requirements via SSE stream: load applicable requirements (template probe) → for each: retrieve → classify → store → emit progress.
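A minimal sketch of that loop as a generator of Server-Sent Events. The event names, payload fields, the classify_one callable, and the requirement ID D5-2 are hypothetical; only the retrieve → classify → store → emit ordering and the use of SSE come from this document.

```python
import json
from typing import Callable, Iterator, Sequence


def format_sse(event: str, data: dict) -> str:
    """Serialize one Server-Sent Event: an event line, a data line, then a blank line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_batch(
    requirement_ids: Sequence[str],
    classify_one: Callable[[str], str],  # stands in for retrieve → classify → store
) -> Iterator[str]:
    """Classify each requirement in turn and yield a progress event after each one."""
    total = len(requirement_ids)
    for done, requirement_id in enumerate(requirement_ids, start=1):
        result = classify_one(requirement_id)
        yield format_sse(
            "progress",
            {"requirement": requirement_id, "result": result, "done": done, "total": total},
        )
    yield format_sse("complete", {"total": total})


events = list(stream_batch(["D5-1", "D5-2"], lambda requirement_id: "PENDING"))
```

Because the function is a generator, a web framework can send each event to the browser as soon as the corresponding requirement finishes, rather than waiting for the whole batch.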
5.4 Unified Assessment
Location: vibe/review/assessment.py
AssessmentClassifier unifies question-answering and requirement-classification pipelines. AssessmentItem and AssessmentResult provide a common interface for both assessment types.
5.5 Few-Shot Examples
Location: vibe/review/retrieval/example_retriever.py
ExampleRetriever finds similar examples in the database. Users can promote matched document parts to examples from the workbench, creating a feedback loop in which human-verified classifications become few-shot examples for later classifications.
6. PROMPT SYSTEM
Location: vibe/review/templates/prompts/
Naming Convention: {type}_{role}_{lang}.jinja2
- Types: relevance, compliance, batch, question
- Roles: system, user
- Languages: en, sv, de, es, fr
- Partials prefixed with _ (e.g., _requirement_info_en.jinja2)
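The naming convention can be expressed as a small helper. This is a sketch: prompt_template_name is a hypothetical function, and whether a file exists on disk for every type, role, and language combination is not stated here.

```python
TYPES = {"relevance", "compliance", "batch", "question"}
ROLES = {"system", "user"}
LANGUAGES = {"en", "sv", "de", "es", "fr"}


def prompt_template_name(prompt_type: str, role: str, lang: str) -> str:
    """Build a {type}_{role}_{lang}.jinja2 file name, rejecting values outside the listed sets."""
    if prompt_type not in TYPES:
        raise ValueError(f"unknown prompt type: {prompt_type!r}")
    if role not in ROLES:
        raise ValueError(f"unknown role: {role!r}")
    if lang not in LANGUAGES:
        raise ValueError(f"unsupported language: {lang!r}")
    return f"{prompt_type}_{role}_{lang}.jinja2"


name = prompt_template_name("compliance", "system", "sv")
```

Validating against the listed sets turns a typo in a type or language code into an immediate error instead of a missing-template failure at render time.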
Prompt Building: vibe/review/prompts.py::PromptBuilder constructs prompts with requirement details, document context, few-shot examples, and output format instructions.
LLM Client: vibe/review/llm.py::BaseLLMClient -- Separate from vibe.providers.llm by design (review needs structured JSON output, not streaming).
7. SERVICE LAYER
7.1 ReviewService
Location: vibe/review/services/review_service.py::ReviewService
Facade providing a stable interface for web routes. Instantiated per request with db_session, template_provider, and app_config: Mapping[str, Any] | None (decoupled from Flask globals).
Responsibility Areas:
- Session lifecycle: list_sessions(limit), get_session, create_session, delete_session, stream_session_ingestion
- Requirements: get_template_requirement_set, load_session_requirements, get_requirement_groups, get_classification_stats
- Classification: classify_single_requirement, stream_ai_assessment, stream_batch_classification, save_human_classification
- Questions: get_template_questions, suggest_question_answer, save_human_question_answer
- Documents: get_matched_parts, render_document_html, open_document_binary_for_download, export_results_xlsx
- Examples: list_examples, get_example, update_example, delete_example
- Reports: render_report, build_template_context
Composed Services: ReviewAnalyticsService (vibe/review/services/analytics_service.py) -- accuracy calculation from human override patterns.
7.2 Provider Factory
Location: vibe/review/services/provider_factory.py::ReviewProviderFactory
Creates embedding, reranking, and LLM providers based on template config. OCR backend creation happens in ReviewService._create_cached_ocr_backend().
7.3 Progress Reporting
Location: vibe/review/progress.py
BaseProgress is the base class for all SSE progress updates in long-running operations (ingestion, classification).
8. WEB INTERFACE
Location: vibe/review/web/routes.py (Blueprint under /review/)
Workbench Layout: Three-pane UI:
- Left sidebar: Assessment items (questions + requirements) with status badges
- Center: Document viewer with part highlighting and navigation
- Right panel: Item detail with classification form or question input
htmx Interactions: Click item → loads detail panel; "Run AI" → classification via OOB swap; save → updates sidebar row; document part links → scroll viewer.
Templates: 12 top-level templates + 20 partials in vibe/review/templates/review/.
9. ENTRY FLOW & URL STRUCTURE
Interview Mode Extension: vibe/review/interview_mode.py::ReviewModeExtension intercepts templates with interview_mode: review and redirects to the review UI.
URL Pattern: All routes include <template_id> for template-scoped review:
/review/ # Global index
/review/<template_id>/sessions # Sessions list
/review/<template_id>/new # Upload form (GET/POST)
/review/<template_id>/<session_id>/ingest # Ingestion progress (SSE)
/review/<template_id>/<session_id> # Workbench
/review/<template_id>/<session_id>/item/... # Assessment item partials
/review/<template_id>/<session_id>/requirement/... # Classification endpoints
/review/<template_id>/<session_id>/export # Excel export
10. DATA FLOW DIAGRAMS
10.1 New Review Session
User uploads document(s) via /review/<template_id>/new
  → Store binaries in filestore
  → Create ReviewSessionModel (status=pending) + DocumentModels
  → Redirect to ingest route
SSE ingestion stream:
  → For each document:
    → create_document_source() → DocumentSource
    → Parse via pipeline, segment into parts with stable IDs
    → Detect language, compute embeddings (batch)
    → SSE progress events
  → Update session status to ready
  → SSE: complete (redirect to workbench)
10.2 Single Requirement Classification
User clicks "Run AI" for requirement D5-1
  → ReviewService.classify_single_requirement
  → HybridSearcher.search (BM25 + semantic) → Reranker.rerank
  → RequirementClassifier.classify:
    → Relevance check (LLM) per candidate part
    → Compliance check (LLM) for relevant parts
    → Aggregate → final classification
  → Store RequirementReviewModel
  → htmx: updated detail panel + sidebar row (OOB)
10.3 Context Question AI Suggestion
User clicks "Suggest" for question
  → HybridSearcher.search (query from question label/help)
  → QuestionAnswerer.answer (LLM structured output)
  → Store suggestion in session.suggestions
User clicks "Accept"
  → Copy to session.context, clear suggestion
  → Re-probe template (new context may change applicable requirements)
11. FILE LOCATION INDEX
| What | Where |
|---|---|
| Database models | vibe/review/models.py |
| Requirements loading | vibe/review/requirements.py::RequirementLoader |
| Template probing | vibe/review/template_functions.py::probe_template_for_requirements |
| Document ingestion | vibe/review/ingestion.py::DocumentIngester |
| Document sources | vibe/review/document_sources.py::DocumentSource, create_document_source |
| Parsing pipeline | vibe/review/parsing/ (see parsing-pipeline.md) |
| Filestore | vibe/review/filestore.py |
| OCR backends | vibe/review/parsing/extraction/ocr/ (package: backend, extractor, analysis, health) |
| DOCX→PDF rendering | vibe/review/docx_converter.py |
| Hybrid search | vibe/review/hybrid_search.py::HybridSearcher |
| Retrieval | vibe/review/retrieval/search.py::ReferenceSearcher |
| Reranking | vibe/review/retrieval/reranker.py::PartReranker |
| Example retrieval | vibe/review/retrieval/example_retriever.py::ExampleRetriever |
| Requirement classifier | vibe/review/classifier.py::RequirementClassifier |
| Batch orchestration | vibe/review/batch.py::BatchClassifier |
| Assessment | vibe/review/assessment.py::AssessmentClassifier |
| Question answering | vibe/review/question_answerer.py::QuestionAnswerer |
| Progress base | vibe/review/progress.py::BaseProgress |
| Prompt building | vibe/review/prompts.py::PromptBuilder |
| Prompt templates | vibe/review/templates/prompts/*.jinja2 |
| LLM client | vibe/review/llm.py::BaseLLMClient |
| Embedding providers | vibe/providers/embedding/ (shared) |
| Rerank providers | vibe/providers/reranking/ (shared) |
| ReviewService | vibe/review/services/review_service.py |
| Analytics service | vibe/review/services/analytics_service.py |
| Provider factory | vibe/review/services/provider_factory.py |
| Database utilities | vibe/review/database.py |
| Routes | vibe/review/web/routes.py |
| Interview mode ext. | vibe/review/interview_mode.py::ReviewModeExtension |
| HTML templates | vibe/review/templates/review/ |
| Reference linking | vibe/review/reference_linker.py::ReferenceLinker |
| CLI commands | vibe/review/cli.py |
Document Version: 2.0
Last Updated: 2026-02-02
Notes: Fixed config, method signatures, provider paths, prompt naming; condensed method tables, route tables, data flows; added document_sources, assessment, progress