vibe.review.batch

Batch classification for review sessions.

Orchestrates the full classification pipeline:

1. For each requirement:
   a. Retrieve relevant document parts (hybrid search + optional rerank)
   b. Retrieve few-shot examples
   c. Classify using LLM
2. Store results in database
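The pipeline steps can be sketched as a plain loop. This is a minimal illustration, not the module's implementation: `run_pipeline`, `Classification`, the three callables, and the in-memory `store` list are all hypothetical stand-ins for the real retrieval, example loading, LLM, and database layers.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Classification:
    """Hypothetical result record for one requirement."""
    requirement: str
    label: str
    parts: list[str]


def run_pipeline(
    requirements: list[str],
    retrieve_parts: Callable[[str], list[str]],
    load_examples: Callable[[str], list[str]],
    classify: Callable[[str, list[str], list[str]], str],
    store: list[Classification],
) -> list[Classification]:
    results = []
    for requirement in requirements:
        parts = retrieve_parts(requirement)              # step 1a
        examples = load_examples(requirement)            # step 1b
        label = classify(requirement, parts, examples)   # step 1c
        results.append(Classification(requirement, label, parts))
    store.extend(results)                                # step 2
    return results


# Toy stand-ins so the sketch runs end to end.
database: list[Classification] = []
results = run_pipeline(
    requirements=["R1", "R2"],
    retrieve_parts=lambda req: [f"part for {req}"],
    load_examples=lambda req: [],
    classify=lambda req, parts, examples: "met" if parts else "unknown",
    store=database,
)
```

Passing the stages in as callables keeps the loop testable without a database or an LLM.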

The batch classifier is used both from CLI/integration tests and from the Review workbench (SSE progress streaming).

SingleClassificationProgress

Progress information for single-requirement classification stages.

to_sse_data

to_sse_data() -> str

Format as SSE data payload.
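The exact payload is not documented here. As a rough sketch, assuming the payload is the object's fields encoded as JSON, a progress object and its Server-Sent Events framing might look like this. `StageProgress`, its fields, and `frame_sse_event` are hypothetical names used only for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class StageProgress:
    """Hypothetical stand-in for a progress object."""
    stage: str
    message: str = ""

    def to_sse_data(self) -> str:
        # Assumption: the payload is the object's fields encoded as JSON.
        return json.dumps(asdict(self))


def frame_sse_event(data: str) -> str:
    # SSE wire format: every data line is prefixed with "data: ",
    # and a blank line terminates the event.
    lines = data.splitlines() or [""]
    return "".join(f"data: {line}\n" for line in lines) + "\n"


payload = StageProgress(stage="searching", message="Starting hybrid search").to_sse_data()
event = frame_sse_event(payload)
```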

QuestionSuggestionProgress

Progress information for question suggestion stages.

to_sse_data

to_sse_data() -> str

Format as SSE data payload.

BatchProgress

Progress information for batch classification.

to_dict

to_dict() -> dict[str, Any]

Override to include batch-specific fields.

BatchResult

Result of batch classification.

to_dict

to_dict() -> dict[str, Any]

Serialize batch result to dictionary with ISO timestamps.
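A minimal sketch of ISO timestamp serialization, assuming the result carries `datetime` fields. `ResultSketch` and its field names are hypothetical; the actual BatchResult fields are not listed in this reference.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any


@dataclass
class ResultSketch:
    """Hypothetical stand-in for a batch result."""
    total: int
    started_at: datetime
    completed_at: datetime

    def to_dict(self) -> dict[str, Any]:
        # datetime objects are not JSON-serializable, so emit ISO 8601 strings.
        return {
            "total": self.total,
            "started_at": self.started_at.isoformat(),
            "completed_at": self.completed_at.isoformat(),
        }


result = ResultSketch(
    total=3,
    started_at=datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
    completed_at=datetime(2024, 1, 1, 12, 0, 30, tzinfo=timezone.utc),
)
serialized = result.to_dict()
```

ISO strings round-trip through `datetime.fromisoformat`, so a consumer can recover the original timestamps.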

BatchClassifier

Classifies all requirements for a document.

Orchestrates retrieval, example loading, and classification for each requirement in a review session.

Usage

    batch = BatchClassifier(db_session)
    result = await batch.classify_document(
        review_session=review_session,
        requirements=requirements,
        on_progress=lambda p: print(f"{p.current}/{p.total}"),
    )

__init__

__init__(db_session: Session, llm_client: BaseLLMClient | None = None, embedding_provider: EmbeddingProvider | None = None, rerank_provider: RerankProvider | None = None, search_limit: int | None = None, rerank_top_n: int = 10, max_parts_per_requirement: int = 3, skip_relevance_check: bool = False, collect_timings: bool = False) -> None

Initialize batch classifier.

Parameters:
  • db_session (Session) –

    SQLAlchemy session.

  • llm_client (BaseLLMClient | None, default: None ) –

    LLM client for classification.

  • embedding_provider (EmbeddingProvider | None, default: None ) –

    Provider for embeddings (search). Auto-selects if None.

  • rerank_provider (RerankProvider | None, default: None ) –

    Provider for reranking. Auto-selects if None.

  • search_limit (int | None, default: None ) –

    Max candidates from hybrid search (auto if None).

  • rerank_top_n (int, default: 10 ) –

    Top N after reranking.

  • max_parts_per_requirement (int, default: 3 ) –

    Max parts to classify per requirement.

  • skip_relevance_check (bool, default: False ) –

    Skip relevance step.

  • collect_timings (bool, default: False ) –

    If True, track retrieval timing in _last_retrieval_time_ms.
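One plausible reading of how `search_limit`, `rerank_top_n`, and `max_parts_per_requirement` interact is a three-stage funnel, sketched below. `select_parts` and the score dictionary are hypothetical, and the real classifier may apply these limits differently (for example, around the relevance check).

```python
from typing import Optional


def select_parts(
    candidates: list[str],
    rerank_scores: dict[str, float],
    search_limit: Optional[int] = None,
    rerank_top_n: int = 10,
    max_parts_per_requirement: int = 3,
) -> list[str]:
    # Stage 1: cap the hybrid-search candidates (None means no cap here).
    pool = candidates[:search_limit]
    # Stage 2: order by rerank score and keep the top N.
    reranked = sorted(pool, key=lambda part: rerank_scores[part], reverse=True)
    top = reranked[:rerank_top_n]
    # Stage 3: keep only as many parts as will be classified.
    return top[:max_parts_per_requirement]


candidates = ["p1", "p2", "p3", "p4", "p5"]
scores = {"p1": 0.2, "p2": 0.9, "p3": 0.5, "p4": 0.7, "p5": 0.99}
selected = select_parts(
    candidates, scores, search_limit=4, rerank_top_n=3, max_parts_per_requirement=2
)
```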

classify_document

classify_document(review_session: ReviewSessionModel, requirements: list[Any], on_progress: Callable[[BatchProgress], None] | None = None) -> BatchResult

Run classification for all requirements.

Parameters:
  • review_session (ReviewSessionModel) –

    The review session to classify.

  • requirements (list[Any]) –

    List of requirements to evaluate.

  • on_progress (Callable[[BatchProgress], None] | None, default: None ) –

    Callback for progress updates.

Returns:
  • BatchResult –

    Result of batch classification.

iter_classify_document

iter_classify_document(review_session: ReviewSessionModel, requirements: list[Any]) -> Iterator[BatchProgress]

Classify all requirements, yielding progress updates after each.
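The relationship between an iterator-style API and a callback-style API can be sketched as follows. This is an illustration under assumptions, not the module's code: `Progress`, `iter_classify`, and `classify_all` are hypothetical, and only the `current` and `total` fields are taken from the usage example above.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, Optional


@dataclass
class Progress:
    """Hypothetical stand-in for a batch progress update."""
    current: int
    total: int


def iter_classify(requirements: list[str]) -> Iterator[Progress]:
    total = len(requirements)
    for index, _requirement in enumerate(requirements, start=1):
        # The real classification work would happen here.
        yield Progress(current=index, total=total)


def classify_all(
    requirements: list[str],
    on_progress: Optional[Callable[[Progress], None]] = None,
) -> int:
    # A callback API can be layered on the iterator by draining it.
    done = 0
    for progress in iter_classify(requirements):
        if on_progress is not None:
            on_progress(progress)
        done = progress.current
    return done


seen: list[Progress] = []
completed = classify_all(["R1", "R2", "R3"], on_progress=seen.append)
```

The iterator form suits streaming consumers such as an SSE endpoint, while the callback form suits CLI runs that only want a final result.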

get_last_result

get_last_result() -> BatchResult | None

Return the result from the most recent classify_document run.

iter_classify_single

iter_classify_single(review_session: ReviewSessionModel, requirement: Requirement) -> Iterator[SingleClassificationProgress]

Classify a single requirement with progress updates at each stage.

Yields SingleClassificationProgress at each stage:

  • "searching": Starting hybrid search
  • "reranking": Found candidates, starting rerank
  • "assessing": Ranked parts, starting LLM assessment
  • "complete": Classification done (includes result)
  • "error": An error occurred

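Parameters:
  • review_session (ReviewSessionModel) –

    The review session to classify.

  • requirement (Requirement) –

    The requirement to classify.

Yields:
  • SingleClassificationProgress –

    Progress update for each stage.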
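A consumer of a staged progress stream typically reads updates until it reaches a terminal stage. The sketch below uses the stage names listed above; `StageUpdate`, its `result` and `error` fields, and the fake generator are hypothetical stand-ins, since the actual attributes of SingleClassificationProgress are not listed in this reference.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class StageUpdate:
    """Hypothetical stand-in for a single-requirement progress update."""
    stage: str
    result: Optional[str] = None
    error: Optional[str] = None


def fake_single_classification(fail: bool = False) -> Iterator[StageUpdate]:
    yield StageUpdate("searching")
    yield StageUpdate("reranking")
    if fail:
        yield StageUpdate("error", error="LLM unavailable")
        return
    yield StageUpdate("assessing")
    yield StageUpdate("complete", result="compliant")


def consume(updates: Iterable[StageUpdate]) -> str:
    for update in updates:
        if update.stage == "complete":
            return update.result or ""
        if update.stage == "error":
            raise RuntimeError(update.error)
        # Non-terminal stages would be forwarded to the UI here.
    raise RuntimeError("stream ended without a terminal stage")


outcome = consume(fake_single_classification())
```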

close

close() -> None

Close owned resources.
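"Owned" commonly means resources the object created itself, as opposed to ones injected by the caller. A minimal sketch of that pattern, under that assumption, with hypothetical `Client` and `Worker` classes:

```python
from contextlib import closing
from typing import Optional


class Client:
    """Hypothetical resource with a close() method."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


class Worker:
    def __init__(self, client: Optional[Client] = None) -> None:
        # Remember whether the client was created here or passed in.
        self._owns_client = client is None
        self._client = client if client is not None else Client()

    def close(self) -> None:
        # Only close what this object created; injected clients
        # remain the caller's responsibility.
        if self._owns_client:
            self._client.close()


shared = Client()
with closing(Worker(client=shared)):
    pass

with closing(Worker()) as owner:
    internal = owner._client
```

`contextlib.closing` calls `close()` on exit, so any object exposing that method can be used in a `with` block.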