vibe.review.search

Reference part search using the unified hybrid search abstraction.

Provides search over ReferencePartModel (regulatory sources like DORA, EBA) for finding relevant reference text during compliance reviews.

EmbeddingDimensionMismatchWarning

Warning issued when the query embedding and the stored embeddings have different dimensions.
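A minimal sketch of how a warning of this kind is typically issued and captured with the standard `warnings` module. The class below is a local stand-in: its `UserWarning` base class and the `check_dimensions` helper are assumptions for illustration, not the library's actual definitions.

```python
import warnings


class EmbeddingDimensionMismatchWarning(UserWarning):
    """Local stand-in; the real base class is an assumption."""


def check_dimensions(query_dim: int, stored_dim: int) -> bool:
    """Hypothetical helper: warn and return False when dimensions differ."""
    if query_dim != stored_dim:
        warnings.warn(
            f"Query embedding has {query_dim} dimensions, stored has {stored_dim}",
            EmbeddingDimensionMismatchWarning,
            stacklevel=2,
        )
        return False
    return True


# Record the warning instead of letting it print to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ok = check_dimensions(1536, 768)

mismatch_seen = any(
    issubclass(w.category, EmbeddingDimensionMismatchWarning) for w in caught
)
```

In a test suite, `warnings.simplefilter("error", EmbeddingDimensionMismatchWarning)` would turn the same condition into an exception.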

SearchResult

A single search result with scoring information.

semantic_rank

semantic_rank: int | None

Alias for embedding_rank (semantic search).
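One common way to expose such an alias is a read-only property. The sketch below assumes a dataclass layout; `SearchResultSketch`, `part_id`, `score`, and `bm25_rank` are hypothetical names, and only `embedding_rank` and `semantic_rank` come from this reference.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchResultSketch:
    """Hypothetical stand-in for SearchResult."""

    part_id: str                           # hypothetical identifier field
    score: float                           # hypothetical fused score
    bm25_rank: Optional[int] = None        # hypothetical: rank in the BM25 list
    embedding_rank: Optional[int] = None   # rank in the embedding list, if any

    @property
    def semantic_rank(self) -> Optional[int]:
        """Read-only alias for embedding_rank."""
        return self.embedding_rank


result = SearchResultSketch("part-1", 0.03, bm25_rank=2, embedding_rank=1)
keyword_only = SearchResultSketch("part-2", 0.01, bm25_rank=1)
```

Here `keyword_only.semantic_rank` is `None` because the part was found only by keyword search.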

SearchResults

Collection of search results with metadata.

semantic_count

semantic_count: int

Alias for embedding_count (semantic search).

total_candidates

total_candidates: int

Total number of candidates from BM25 and embedding search.

ReferenceSearcher

Hybrid searcher for reference parts (regulatory sources).

Combines BM25 keyword search with embedding similarity search using Reciprocal Rank Fusion (RRF) for ranking.
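As a rough illustration of the fusion step, the sketch below implements a weighted form of RRF in which each ranked list contributes `weight / (k + rank)` to a document's score. The function name `rrf_fuse` and the exact way the weights enter the formula are assumptions; the library's internals may differ.

```python
def rrf_fuse(bm25_ids, embedding_ids, k=60, bm25_weight=0.5, embedding_weight=0.5):
    """Fuse two ranked ID lists with weighted Reciprocal Rank Fusion.

    Ranks are 1-based; a document missing from a list gets no
    contribution from that list.
    """
    scores = {}
    for weight, ranked in ((bm25_weight, bm25_ids), (embedding_weight, embedding_ids)):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


fused = rrf_fuse(["a", "b", "c"], ["b", "d", "a"])
order = [doc_id for doc_id, _ in fused]  # ["b", "a", "d", "c"]
```

Documents that appear in both lists ("a" and "b") outrank those found by only one method, which is the main effect of the fusion.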

Usage

searcher = ReferenceSearcher(session)
results = searcher.search(
    "ICT third-party risk management",
    language="en",
    top_k=20,
)

__init__

__init__(session: Session, embedding_provider: EmbeddingProvider | None = None, rrf_k: int = 60) -> None

Initialize the searcher.

Parameters:
  • session (Session) –

    SQLAlchemy session.

  • embedding_provider (EmbeddingProvider | None, default: None ) –

    Provider for query embeddings.

  • rrf_k (int, default: 60 ) –

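    RRF constant k. Each ranked list contributes in proportion to 1 / (k + rank), so larger values flatten the differences between ranks.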

search

search(query: str, language: str | None = None, source_id: str | None = None, top_k: int = 20, bm25_weight: float = 0.5, embedding_weight: float = 0.5, bm25_limit: int = 100, embedding_limit: int = 100) -> ReferenceSearchResults

Perform hybrid search for reference parts.

Parameters:
  • query (str) –

    Search query text.

  • language (str | None, default: None ) –

    Filter by language (e.g., "en", "sv").

  • source_id (str | None, default: None ) –

    Filter by source ID (e.g., "dora_2022_2554").

  • top_k (int, default: 20 ) –

    Maximum number of results to return.

  • bm25_weight (float, default: 0.5 ) –

    Weight for BM25 in RRF.

  • embedding_weight (float, default: 0.5 ) –

    Weight for embedding in RRF.

  • bm25_limit (int, default: 100 ) –

    Maximum number of BM25 candidates to retrieve before fusion.

  • embedding_limit (int, default: 100 ) –

    Maximum number of embedding candidates to retrieve before fusion.

Returns:
  • ReferenceSearchResults

    SearchResults with ranked results.
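The parameters above can be combined to restrict a search to one source and lean toward keyword matches. The helper below is a hypothetical sketch: it accepts any object with a compatible `search` method (such as a ReferenceSearcher), and the weight and limit values are illustrative, not tuned recommendations.

```python
from typing import Any


def keyword_leaning_search(searcher: Any, query: str, source_id: str) -> Any:
    """Hypothetical helper: favor exact regulatory wording over semantic matches."""
    return searcher.search(
        query,
        language="en",
        source_id=source_id,     # e.g. "dora_2022_2554"
        top_k=5,
        bm25_weight=0.7,         # illustrative value
        embedding_weight=0.3,    # illustrative value
        bm25_limit=50,           # fewer candidates per method than the default 100
        embedding_limit=50,
    )
```

A call such as `keyword_leaning_search(searcher, "ICT third-party risk management", "dora_2022_2554")` would then return whatever `search` returns for that source.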

close

close() -> None

Close owned resources (no-op, kept for API compatibility).

search_references

search_references(query: str, language: str | None = None, source_id: str | None = None, top_k: int = 20) -> ReferenceSearchResults

Perform a one-off reference search.

Create a session and a searcher, perform the search, and clean up both. To run multiple searches, use ReferenceSearcher directly so the same session is reused.
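The create, search, clean up sequence might look roughly like the sketch below. `one_off_search`, `make_session`, and `make_searcher` are hypothetical names standing in for the session factory and the ReferenceSearcher constructor; this is not the library's implementation.

```python
from contextlib import closing
from typing import Any, Callable


def one_off_search(
    make_session: Callable[[], Any],
    make_searcher: Callable[[Any], Any],
    query: str,
    **filters: Any,
) -> Any:
    """Hypothetical pattern: open resources, search once, always clean up."""
    session = make_session()
    try:
        # closing() calls searcher.close() on exit, even if search raises.
        with closing(make_searcher(session)) as searcher:
            return searcher.search(query, **filters)
    finally:
        session.close()
```

Because every call pays the setup and teardown cost, a loop over many queries is better served by one long-lived searcher.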