vibe.embedding_providers.local

Local embedding provider using fastembed.

Runs embedding models locally via the ONNX runtime. Supports GPU acceleration through OpenVINO when available, and is much lighter than sentence-transformers (~200MB vs ~2GB).

LocalEmbeddingProvider

Local embedding provider using fastembed (ONNX-based).

Uses paraphrase-multilingual-MiniLM-L12-v2 by default (384 dimensions), which provides good multilingual support (~50 languages) with fast CPU inference.

Configuration

model: Model preset or fastembed model ID.
    Presets: multilingual (default), multilingual-large, e5-large, bge-small, bge-base, minilm
dimension: Override dimension (usually auto-detected)

Example

# Use default (multilingual)
provider = LocalEmbeddingProvider()

# Use larger multilingual model
provider = LocalEmbeddingProvider({"model": "multilingual-large"})
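A further sketch, combining the two Configuration keys above; bge-small already produces 384-dimensional vectors, so the override here is purely illustrative.

# Pin the output dimension explicitly (usually unnecessary; it is auto-detected)
provider = LocalEmbeddingProvider({"model": "bge-small", "dimension": 384})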

embed

embed(text: str) -> list[float]

Generate an embedding for a single text.
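A minimal usage sketch; the length check assumes the default multilingual model noted above.

provider = LocalEmbeddingProvider()
vector = provider.embed("Hello, world")
assert len(vector) == 384  # dimension of the default multilingual MiniLM model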

embed_batch

embed_batch(texts: list[str]) -> list[list[float]]

Generate embeddings for multiple texts.
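A sketch of batch usage with the same default provider; one embedding is returned per input text.

provider = LocalEmbeddingProvider()
vectors = provider.embed_batch(["first document", "second document"])
assert len(vectors) == 2       # one vector per input
assert len(vectors[0]) == 384  # each with the model's dimension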

get_onnx_providers

get_onnx_providers(gpu: bool = True) -> Sequence[str | tuple[str, dict[str, Any]]] | None

Get optimal ONNX execution providers.

Prefers the OpenVINO GPU execution provider when available, for roughly a 3x speedup on Intel iGPUs. Falls back to CPU if no GPU is available or GPU use is disabled.

Parameters:
  • gpu (bool, default: True) –

    Whether to try GPU acceleration.

Returns:
  • Sequence[str | tuple[str, dict[str, Any]]] | None

    List of providers for fastembed, or None to use defaults.
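A minimal sketch of the fallback contract described above; a None result delegates provider selection to fastembed's defaults (how a non-None list is then handed to fastembed is left out here).

# Prefer GPU; None signals "use fastembed's default execution providers"
providers = get_onnx_providers(gpu=True)
if providers is None:
    print("using fastembed's default execution providers")
else:
    print(f"preferred execution providers: {providers}")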