vibe.review.parsing.rules.predicates¶

Predicate evaluation for the rules engine using Python expressions.

Predicates are Python expressions evaluated against nodes. They support full Python syntax including boolean operators, comparisons, and function calls.

Examples:

- "node.is_bold"
- "len(node.text) < 100"
- "node.is_bold and len(node.text) < 50"
- "'definition' in node.text.lower()"
- "re.search(r'^\d+\.', node.text)"
- "node.bbox.width > 400"
- "has_verb(node.text)" # Uses spaCy for verb detection
- "previous() and previous().is_heading" # Check previous node in list
- "next() and 'continued' in next().text.lower()" # Check next node

Context available in predicates:

- node: The node being evaluated
- ctx: Additional context (page dimensions, helper functions)
- re: The re module for regex operations

Built-in predicate functions:

- has_verb(text, language="sv"): Check if text contains a verb (requires spaCy)

List traversal functions (available when the rule engine provides a traversal context):

- next(): Returns the next node in the list, or None if at the end
- previous(): Returns the previous node in the list, or None if at the start
- node_index(): Returns the 0-based index of the current node
- nodes_count(): Returns the total number of nodes in the list
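
As a rough illustration of how an expression predicate can see `node`, `ctx`, and `re`, the sketch below compiles a string with Python's built-in `compile()` and evaluates it against a small namespace. The helper name `evaluate_expression` and the stand-in node built from `SimpleNamespace` are hypothetical and not part of this module; the real evaluator may build its namespace differently and applies its own safety checks.

```python
import re
from types import SimpleNamespace


def evaluate_expression(expression, node, ctx=None):
    """Evaluate a predicate string with node, ctx, and re in scope (illustrative only)."""
    code = compile(expression, "<predicate>", "eval")
    namespace = {
        # Assumption: only explicitly listed names are exposed to the expression.
        # Emptying __builtins__ is not a real sandbox; it only narrows the namespace.
        "__builtins__": {},
        "len": len,
        "node": node,
        "ctx": ctx if ctx is not None else {},
        "re": re,
    }
    return bool(eval(code, namespace))


# Hypothetical node with the attributes used in the examples above.
node = SimpleNamespace(
    text="1. Definitions",
    is_bold=True,
    bbox=SimpleNamespace(width=420),
)

bold_and_short = evaluate_expression("node.is_bold and len(node.text) < 50", node)
numbered = evaluate_expression(r"re.search(r'^\d+\.', node.text)", node)
wide = evaluate_expression("node.bbox.width > 400", node)
print(bold_and_short, numbered, wide)
```

Wrapping the result in `bool()` matters for expressions such as `re.search(...)`, which return a match object or `None` rather than a boolean.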

CompiledPredicate ¶

A compiled predicate expression ready for evaluation.

Attributes:

- `expression` (`str`) – Original expression string.
- `code` (`CodeType | None`) – Compiled Python code object.
- `is_function_ref` (`bool`) – True if this is a predicate_function reference.
- `function_name` (`str | None`) – Name of the function (if is_function_ref).
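
The attribute list above maps naturally onto a small record type. The sketch below uses a dataclass named `CompiledPredicateSketch`, which is a hypothetical stand-in; whether the real `CompiledPredicate` is a dataclass, and what defaults it uses, is not stated in this reference.

```python
from dataclasses import dataclass
from types import CodeType
from typing import Optional


@dataclass
class CompiledPredicateSketch:
    """Hypothetical mirror of the documented attributes; defaults are assumptions."""

    expression: str
    code: Optional[CodeType] = None
    is_function_ref: bool = False
    function_name: Optional[str] = None


# An expression predicate carries a compiled code object.
expression_predicate = CompiledPredicateSketch(
    expression="node.is_bold",
    code=compile("node.is_bold", "<predicate>", "eval"),
)

# A function reference carries a name instead of code.
function_predicate = CompiledPredicateSketch(
    expression="is_short",
    is_function_ref=True,
    function_name="is_short",
)
```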

PredicateEvaluator ¶

Evaluate Python expression predicates against nodes.

Supports:

- Full Python expression syntax
- Boolean operators (and, or, not)
- Comparisons (==, !=, <, <=, >, >=, in, not in)
- String methods (.startswith(), .endswith(), .lower(), etc.)
- Regex via re module (re.search(), re.match())
- Nested attribute access (node.bbox.width)
- predicate_function references to external Python functions

__init__ ¶

__init__(context: dict[str, Any] | None = None, predicate_functions_dir: Path | None = None) -> None

Initialize the evaluator.

Parameters:

- `context` (`dict[str, Any] | None`, default: `None`) – Additional context variables available as `ctx` in predicates.
- `predicate_functions_dir` (`Path | None`, default: `None`) – Directory containing predicates.py for function loading.

set_traversal_context ¶

set_traversal_context(nodes: Sequence[Any] | None, current_idx: int = -1) -> None

Set the list context for next()/previous() traversal functions.

Call this before evaluating predicates when processing a list of nodes, to enable predicates like:

- "previous() and previous().is_heading"
- "next() and 'continued' in next().text.lower()"

Parameters:

- `nodes` (`Sequence[Any] | None`) – The list of nodes being processed, or None to clear.
- `current_idx` (`int`, default: `-1`) – Index of the current node in the list.
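
One way to picture the traversal context is as a set of closures over the node list and the current index. The sketch below is an assumption about the mechanism, not the evaluator's actual code; `make_traversal_functions` and `evaluate_with_traversal` are hypothetical names, and the nodes are plain `SimpleNamespace` stand-ins.

```python
from types import SimpleNamespace


def make_traversal_functions(nodes, current_idx=-1):
    """Build next/previous/node_index/nodes_count closures (illustrative only)."""
    count = len(nodes) if nodes is not None else 0

    def next_node():
        # None at the end of the list or when no valid index is set.
        if 0 <= current_idx < count - 1:
            return nodes[current_idx + 1]
        return None

    def previous_node():
        # None at the start of the list or when no valid index is set.
        if 0 < current_idx < count:
            return nodes[current_idx - 1]
        return None

    return {
        "next": next_node,
        "previous": previous_node,
        "node_index": lambda: current_idx,
        "nodes_count": lambda: count,
    }


def evaluate_with_traversal(expression, nodes, current_idx):
    """Evaluate an expression with the current node and traversal helpers in scope."""
    namespace = {"__builtins__": {}, "node": nodes[current_idx]}
    namespace.update(make_traversal_functions(nodes, current_idx))
    return bool(eval(expression, namespace))


nodes = [
    SimpleNamespace(text="Scope", is_heading=True),
    SimpleNamespace(text="This section applies to all units.", is_heading=False),
    SimpleNamespace(text="Continued on next page", is_heading=False),
]

follows_heading = evaluate_with_traversal("previous() and previous().is_heading", nodes, 1)
first_follows_heading = evaluate_with_traversal("previous() and previous().is_heading", nodes, 0)
next_is_continued = evaluate_with_traversal("next() and 'continued' in next().text.lower()", nodes, 1)
```

The `previous() and ...` guard in the expressions is what keeps the first node from raising an attribute error: `previous()` returns None there, so the right-hand side is never evaluated.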

clear_traversal_context ¶

clear_traversal_context() -> None

Clear the traversal context.

parse ¶

parse(expression: str) -> CompiledPredicate | None

Parse and compile a predicate expression string.

This is the main entry point for compiling predicates from YAML.

Parameters:	`expression` (`str`) – Python expression like "node.is_bold and len(node.text) < 100".

Returns:	`CompiledPredicate | None` – A CompiledPredicate object, or None if compilation fails.

compile ¶

compile(expression: str) -> CompiledPredicate

Compile and validate a predicate expression.

Parameters:	`expression` (`str`) – Python expression string.

Returns:	`CompiledPredicate` – CompiledPredicate ready for evaluation.

Raises:	`ValueError` – If expression is invalid or contains unsafe constructs.
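
The difference between `parse` (returns None on failure) and `compile` (raises `ValueError`) can be sketched with Python's `ast` module. This reference does not say which constructs count as unsafe, so the checks below (double-underscore names and lambdas) are assumptions chosen for illustration, and `compile_checked` and `parse_lenient` are hypothetical names.

```python
import ast


def compile_checked(expression):
    """Compile an expression, raising ValueError on syntax errors or flagged constructs."""
    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError as exc:
        raise ValueError(f"Invalid predicate expression: {exc}") from exc

    for child in ast.walk(tree):
        # Assumption: double-underscore access is treated as unsafe.
        if isinstance(child, ast.Attribute) and child.attr.startswith("__"):
            raise ValueError(f"Unsafe attribute access: {child.attr}")
        if isinstance(child, ast.Name) and child.id.startswith("__"):
            raise ValueError(f"Unsafe name: {child.id}")
        # Assumption: lambdas are rejected as well.
        if isinstance(child, ast.Lambda):
            raise ValueError("Lambda expressions are not allowed")

    return compile(tree, "<predicate>", "eval")


def parse_lenient(expression):
    """Return a code object, or None when validation fails."""
    try:
        return compile_checked(expression)
    except ValueError:
        return None


valid_code = parse_lenient("node.is_bold and len(node.text) < 100")
broken_code = parse_lenient("node.is_bold and")
unsafe_code = parse_lenient("node.__class__")
```

Because the expression is parsed in `eval` mode, statements such as `import os` fail at the syntax stage and never reach the tree walk.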

compile_function_ref ¶

compile_function_ref(function_name: str) -> CompiledPredicate

Create a predicate that references an external function.

Parameters:	`function_name` (`str`) – Name of the function in predicates.py.

Returns:	`CompiledPredicate` – CompiledPredicate with function reference.
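
How a function reference might be resolved against `predicates.py` in `predicate_functions_dir` can be sketched with `importlib`. The loading strategy, the helper name `load_predicate_function`, the module name `user_predicates`, and the sample `is_short` function are all assumptions for illustration; the actual loader and the expected function signature are not described here.

```python
import importlib.util
import tempfile
from pathlib import Path
from types import SimpleNamespace


def load_predicate_function(functions_dir, function_name):
    """Load a named callable from predicates.py in the given directory (illustrative only)."""
    module_path = Path(functions_dir) / "predicates.py"
    if not module_path.is_file():
        raise ValueError(f"No predicates.py found in {functions_dir}")

    spec = importlib.util.spec_from_file_location("user_predicates", module_path)
    if spec is None or spec.loader is None:
        raise ValueError(f"Cannot load {module_path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    func = getattr(module, function_name, None)
    if not callable(func):
        raise ValueError(f"{function_name} is not a function in {module_path}")
    return func


# Usage with a temporary directory standing in for predicate_functions_dir.
with tempfile.TemporaryDirectory() as tmp_dir:
    source = "def is_short(node):\n    return len(node.text) < 10\n"
    (Path(tmp_dir) / "predicates.py").write_text(source)

    is_short = load_predicate_function(tmp_dir, "is_short")
    short_result = is_short(SimpleNamespace(text="Scope"))
    long_result = is_short(SimpleNamespace(text="A much longer paragraph"))

    try:
        load_predicate_function(tmp_dir, "missing_function")
        missing_raised = False
    except ValueError:
        missing_raised = True
```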

evaluate ¶

evaluate(predicate: CompiledPredicate, node: object) -> bool

Evaluate a compiled predicate against a node.

Parameters:

- `predicate` (`CompiledPredicate`) – The compiled predicate to evaluate.
- `node` (`object`) – The node to test.

Returns:	`bool` – True if predicate matches, False otherwise.

evaluate_all ¶

evaluate_all(predicates: list[CompiledPredicate], node: object) -> bool

Evaluate all predicates (AND logic).

Returns True only if all predicates match.

evaluate_any ¶

evaluate_any(predicates: list[CompiledPredicate], node: object) -> bool

Evaluate predicates with OR logic.

Returns True if any predicate matches.
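
The AND and OR combinations correspond to Python's built-in `all()` and `any()`. The sketch below represents each predicate as a plain callable taking a node, which is a simplification of `CompiledPredicate`; the function names are hypothetical. How the real methods treat an empty predicate list is not documented here, so the empty-list behavior shown is that of `all()` and `any()`, not a statement about the library.

```python
from types import SimpleNamespace


def evaluate_all_sketch(predicates, node):
    """AND logic: True only if every predicate matches."""
    return all(predicate(node) for predicate in predicates)


def evaluate_any_sketch(predicates, node):
    """OR logic: True if at least one predicate matches."""
    return any(predicate(node) for predicate in predicates)


node = SimpleNamespace(text="Definitions", is_bold=True)

# Predicates as plain callables (a simplification for this sketch).
predicates = [
    lambda n: n.is_bold,
    lambda n: len(n.text) > 50,
]

all_match = evaluate_all_sketch(predicates, node)
any_match = evaluate_any_sketch(predicates, node)
```

Both helpers short-circuit: `all()` stops at the first predicate that fails and `any()` stops at the first that matches, so cheap predicates are best listed first.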

has_verb ¶

has_verb(text: str, language: str = 'sv') -> bool

Check if text contains a verb using spaCy POS tagging.

This is useful for distinguishing headings (noun phrases, typically no verbs) from paragraph starts or sentences (which contain verbs).

Can be used in predicate expressions:

- "has_verb(node.text)"
- "not has_verb(node.text) and node.is_bold"

Parameters:

- `text` (`str`) – The text to analyze.
- `language` (`str`, default: `'sv'`) – ISO 639-1 language code (sv, en, de, fr, es).

Returns:	`bool` – True if text contains a verb (VERB or AUX POS tag); False if no verb is found or if spaCy is unavailable.

Note

Returns False (not None) when spaCy is unavailable, making it safe for use in boolean predicate expressions.
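
The fallback behavior described in the note can be sketched as follows. The function name `has_verb_sketch`, the mapping from language codes to spaCy model names, and the broad exception handling are assumptions for illustration; which models the real `has_verb` loads is not stated in this reference.

```python
from functools import lru_cache

# Assumption: one small spaCy pipeline per supported language code.
MODEL_NAMES = {
    "sv": "sv_core_news_sm",
    "en": "en_core_web_sm",
    "de": "de_core_news_sm",
    "fr": "fr_core_news_sm",
    "es": "es_core_news_sm",
}


@lru_cache(maxsize=None)
def load_model(language):
    """Return a spaCy pipeline for the language, or None if it cannot be loaded."""
    model_name = MODEL_NAMES.get(language)
    if model_name is None:
        return None
    try:
        import spacy  # imported lazily so a missing install is not fatal

        return spacy.load(model_name)
    except Exception:
        # Missing spaCy, a missing model, or any other load failure
        # is treated as "spaCy unavailable" in this sketch.
        return None


def has_verb_sketch(text, language="sv"):
    """True if any token is tagged VERB or AUX; False otherwise or when spaCy is unavailable."""
    nlp = load_model(language)
    if nlp is None:
        return False
    return any(token.pos_ in {"VERB", "AUX"} for token in nlp(text))


unsupported = has_verb_sketch("Any text at all", language="xx")
result = has_verb_sketch("The committee approves the budget.", language="en")
```

Caching the loaded pipeline avoids reloading the model for every node, which would otherwise dominate evaluation time when a predicate runs over a long document.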