vibe.review.parsing.ir¶
Intermediate Representation (IR) for the parsing pipeline.
Provides traceability through:
- Span model: every node carries source location references
- Provenance log: transformation events with rule, inputs, outputs, confidence
- Stable IDs: deterministic hashing with content + position + ancestor context
Design based on designdocs/parsing-architecture.md section 2.
BBox ¶
Bounding box in points (72 points = 1 inch).
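To illustrate the unit convention, here is a minimal sketch of a box measured in points. The class name `BBoxSketch` and the field names `x0`, `y0`, `x1`, `y1` are assumptions for illustration; this page does not list the actual fields of `BBox`.

```python
from dataclasses import dataclass

POINTS_PER_INCH = 72.0  # stated in the BBox description: 72 points = 1 inch


@dataclass(frozen=True)
class BBoxSketch:
    """Hypothetical stand-in for BBox; the field names are assumptions."""

    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def width_in(self) -> float:
        # Width converted from points to inches.
        return (self.x1 - self.x0) / POINTS_PER_INCH

    @property
    def height_in(self) -> float:
        # Height converted from points to inches.
        return (self.y1 - self.y0) / POINTS_PER_INCH


# A US Letter page is 612 x 792 points.
letter = BBoxSketch(0.0, 0.0, 612.0, 792.0)
print(letter.width_in, letter.height_in)  # 8.5 11.0
```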
Span ¶
ProvenanceEvent ¶
Transformation audit trail entry.
Records which rule fired, what inputs it consumed, what outputs it produced, and the confidence level. Used for debugging and understanding why the parser made specific decisions.
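The description above suggests a record along the following lines. This is only a sketch: `ProvenanceEventSketch`, its field names, and the rule and node IDs in the usage are hypothetical, and the real `ProvenanceEvent` may be shaped differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvenanceEventSketch:
    """Hypothetical shape; the real ProvenanceEvent fields may differ."""

    rule_id: str
    rule_name: str
    input_ids: tuple[str, ...]   # IDs of the nodes the rule consumed
    output_ids: tuple[str, ...]  # IDs of the nodes the rule produced
    confidence: float = 1.0

    def describe(self) -> str:
        # One-line summary, handy when debugging a parser decision.
        ins = ", ".join(self.input_ids)
        outs = ", ".join(self.output_ids)
        return (
            f"{self.rule_name} ({self.rule_id}): "
            f"[{ins}] -> [{outs}] @ {self.confidence:.2f}"
        )


event = ProvenanceEventSketch(
    rule_id="R012",
    rule_name="join_hyphenated_lines",
    input_ids=("lin-0001", "lin-0002"),
    output_ids=("par-0001",),
    confidence=0.9,
)
print(event.describe())
# join_hyphenated_lines (R012): [lin-0001, lin-0002] -> [par-0001] @ 0.90
```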
IRNode ¶
Base class for all IR nodes in the parsing pipeline.
Every node has:
- A stable, deterministic ID for diffing and citation
- Source spans for traceability back to the original document
- Provenance log of transformations that created/modified it
- Arbitrary metadata for layer-specific information
add_provenance ¶
add_provenance(event: ProvenanceEvent) -> None
Add a provenance event to this node's history.
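A minimal sketch of how a node might accumulate its history follows. `NodeSketch` and its attribute names (`id`, `spans`, `provenance`, `metadata`) are assumptions drawn from the description of `IRNode`, not its actual definition, and plain dicts stand in for `ProvenanceEvent` objects.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class NodeSketch:
    """Hypothetical stand-in for IRNode; attribute names are assumptions."""

    id: str
    spans: list[Any] = field(default_factory=list)
    provenance: list[Any] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)

    def add_provenance(self, event: Any) -> None:
        # Assumed behavior: events are appended in the order they occur.
        self.provenance.append(event)


node = NodeSketch(id="par-a1b2c3d4-e5f6")
node.add_provenance({"rule": "detect_paragraph", "confidence": 0.95})
node.add_provenance({"rule": "normalize_whitespace", "confidence": 1.0})
print([e["rule"] for e in node.provenance])
# ['detect_paragraph', 'normalize_whitespace']
```

Using `default_factory` gives each node its own list, so events added to one node never leak into another.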
generate_stable_id ¶
generate_stable_id(content: str, page: int, position_hint: tuple[float, float] | None = None, ancestor_ids: list[str] | None = None, node_type: str = 'node') -> str
Generate a deterministic ID for stable diffing and citation.
The ID is based on:
- Content hash (primary disambiguation)
- Position hint (page, x0, y0) for same-content disambiguation
- Ancestor context (last 3 ancestors) for hierarchical uniqueness

Format: {type_prefix}-{content_hash}-{position_hash}
Example: "par-a1b2c3d4-e5f6"
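Parameters:

| Name | Type | Default |
|---|---|---|
| content | str | required |
| page | int | required |
| position_hint | tuple[float, float] \| None | None |
| ancestor_ids | list[str] \| None | None |
| node_type | str | 'node' |

Returns:

| Type |
|---|
| str |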
merge_provenance ¶
merge_provenance(nodes: list[IRNode], rule_id: str, rule_name: str, layer: str) -> list[ProvenanceEvent]
Create provenance events for a merge operation.
When multiple nodes are merged into one, this creates the appropriate provenance tracking.
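Parameters:

| Name | Type | Default |
|---|---|---|
| nodes | list[IRNode] | required |
| rule_id | str | required |
| rule_name | str | required |
| layer | str | required |

Returns:

| Type |
|---|
| list[ProvenanceEvent] |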
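One plausible reading of the merge behavior is sketched below: the merged node inherits the history of every input and gains one new event naming the inputs. This interpretation is an assumption, as are the `MergeEventSketch` and `MergeNodeSketch` classes, their fields, and the rule and layer names in the usage. The real function may build its events differently.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MergeEventSketch:
    """Hypothetical event record; not the real ProvenanceEvent."""

    rule_id: str
    rule_name: str
    layer: str
    input_ids: tuple[str, ...]


@dataclass
class MergeNodeSketch:
    """Hypothetical node carrying only an ID and a provenance log."""

    id: str
    provenance: list[MergeEventSketch] = field(default_factory=list)


def merge_provenance_sketch(
    nodes: list[MergeNodeSketch], rule_id: str, rule_name: str, layer: str
) -> list[MergeEventSketch]:
    # Assumption: carry forward each input's existing history, in input order...
    merged = [event for node in nodes for event in node.provenance]
    # ...then record the merge itself, naming every input node.
    merged.append(
        MergeEventSketch(rule_id, rule_name, layer, tuple(n.id for n in nodes))
    )
    return merged


first = MergeNodeSketch("lin-0001", [MergeEventSketch("R001", "detect_line", "layout", ())])
second = MergeNodeSketch("lin-0002")
events = merge_provenance_sketch([first, second], "R012", "join_lines", "structure")
print(len(events), events[-1].input_ids)  # 2 ('lin-0001', 'lin-0002')
```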