vibe.review.parsing.semantic.nodes

Semantic layer node types.

These represent contract-specific semantics:

  • SemanticUnit: A numbered clause or section
  • Definition: A defined term with its meaning
  • CrossReference: A reference to another part of the document
  • Party: A contract party or signatory
  • SemanticDocument: Complete semantic analysis

SemanticUnit

A semantic unit in the document (clause, section, etc.).

Represents a numbered or titled section of the contract, together with its hierarchical position and content. It is the canonical in-memory representation and converts to DocumentPartModel for persistence.

Attributes:
  • number (str) –

    The clause number (e.g., "1.2.3").

  • title (str) –

    The title/heading text (if any).

  • content (str) –

    The full text content.

  • part_type (PartType) –

    Type of unit (clause, section, heading, etc.).

  • level (int) –

    Depth in hierarchy (0 = top level).

  • parent_id (str | None) –

    ID of parent unit.

  • children_ids (list[str]) –

    IDs of child units.

  • source_block_ids (list[str]) –

    IDs of source StructuredBlocks.

text

text: str

Get text content for rule matching (alias for content).

block_type

block_type: str

Get block type for rule matching (returns part_type value).

full_reference

full_reference: str

Get full reference string (e.g., 'Clause 1.2.3').
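
The three read-only properties above (text, block_type, full_reference) can be pictured with a small stand-in class. This is a minimal sketch, not the library's implementation: UnitSketch, the PartType members, and the exact full_reference format are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class PartType(Enum):
    # Hypothetical members; the real PartType enum may define others.
    CLAUSE = "clause"
    SECTION = "section"
    HEADING = "heading"


@dataclass
class UnitSketch:
    """Illustrative stand-in for SemanticUnit (not the real class)."""

    number: str
    content: str
    part_type: PartType
    title: str = ""

    @property
    def text(self) -> str:
        # Alias for content, so rule matchers can read .text uniformly.
        return self.content

    @property
    def block_type(self) -> str:
        # Expose the enum value as a plain string.
        return self.part_type.value

    @property
    def full_reference(self) -> str:
        # Assumed format: capitalised part type followed by the number.
        return f"{self.part_type.value.capitalize()} {self.number}"


unit = UnitSketch(
    number="1.2.3",
    content="The Seller shall deliver the Goods.",
    part_type=PartType.CLAUSE,
)
print(unit.full_reference)
```

Under these assumptions, a rule matcher only needs text and block_type and never has to know about PartType itself.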

to_dict

to_dict() -> dict[str, Any]

Convert to dictionary.

to_db_fields

to_db_fields() -> dict[str, Any]

Extract fields for DocumentPartModel creation.

Returns a dict suitable for creating or updating a DocumentPartModel. Pipeline-specific fields (children_ids, etc.) are stored in metadata.
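
One way to picture the split between model columns and metadata is sketched below. The column names, the id field, and the choice of which fields count as pipeline-specific (beyond children_ids) are assumptions; the real DocumentPartModel schema may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class UnitSketch:
    """Illustrative stand-in for SemanticUnit (not the real class)."""

    id: str  # assumed identifier field, implied by parent_id/children_ids
    number: str
    title: str
    content: str
    part_type: str
    level: int = 0
    parent_id: Optional[str] = None
    children_ids: list[str] = field(default_factory=list)
    source_block_ids: list[str] = field(default_factory=list)

    def to_db_fields(self) -> dict[str, Any]:
        # Assumed column names; pipeline-only fields go under "metadata".
        return {
            "number": self.number,
            "title": self.title,
            "content": self.content,
            "part_type": self.part_type,
            "level": self.level,
            "parent_id": self.parent_id,
            "metadata": {
                "children_ids": list(self.children_ids),
                "source_block_ids": list(self.source_block_ids),
            },
        }


unit = UnitSketch(
    id="u1",
    number="1",
    title="Definitions",
    content="In this Agreement...",
    part_type="clause",
    children_ids=["u2", "u3"],
)
fields = unit.to_db_fields()
```

Copying the lists into metadata keeps later mutations of the unit from leaking into a dict that has already been handed to the persistence layer.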

Definition

A defined term in the contract.

Attributes:
  • term (str) –

    The defined term.

  • definition (str) –

    The definition text.

  • source_block_id (str | None) –

    ID of source StructuredBlock.

  • references (list[str]) –

    IDs of the SemanticUnits in which the term is used.

to_dict

to_dict() -> dict[str, Any]

Convert to dictionary.

CrossReference

A cross-reference within the document.

Attributes:
  • text (str) –

    The reference text as it appears.

  • target_id (str | None) –

    ID of the referenced SemanticUnit (if resolved).

  • target_number (str) –

    The referenced number (e.g., "1.2.3").

  • source_unit_id (str | None) –

    ID of the SemanticUnit containing the reference.

  • resolved (bool) –

    Whether the reference was successfully resolved.

to_dict

to_dict() -> dict[str, Any]

Convert to dictionary.
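
The relationship between target_number, target_id, and resolved can be sketched as a lookup against the known clause numbers. This is an illustration under assumptions: CrossRefSketch and the resolve helper are hypothetical, and the library's own resolution step is not documented here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CrossRefSketch:
    """Illustrative stand-in for CrossReference (not the real class)."""

    text: str
    target_number: str
    target_id: Optional[str] = None
    source_unit_id: Optional[str] = None
    resolved: bool = False


def resolve(ref: CrossRefSketch, ids_by_number: dict[str, str]) -> CrossRefSketch:
    # Hypothetical helper: look the referenced number up in a
    # number -> unit ID map and record whether a target was found.
    ref.target_id = ids_by_number.get(ref.target_number)
    ref.resolved = ref.target_id is not None
    return ref


ids_by_number = {"1.2.3": "u7"}
hit = resolve(CrossRefSketch(text="see Clause 1.2.3", target_number="1.2.3"), ids_by_number)
miss = resolve(CrossRefSketch(text="see Clause 9.9", target_number="9.9"), ids_by_number)
```

In this sketch an unresolved reference keeps target_id as None, which matches the "if resolved" wording on the attribute.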

Party

A party to the contract.

Attributes:
  • name (str) –

    Party name.

  • role (str) –

    Role in the contract (e.g., "Buyer", "Seller").

  • defined_as (str) –

    How the party is defined (e.g., "the Company").

  • source_block_id (str | None) –

    ID of source StructuredBlock.

to_dict

to_dict() -> dict[str, Any]

Convert to dictionary.

SemanticDocument

Complete semantic analysis of a document.

Contains all extracted semantic information: units, definitions, cross-references, and parties.

Attributes:
  • units (list[SemanticUnit]) –

    All semantic units (clauses, sections, etc.).

  • definitions (list[Definition]) –

    All defined terms.

  • cross_references (list[CrossReference]) –

    All internal cross-references.

  • parties (list[Party]) –

    All identified parties.

  • title (str | None) –

    Document title.

  • effective_date (str | None) –

    Contract effective date (if found).

  • source_path (str | None) –

    Path to source document.

  • source_type (str) –

    Type of source document.

get_unit

get_unit(unit_id: str) -> SemanticUnit | None

Get a unit by ID.

get_unit_by_number

get_unit_by_number(number: str) -> SemanticUnit | None

Get a unit by its number.

get_definition

get_definition(term: str) -> Definition | None

Get a definition by term (case-insensitive).
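
A case-insensitive lookup of this kind could be written as follows. The sketch assumes a linear scan and uses str.casefold for the comparison; DefinitionSketch and the free function are stand-ins, and the real method may normalise terms differently.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DefinitionSketch:
    """Illustrative stand-in for Definition (not the real class)."""

    term: str
    definition: str
    source_block_id: Optional[str] = None
    references: list[str] = field(default_factory=list)


def get_definition(
    definitions: list[DefinitionSketch], term: str
) -> Optional[DefinitionSketch]:
    # Compare case-folded strings so "Effective Date" matches "effective date".
    wanted = term.casefold()
    for candidate in definitions:
        if candidate.term.casefold() == wanted:
            return candidate
    return None


definitions = [
    DefinitionSketch(
        term="Effective Date",
        definition="the date on which this Agreement is signed",
    )
]
found = get_definition(definitions, "effective date")
missing = get_definition(definitions, "Goods")
```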

get_root_units

get_root_units() -> list[SemanticUnit]

Get top-level units (no parent).

get_children

get_children(unit_id: str) -> list[SemanticUnit]

Get child units of a given unit.

to_outline

to_outline() -> str

Generate a text outline of the document structure.
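
How get_unit, get_root_units, get_children, and to_outline fit together can be sketched with a small tree walk. The classes below are stand-ins, and the outline format (two-space indent per level, "number title" per line) is an assumption rather than the library's actual output.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class UnitSketch:
    """Illustrative stand-in for SemanticUnit (not the real class)."""

    id: str  # assumed identifier field
    number: str
    title: str = ""
    level: int = 0
    parent_id: Optional[str] = None
    children_ids: list[str] = field(default_factory=list)


@dataclass
class DocSketch:
    """Illustrative stand-in for SemanticDocument (not the real class)."""

    units: list[UnitSketch]

    def get_unit(self, unit_id: str) -> Optional[UnitSketch]:
        return next((u for u in self.units if u.id == unit_id), None)

    def get_root_units(self) -> list[UnitSketch]:
        # Top-level units are those without a parent.
        return [u for u in self.units if u.parent_id is None]

    def get_children(self, unit_id: str) -> list[UnitSketch]:
        parent = self.get_unit(unit_id)
        if parent is None:
            return []
        children = (self.get_unit(cid) for cid in parent.children_ids)
        return [c for c in children if c is not None]

    def to_outline(self) -> str:
        lines: list[str] = []

        def walk(unit: UnitSketch) -> None:
            # Indent by hierarchy level, then recurse depth-first.
            lines.append("  " * unit.level + f"{unit.number} {unit.title}".strip())
            for child in self.get_children(unit.id):
                walk(child)

        for root in self.get_root_units():
            walk(root)
        return "\n".join(lines)


doc = DocSketch(
    units=[
        UnitSketch(id="u1", number="1", title="Definitions", children_ids=["u2"]),
        UnitSketch(id="u2", number="1.1", title="Interpretation", level=1, parent_id="u1"),
        UnitSketch(id="u3", number="2", title="Payment"),
    ]
)
outline = doc.to_outline()
print(outline)
```

The linear scan in get_unit keeps the sketch short; a real implementation handling large contracts would more likely index units by ID.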

to_dict

to_dict() -> dict[str, Any]

Convert to dictionary.