vibe.review.parsing.structure.builder

Structure builder: Convert layout to logical structure.

Converts LayoutPage[] into DocumentStructure by:

  1. Classifying blocks (heading, paragraph, list item, etc.)
  2. Detecting heading levels from font size and numbering
  3. Detecting lists and nesting
  4. Establishing reading order
  5. Applying structure rules for enhanced classification
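As a rough illustration of step 2, heading-level detection from numbering and font size might look like the sketch below. This is not the module's actual implementation: the Block dataclass, the heading_level function, the numbering pattern, and the size-ratio thresholds are all assumptions made for the example.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


# Hypothetical stand-in for a layout block; the real block type is not shown here.
@dataclass
class Block:
    text: str
    font_size: float


# Matches section numbering such as "1. ", "2.1 " or "2.1.3 " at the start of a line.
_NUMBERING = re.compile(r"^(\d{1,2}(?:\.\d{1,2})*)\.?\s+\S")


def heading_level(block: Block, body_font_size: float) -> int | None:
    """Return a heading level (1 = top level), or None for body text.

    Assumed heuristic: explicit numbering wins; otherwise the level is
    derived from how much larger the block's font is than the body font.
    """
    match = _NUMBERING.match(block.text)
    if match:
        # "2.1.3 Results" has three numeric components, so it maps to level 3.
        return match.group(1).count(".") + 1

    ratio = block.font_size / body_font_size
    if ratio >= 1.6:
        return 1
    if ratio >= 1.3:
        return 2
    if ratio >= 1.1:
        return 3
    return None
```

A numbering-only check like this can misfire on body sentences that start with a number, so a real classifier would likely combine it with other signals such as block length or font weight.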

StructureConfig

Configuration for structure building.

StructureBuilder

Build document structure from layout pages.

Converts LayoutPage[] into a single DocumentStructure with classified and ordered blocks.

__init__

__init__(config: StructureConfig | None = None, rules_dir: Path | None = None, rule_engine: RuleEngine | None = None) -> None

Initialize the structure builder.

Parameters:
  • config (StructureConfig | None, default: None) – Configuration options. Uses defaults if not provided.

  • rules_dir (Path | None, default: None) – Directory containing rule YAML files. Defaults to built-in rules.

  • rule_engine (RuleEngine | None, default: None) – Pre-configured rule engine. If provided, rules_dir is ignored.
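The precedence between rules_dir and rule_engine described above can be sketched as follows. This is a minimal illustration, not the library's code: StubRuleEngine, BUILTIN_RULES_DIR, and resolve_rule_engine are hypothetical names, and the real RuleEngine constructor may take different arguments.

```python
from __future__ import annotations

from pathlib import Path

# Hypothetical location of the built-in rules; the real default is not documented here.
BUILTIN_RULES_DIR = Path("builtin_rules")


class StubRuleEngine:
    """Hypothetical stand-in for RuleEngine that only records its rules directory."""

    def __init__(self, rules_dir: Path) -> None:
        self.rules_dir = rules_dir


def resolve_rule_engine(
    rules_dir: Path | None = None,
    rule_engine: StubRuleEngine | None = None,
) -> StubRuleEngine:
    """Pick the rule engine using the documented precedence."""
    if rule_engine is not None:
        # A pre-configured engine wins; rules_dir is ignored in this case.
        return rule_engine
    # Otherwise build an engine from rules_dir, falling back to the built-in rules.
    return StubRuleEngine(rules_dir if rules_dir is not None else BUILTIN_RULES_DIR)
```

Under these assumptions, passing both arguments returns the supplied engine unchanged, which is why a caller who wants custom YAML rules should pass rules_dir alone.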

build

build(layout_pages: list[LayoutPage], source_path: str | None = None, source_type: str = 'pdf') -> DocumentStructure

Build document structure from layout pages.

Parameters:
  • layout_pages (list[LayoutPage]) – List of LayoutPage objects produced by layout analysis.

  • source_path (str | None, default: None) – Path to the source document.

  • source_type (str, default: 'pdf') – Type of the source document.

Returns: