vibe.review.parsing.layout.analyzer
Layout analysis: Convert extracted words into layout structure.
This analyzer:
1. Groups words into lines based on vertical alignment
2. Groups lines into blocks based on spacing and alignment
3. Detects page regions (header, footer, body, columns)
4. Identifies repeated headers/footers across pages
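The first step, grouping words into lines by vertical alignment, can be sketched roughly as follows. This is a minimal illustration and not the module's actual implementation: the `Word` class, the `group_words_into_lines` function, and the `y_tolerance` parameter are hypothetical names introduced here, and the real analyzer may use different criteria.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical stand-in for an extracted word (not the library's ExtractedWord)."""
    text: str
    x0: float  # left edge
    y0: float  # top edge
    x1: float  # right edge
    y1: float  # bottom edge


def group_words_into_lines(words: list[Word], y_tolerance: float = 3.0) -> list[list[Word]]:
    """Group words whose vertical centers lie within y_tolerance of a line's running center."""
    lines: list[list[Word]] = []
    centers: list[float] = []  # running mean of vertical centers, one per line

    # Visit words top to bottom so each word only needs comparing to the latest line.
    for word in sorted(words, key=lambda w: ((w.y0 + w.y1) / 2, w.x0)):
        center = (word.y0 + word.y1) / 2
        if lines and abs(center - centers[-1]) <= y_tolerance:
            lines[-1].append(word)
            # Update the running mean so slightly drifting baselines stay together.
            centers[-1] += (center - centers[-1]) / len(lines[-1])
        else:
            lines.append([word])
            centers.append(center)

    # Restore reading order inside each line.
    for line in lines:
        line.sort(key=lambda w: w.x0)
    return lines
```

The same idea applies one level up: lines can be merged into blocks by comparing the vertical gap between consecutive lines against a spacing threshold.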
LayoutConfig
Configuration for layout analysis.
LayoutAnalyzer
Analyze document layout from extracted words.
Converts ExtractedWord[] into LayoutPage[] with hierarchical structure: Page → Region → Block → Line → Word.
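To make the Page → Region → Block → Line → Word hierarchy concrete, here is a small sketch using plain dataclasses. All class and field names below (`SketchPage`, `kind`, and so on) are assumptions for illustration; the real `LayoutPage` and its children may carry different fields.

```python
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class SketchWord:
    text: str


@dataclass
class SketchLine:
    words: list[SketchWord] = field(default_factory=list)


@dataclass
class SketchBlock:
    lines: list[SketchLine] = field(default_factory=list)


@dataclass
class SketchRegion:
    kind: str  # assumed labels such as "header", "body", "footer"
    blocks: list[SketchBlock] = field(default_factory=list)


@dataclass
class SketchPage:
    page_num: int
    regions: list[SketchRegion] = field(default_factory=list)


def iter_words(page: SketchPage) -> Iterator[SketchWord]:
    """Walk the hierarchy top-down and yield every word in document order."""
    for region in page.regions:
        for block in region.blocks:
            for line in block.lines:
                yield from line.words
```

Walking the tree this way recovers the flat word sequence, while the intermediate levels keep the grouping information the analyzer produced.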
__init__
__init__(config: LayoutConfig | None = None, rules_dir: Path | None = None, rule_engine: RuleEngine | None = None, doclayout_detector: YoloLayoutDetector | None = None, table_structure_detector: TableStructureDetector | None = None) -> None
Initialize the layout analyzer.
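| Parameter | Type | Default |
|---|---|---|
| config | LayoutConfig \| None | None |
| rules_dir | Path \| None | None |
| rule_engine | RuleEngine \| None | None |
| doclayout_detector | YoloLayoutDetector \| None | None |
| table_structure_detector | TableStructureDetector \| None | None |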
analyze
analyze(extraction: ExtractionResult, page_progress: Callable[[int, int], None] | None = None) -> list[LayoutPage]
Analyze layout of all pages.
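| Parameter | Type | Default |
|---|---|---|
| extraction | ExtractionResult | required |
| page_progress | Callable[[int, int], None] \| None | None |

| Returns |
|---|
| list[LayoutPage] |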
extract_table_from_words
extract_table_from_words(words: list[ExtractedWord], page_num: int, table_bbox: BBox | None = None, element_label: str = 'table') -> LayoutTable | None
Extract table structure from words using position-based analysis.
This method detects table columns by clustering text x-coordinates, groups text into rows by y-coordinate, and merges continuation rows where the first column (typically a row identifier) is empty.
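| Parameter | Type | Default |
|---|---|---|
| words | list[ExtractedWord] | required |
| page_num | int | required |
| table_bbox | BBox \| None | None |
| element_label | str | 'table' |

| Returns |
|---|
| LayoutTable \| None |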
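The position-based approach described for this method (cluster x-coordinates into columns, group text into rows by y-coordinate, merge continuation rows whose first column is empty) can be sketched as below. This is a simplified illustration under stated assumptions, not the library's code: `Cell`, `cluster_columns`, `build_rows`, `merge_continuations`, and the `col_gap` and `row_tol` thresholds are all hypothetical, and simple gap-based clustering is swapped in for whatever clustering the real method uses.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical text fragment with a left edge (x) and a vertical center (y)."""
    text: str
    x: float
    y: float


def cluster_columns(xs: list[float], col_gap: float = 15.0) -> list[float]:
    """Gap-based 1-D clustering: return the leftmost x of each column cluster."""
    anchors: list[float] = []
    last: float | None = None
    for x in sorted(xs):
        if last is None or x - last > col_gap:
            anchors.append(x)  # a large horizontal gap starts a new column
        last = x
    return anchors


def build_rows(cells: list[Cell], col_gap: float = 15.0, row_tol: float = 3.0) -> list[list[str]]:
    """Place each cell into a (row, column) grid using y for rows and x for columns."""
    anchors = cluster_columns([c.x for c in cells], col_gap)
    rows: list[list[str]] = []
    last_y: float | None = None
    for cell in sorted(cells, key=lambda c: (c.y, c.x)):
        if last_y is None or cell.y - last_y > row_tol:
            rows.append([""] * len(anchors))  # vertical jump starts a new row
        last_y = cell.y
        # The column is the rightmost anchor at or to the left of the cell.
        col = max(i for i, a in enumerate(anchors) if a <= cell.x)
        rows[-1][col] = (rows[-1][col] + " " + cell.text).strip()
    return rows


def merge_continuations(rows: list[list[str]]) -> list[list[str]]:
    """Fold rows with an empty first column into the row above them."""
    merged: list[list[str]] = []
    for row in rows:
        if merged and not row[0]:
            merged[-1] = [(p + " " + c).strip() for p, c in zip(merged[-1], row)]
        else:
            merged.append(list(row))
    return merged
```

The continuation merge relies on the first column acting as a row identifier: a row without one is treated as wrapped text belonging to the previous row.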