vibe.review.parsing.layout.nodes¶

Layout layer node types.

These represent the physical/visual structure of a document page:

- LayoutLine: A horizontal sequence of words
- LayoutBlock: A visual block (paragraph, table cell, etc.)
- LayoutRegion: A page region (header, footer, column, margin)
- LayoutPage: Complete page layout with all blocks and regions

RegionType ¶

Types of page regions.

LayoutLine ¶

A horizontal line of text (sequence of words).

Words are grouped into lines based on vertical position and horizontal proximity. Lines preserve word order and spacing.

Attributes:

words (list[ExtractedWord]) –

The words in this line, ordered left-to-right.
page (int) –

1-based page number.
bbox (BBox) –

Bounding box encompassing all words.
baseline (float | None) –

Y-coordinate of text baseline (if known).
line_spacing (float | None) –

Distance to next line (if known).
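
The grouping of words into lines by vertical position and horizontal proximity can be illustrated with a minimal sketch. It is not the library's implementation: `Word`, `group_words_into_lines`, `y_tolerance`, and `max_gap` are hypothetical names, and the thresholds are arbitrary values in points.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical stand-in for ExtractedWord: text plus a bounding box."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def group_words_into_lines(words, y_tolerance=3.0, max_gap=20.0):
    """Group words into lines (assumed thresholds, not library defaults)."""
    # Step 1: bucket words into rows by vertical center.
    rows = []
    for word in sorted(words, key=lambda w: (w.y0 + w.y1) / 2):
        center = (word.y0 + word.y1) / 2
        if rows and abs(center - rows[-1]["center"]) <= y_tolerance:
            rows[-1]["words"].append(word)
        else:
            rows.append({"center": center, "words": [word]})

    # Step 2: within each row, order left-to-right and split on wide gaps.
    lines = []
    for row in rows:
        ordered = sorted(row["words"], key=lambda w: w.x0)
        current = [ordered[0]]
        for prev, word in zip(ordered, ordered[1:]):
            if word.x0 - prev.x1 > max_gap:
                lines.append(current)
                current = []
            current.append(word)
        lines.append(current)
    return lines


words = [
    Word("world", 40, 10, 70, 20),
    Word("Hello", 0, 10.5, 35, 20.5),
    Word("far", 200, 10, 220, 20),
    Word("Next", 0, 30, 30, 40),
]
line_texts = [" ".join(w.text for w in line)
              for line in group_words_into_lines(words)]
```

In this sketch a wide horizontal gap starts a new line, so text in two columns at the same height ends up in separate lines.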

text ¶

text: str

Get concatenated text of all words.

dominant_font_size ¶

dominant_font_size: float | None

Get the most common font size in this line.

dominant_font_name ¶

dominant_font_name: str | None

Get the most common font name in this line.
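
One way to compute a "most common" value such as the dominant font size or font name is with `collections.Counter`. The helper name `dominant` is hypothetical, and skipping `None` values is an assumption about how words without font metadata might be treated.

```python
from collections import Counter


def dominant(values):
    """Return the most common non-None value, or None if there is none."""
    counts = Counter(v for v in values if v is not None)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


size = dominant([10.0, 12.0, 10.0, None])
name = dominant(["Times", "Arial", "Arial"])
```

Returning `None` for an empty input matches the `float | None` and `str | None` return types documented above.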

is_bold ¶

is_bold: bool

Check if majority of words are bold.

is_italic ¶

is_italic: bool

Check if majority of words are italic.
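
A majority check like the one behind is_bold and is_italic can be sketched as follows. The helper name `majority` is hypothetical, and treating an exact tie or an empty line as "not a majority" is an assumption; the library may resolve those cases differently.

```python
def majority(flags):
    """True when strictly more than half of the flags are set."""
    flags = list(flags)
    return bool(flags) and sum(flags) * 2 > len(flags)


mostly_bold = majority([True, True, False])
tie = majority([True, False])
```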

to_dict ¶

to_dict() -> dict[str, Any]

Convert to dictionary.

LayoutBlock ¶

A visual block of content (paragraph, list item, table, etc.).

Blocks are groups of lines that form a cohesive visual unit. Block classification (heading, paragraph, list) happens in the structure layer; here we just track visual properties.

Attributes:

lines (list[LayoutLine]) –

The lines in this block, ordered top-to-bottom.
page (int) –

1-based page number.
bbox (BBox) –

Bounding box encompassing all lines.
column_index (int) –

Which column this block is in (0-based).
indent_level (int) –

Visual indentation level (0=flush left).
region_type (RegionType) –

The region this block belongs to.

text ¶

text: str

Get concatenated text of all lines, dehyphenating split words.
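
A minimal sketch of dehyphenation when joining lines, under stated assumptions: a line ending in `-` followed by a line starting with a lowercase letter is treated as one split word. The function name `join_lines` and this heuristic are illustrative only and may differ from what the library does.

```python
def join_lines(lines):
    """Join line texts with spaces, merging words split by a trailing hyphen."""
    result = ""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if result.endswith("-") and line[0].islower():
            # Treat "implemen-" + "tation" as one word broken across lines.
            result = result[:-1] + line
        elif result:
            result += " " + line
        else:
            result = line
    return result


joined = join_lines(["The implemen-", "tation is", "simple"])
```

A heuristic this simple also merges genuine compounds that happen to break at a hyphen (for example "well-" / "known" becomes "wellknown"), so a production version would likely need a dictionary or language-aware check.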

line_count ¶

line_count: int

Number of lines in this block.

word_count ¶

word_count: int

Total words in this block.

dominant_font_size ¶

dominant_font_size: float | None

Get the most common font size across all lines.

is_single_line ¶

is_single_line: bool

Check if block has only one line.

first_line ¶

first_line: LayoutLine | None

Get the first line of the block.

to_dict ¶

to_dict() -> dict[str, Any]

Convert to dictionary.

LayoutRegion ¶

A rectangular region on a page.

Regions partition the page into semantic areas (header, footer, body columns, margins). Blocks are assigned to regions based on their position.

Attributes:

region_type (RegionType) –

Type of region.
bbox (BBox) –

Bounding box of the region.
column_index (int) –

For multi-column layouts, which column (0-based).

contains_point ¶

contains_point(x: float, y: float) -> bool

Check if a point is inside this region.

contains_bbox ¶

contains_bbox(bbox: BBox) -> bool

Check if a bounding box is mostly inside this region.
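
The two containment checks can be illustrated with a small sketch. `Box` is a hypothetical stand-in for `BBox` (its field names are assumed), and "mostly inside" is interpreted here as more than half of the box's area overlapping the region; the actual threshold is not documented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned box; not the library's BBox."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def area(self) -> float:
        return max(0.0, self.x1 - self.x0) * max(0.0, self.y1 - self.y0)


def contains_point(region: Box, x: float, y: float) -> bool:
    """Edges count as inside in this sketch."""
    return region.x0 <= x <= region.x1 and region.y0 <= y <= region.y1


def overlap_fraction(region: Box, box: Box) -> float:
    """Fraction of `box` area that lies inside `region`."""
    w = min(region.x1, box.x1) - max(region.x0, box.x0)
    h = min(region.y1, box.y1) - max(region.y0, box.y0)
    if w <= 0 or h <= 0 or box.area == 0:
        return 0.0
    return (w * h) / box.area


def mostly_contains(region: Box, box: Box, threshold: float = 0.5) -> bool:
    """Assumed rule: more than `threshold` of the box must overlap."""
    return overlap_fraction(region, box) > threshold


region = Box(0, 0, 100, 50)
inside = mostly_contains(region, Box(10, 10, 20, 20))
straddling = mostly_contains(region, Box(90, 10, 130, 20))
```

Using an area fraction rather than strict containment tolerates boxes that spill slightly over a region edge.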

to_dict ¶

to_dict() -> dict[str, Any]

Convert to dictionary.

from_dict ¶

from_dict(d: dict[str, Any]) -> LayoutRegion

Create from dictionary.
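
The to_dict / from_dict pair suggests a serialization round trip. The sketch below shows the general pattern with stand-in types; the class names `Kind` and `Region`, the dictionary keys, and the enum values are all assumptions, not the library's actual schema.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from enum import Enum
from typing import Any


class Kind(Enum):
    """Hypothetical region kinds; the real RegionType members may differ."""
    HEADER = "header"
    BODY = "body"
    FOOTER = "footer"


@dataclass
class Region:
    kind: Kind
    bbox: tuple[float, float, float, float]
    column_index: int = 0

    def to_dict(self) -> dict[str, Any]:
        # Store the enum by value so the result is JSON-serializable.
        return {
            "region_type": self.kind.value,
            "bbox": list(self.bbox),
            "column_index": self.column_index,
        }

    @classmethod
    def from_dict(cls, d: dict[str, Any]) -> Region:
        return cls(
            kind=Kind(d["region_type"]),
            bbox=tuple(d["bbox"]),
            column_index=d.get("column_index", 0),
        )


region = Region(Kind.HEADER, (0.0, 0.0, 612.0, 72.0))
restored = Region.from_dict(json.loads(json.dumps(region.to_dict())))
```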

LayoutTableCell ¶

A single cell in a layout table.

Attributes:

text (str) –

The text content of the cell.
row_index (int) –

Row index (0-based).
column_index (int) –

Column index (0-based).
page (int) –

1-based page number.
bbox (BBox) –

Bounding box of the cell content.

to_dict ¶

to_dict() -> dict[str, Any]

Convert to dictionary.

LayoutTableRow ¶

A row in a layout table.

Attributes:

cells (list[LayoutTableCell]) –

The cells in this row, ordered by column index.
row_index (int) –

Row index (0-based).
is_header (bool) –

Whether this is a header row.
page (int) –

1-based page number.
bbox (BBox) –

Bounding box encompassing all cells.

to_dict ¶

to_dict() -> dict[str, Any]

Convert to dictionary.

LayoutTable ¶

A table detected in the layout.

Tables are detected when YOLO identifies a "table" region and column boundaries can be determined from text position clustering.

Attributes:

rows (list[LayoutTableRow]) –

The rows in this table, ordered top to bottom.
column_count (int) –

Number of columns detected.
column_boundaries (list[float]) –

X-coordinates of column separators.
has_header (bool) –

Whether a header row was detected.
page (int) –

1-based page number (or starting page for multi-page tables).
bbox (BBox) –

Bounding box of the entire table.
source_element_label (str) –

The doclayout label that triggered detection.
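
How column boundaries might be derived from text positions can be sketched with a simple gap-based grouping of horizontal word spans. This is one basic form of position clustering, chosen for illustration; it is not necessarily the algorithm the library uses. The function name `column_separators` and the `min_gap` threshold are hypothetical.

```python
def column_separators(spans, min_gap=10.0):
    """Return x-coordinates midway through each empty gap between text spans.

    `spans` is an iterable of (x0, x1) horizontal extents of words,
    collected from every row of the table region.
    """
    separators = []
    right_edge = None
    for x0, x1 in sorted(spans):
        if right_edge is not None and x0 - right_edge >= min_gap:
            # No word covers this stretch, so treat it as a column gap.
            separators.append((right_edge + x0) / 2)
        right_edge = x1 if right_edge is None else max(right_edge, x1)
    return separators


spans = [(10, 60), (100, 150), (200, 240),   # words from row 1
         (12, 55), (102, 160), (198, 230)]   # words from row 2
boundaries = column_separators(spans)
column_count = len(boundaries) + 1
```

In this sketch the number of columns is one more than the number of separators, which is consistent with having both column_count and column_boundaries on the table.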

row_count ¶

row_count: int

Number of rows in this table.

to_dict ¶

to_dict() -> dict[str, Any]

Convert to dictionary.

ColumnDetectionResult ¶

Column detection details for a page.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize column detection results to dictionary.

LayoutPage ¶

Complete layout analysis for a single page.

Contains all blocks and regions detected on the page, plus metadata about the page dimensions and layout characteristics.

Attributes:

page_number (int) –

1-based page number.
width (float) –

Page width in points.
height (float) –

Page height in points.
blocks (list[LayoutBlock]) –

All content blocks, ordered by reading position.
regions (list[LayoutRegion]) –

Page regions (header, footer, columns, etc.).
tables (list[LayoutTable]) –

Detected tables with cell structure.
column_count (int) –

Number of detected columns (1 = single column).
has_header (bool) –

Whether a page header was detected.
has_footer (bool) –

Whether a page footer was detected.
column_detection (ColumnDetectionResult) –

Column detection metrics and boundaries.

header_blocks ¶

header_blocks: list[LayoutBlock]

Get blocks in the header region.

footer_blocks ¶

footer_blocks: list[LayoutBlock]

Get blocks in the footer region.

body_blocks ¶

body_blocks: list[LayoutBlock]

Get blocks in the body region.
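
The three properties above read like filters over the page's blocks by region type. A minimal sketch of that idea, using stand-in types (`Kind`, `Block`, and `blocks_of` are hypothetical, as are the enum members):

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """Hypothetical region kinds; the real RegionType members may differ."""
    HEADER = "header"
    BODY = "body"
    FOOTER = "footer"


@dataclass
class Block:
    text: str
    kind: Kind


def blocks_of(blocks, kind):
    """Keep blocks of one region kind, preserving reading order."""
    return [b for b in blocks if b.kind is kind]


blocks = [
    Block("Annual Report", Kind.HEADER),
    Block("First paragraph.", Kind.BODY),
    Block("Second paragraph.", Kind.BODY),
    Block("Page 3", Kind.FOOTER),
]
body_texts = [b.text for b in blocks_of(blocks, Kind.BODY)]
```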

get_region ¶

get_region(x: float, y: float) -> LayoutRegion | None

Get the region containing a point.
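
A point lookup of this kind can be sketched as a linear scan that returns the first region whose box contains the point, or `None` when no region matches. The tuple layout, the function name `find_region`, and the top-left origin with y increasing downward are assumptions made for the example.

```python
def find_region(regions, x, y):
    """Return the name of the first region containing (x, y), else None.

    `regions` is a list of (name, (x0, y0, x1, y1)) pairs.
    """
    for name, (x0, y0, x1, y1) in regions:
        if x0 <= x <= x1 and y0 <= y <= y1:
            return name
    return None


# A US Letter page (612 x 792 points) split into three horizontal bands.
regions = [
    ("header", (0, 0, 612, 72)),
    ("body", (0, 72, 612, 720)),
    ("footer", (0, 720, 612, 792)),
]
hit = find_region(regions, 100, 400)
miss = find_region(regions, 700, 10)
```

Because edges are inclusive here, a point lying exactly on a shared boundary resolves to whichever region comes first in the list.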

to_dict ¶

to_dict() -> dict[str, Any]

Convert to dictionary.