vibe.review.parsing.layout.table_structure

Microsoft Table Transformer structure recognition for table cell detection.

This module provides inference wrappers for the Table Transformer model that detects rows, columns, and cells within table regions.

TableElementType

Types of elements detected by Table Transformer.

TableStructureElement

Single table structure detection result.

to_dict

to_dict() -> dict[str, object]

Serialize element to dictionary.

from_dict

from_dict(data: dict[str, Any]) -> TableStructureElement

Reconstruct element from dictionary.
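The to_dict / from_dict pair is meant to round-trip an element through a plain dictionary (for example, for JSON caching). The field layout of TableStructureElement is not listed on this page, so the sketch below uses a hypothetical stand-in class, `Element`, with assumed fields (`element_type`, `bbox`, `confidence`) and an assumed enum, `ElementType`. It illustrates the round-trip pattern only, not the module's actual implementation.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import Any


class ElementType(Enum):
    """Hypothetical stand-in for TableElementType; member names are assumed."""
    ROW = "row"
    COLUMN = "column"
    CELL = "cell"


@dataclass(frozen=True)
class Element:
    """Hypothetical stand-in for TableStructureElement; fields are assumed."""
    element_type: ElementType
    bbox: tuple[float, ...]  # assumed order: (x0, y0, x1, y1)
    confidence: float

    def to_dict(self) -> dict[str, object]:
        # Store only JSON-friendly values: enum -> str, tuple -> list.
        return {
            "element_type": self.element_type.value,
            "bbox": list(self.bbox),
            "confidence": self.confidence,
        }

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "Element":
        # Reverse the conversions applied in to_dict.
        return cls(
            element_type=ElementType(data["element_type"]),
            bbox=tuple(float(v) for v in data["bbox"]),
            confidence=float(data["confidence"]),
        )


original = Element(ElementType.ROW, (10.0, 20.0, 200.0, 40.0), 0.93)
payload = json.dumps(original.to_dict())          # serialize to JSON text
restored = Element.from_dict(json.loads(payload))  # rebuild from parsed JSON
```

Converting the enum and tuple to plain values in to_dict keeps the dictionary directly serializable with `json.dumps`, which is one common reason to provide such a pair alongside a cache.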

TableStructureResult

Table structure detection results for a single table region.

row_count

row_count: int

Number of rows detected.

column_count

column_count: int

Number of columns detected.

to_dict

to_dict() -> dict[str, object]

Serialize results to dictionary.

from_dict

from_dict(data: dict[str, Any]) -> TableStructureResult

Reconstruct results from dictionary.

get_sorted_rows

get_sorted_rows() -> list[TableStructureElement]

Get rows sorted by y-coordinate (top to bottom).

get_sorted_columns

get_sorted_columns() -> list[TableStructureElement]

Get columns sorted by x-coordinate (left to right).
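The ordering performed by get_sorted_rows and get_sorted_columns can be sketched as follows. This is a minimal illustration, not the module's code: `Box` is a hypothetical stand-in for TableStructureElement, and it assumes image-style coordinates in which y grows downward, so that sorting by the top edge gives top-to-bottom order.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical stand-in for a detected row or column."""
    label: str
    x0: float
    y0: float
    x1: float
    y1: float


def sort_rows(rows: list[Box]) -> list[Box]:
    # Top to bottom: ascending top edge (assumes y increases downward).
    return sorted(rows, key=lambda b: b.y0)


def sort_columns(columns: list[Box]) -> list[Box]:
    # Left to right: ascending left edge.
    return sorted(columns, key=lambda b: b.x0)


rows = [
    Box("r2", 0, 50, 100, 70),
    Box("r0", 0, 10, 100, 30),
    Box("r1", 0, 30, 100, 50),
]
columns = [
    Box("c1", 50, 10, 100, 70),
    Box("c0", 0, 10, 50, 70),
]

row_order = [b.label for b in sort_rows(rows)]
column_order = [b.label for b in sort_columns(columns)]
```

Once rows and columns are in reading order, the cell at (row i, column j) can be located by intersecting the i-th row box with the j-th column box.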

TableStructureConfig

Configuration for Table Transformer inference.

TableStructureDetector

Microsoft Table Transformer structure recognition with caching.

__init__

__init__(config: TableStructureConfig | None = None, cache_dir: Path | None = None) -> None

Initialize the detector.

detect_table_structure

detect_table_structure(pdf_path: Path, page_info: PageInfo, table_bbox: BBox) -> TableStructureResult | None

Detect table structure within a table region.

Parameters:
  • pdf_path (Path) –

    Path to the PDF file.

  • page_info (PageInfo) –

    Page information (dimensions, page number).

  • table_bbox (BBox) –

    Bounding box of the table region, as produced by YOLO detection.

Returns:
  • TableStructureResult | None –

    Structure detected for the table region, or None.
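Because the signature returns `TableStructureResult | None`, callers need to handle the None case before reading row_count or column_count. The sketch below shows one way to do that. `StubResult` is a hypothetical stand-in that exposes only the two documented properties, and `summarize` is an illustrative helper, not part of the module. The conditions under which None is returned are not stated on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubResult:
    """Hypothetical stand-in exposing the documented count properties."""
    row_count: int
    column_count: int


def summarize(result: Optional[StubResult]) -> str:
    # Guard against the None branch of the documented return type.
    if result is None:
        return "no structure detected"
    return f"{result.row_count} rows x {result.column_count} columns"


found = summarize(StubResult(row_count=4, column_count=3))
missing = summarize(None)
```

With a real detector, the same guard would wrap the value returned by `detect_table_structure(pdf_path, page_info, table_bbox)`.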

create_detector

create_detector(config: TableStructureConfig | None = None, cache_dir: Path | None = None) -> TableStructureDetector

Create a Table Structure detector.

Parameters:
  • config (TableStructureConfig | None, default: None) –

    Detector configuration.

  • cache_dir (Path | None, default: None) –

    Cache directory for results.

Returns:
  • TableStructureDetector –

    The created detector.