vibe.review.parsers.dora_parser
Parser for DORA (EU Regulation 2022/2554) HTML files from EUR-Lex.
Extracts articles with granular sub-article parts into a structured JSON format suitable for import into the reference_sources system.
Part IDs use a language-independent format for direct lookup:
- Articles: "art30.3(e)(i)" for maximum granularity
- Recitals: "rec42"
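As an illustration of the part ID format, the sketch below breaks an ID into its components. The grammar is an assumption inferred only from the two examples above (article number, optional paragraph, then zero or more parenthesised points; recitals as `rec` plus a number), and `split_part_id` is a hypothetical helper, not a function of this module.

```python
import re

# Assumed grammar, inferred from the examples "art30.3(e)(i)" and "rec42";
# the real parser may accept other forms.
_ARTICLE = re.compile(r"art(\d+)(?:\.(\d+))?((?:\([a-z0-9]+\))*)")
_RECITAL = re.compile(r"rec(\d+)")
_POINT = re.compile(r"\(([a-z0-9]+)\)")


def split_part_id(part_id: str) -> dict:
    """Hypothetical helper: decompose a part ID into its components."""
    recital = _RECITAL.fullmatch(part_id)
    if recital:
        return {"kind": "recital", "number": int(recital.group(1))}

    article = _ARTICLE.fullmatch(part_id)
    if article:
        number, paragraph, points = article.groups()
        return {
            "kind": "article",
            "number": int(number),
            # The paragraph is optional, e.g. "art30" has none.
            "paragraph": int(paragraph) if paragraph else None,
            # Nested points in order, e.g. "(e)(i)" -> ["e", "i"].
            "points": _POINT.findall(points),
        }

    raise ValueError(f"Unrecognised part ID: {part_id!r}")
```

Under these assumptions, `split_part_id("art30.3(e)(i)")` yields article 30, paragraph 3, points `["e", "i"]`, and `split_part_id("rec42")` yields recital 42.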
Reference format in config.yml: "dora_2022_2554:art30.2(a)"
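A reference of this form can be separated into a source ID and a part ID. This is a minimal sketch that assumes the first colon is the separator; `split_reference` is a hypothetical helper and does not come from this module.

```python
def split_reference(ref: str) -> tuple[str, str]:
    """Hypothetical helper: split "source_id:part_id" at the first colon."""
    source_id, sep, part_id = ref.partition(":")
    # Reject references with no colon or with an empty side.
    if not sep or not source_id or not part_id:
        raise ValueError(f"Expected 'source_id:part_id', got {ref!r}")
    return source_id, part_id


source_id, part_id = split_reference("dora_2022_2554:art30.2(a)")
print(source_id)  # dora_2022_2554
print(part_id)    # art30.2(a)
```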
parse_dora_html
parse_dora_html(html_path: str | Path, language: str = 'en') -> dict[str, Any]
Parse a DORA HTML file and return structured data with granular parts.
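Parameters:
- html_path (str | Path): path to the DORA HTML file.
- language (str): defaults to 'en'.

Returns:
- dict[str, Any]: structured data with granular parts.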