Review CLI (Developer)¶
This page documents the developer-facing command layout for the Review extension.
Which CLI?¶
Use vibe review when you are operating a review template or its stored data:
- create and inspect review sessions
- import reference sources and example corpora
- initialize or reset the review database
- rebuild embeddings or maintain render support
Use vibe-dev review when you are debugging or evaluating the Review extension itself:
- inspect parser output and stored document parts
- inspect structured review logs
- benchmark retrieval and assessment quality
- lint or validate golden records
- use developer-only playgrounds and database shells
One command moved out of review entirely:
- vibe-dev pdf dump-text for raw pdftotext-style PDF inspection
Command Groups¶
vibe-dev review is now organized by workflow:
- logs: Filter and summarize structured review logs.
- docs: Inspect stored document parts or round-trip them through JSON.
- parse: Debug parsing output and related layout tooling.
- eval: Evaluate quality against goldens or summarize human agreement with AI suggestions.
- db: Open a developer database shell.
- golden: Lint review golden files.
- lab: Run interactive playground tools.
Commands¶
Logs¶
Docs¶
vibe-dev review docs inspect 42
vibe-dev review docs export 42 --output doc.json
vibe-dev review docs import doc.json --session 7
Parse¶
vibe-dev review parse doc contract.pdf --level parts
vibe-dev review parse validate contract.pdf --mode structure
vibe-dev review parse yolo-layout contract.pdf
Eval¶
vibe-dev review eval benchmark case.golden-assessment.yml
vibe-dev review eval benchmark -r ../vibe-review-corpus # walk a whole corpus
vibe-dev review eval benchmark case.golden-assessment.yml --evaluation claude
vibe-dev review eval benchmark case.golden-assessment.yml \
--evaluation berget_gpt_oss_120b \
--evaluation berget_gemma4_31b \
--evaluation mistral_small_24b # retrieve once, assess against each model, compare
vibe-dev review eval agreement --template dora-review
- benchmark measures retrieval and assessment quality against curated *.golden-assessment.yml files. It accepts one or more files; pass -r/--recursive to expand directory arguments via **/*.golden-assessment.yml. Use --evaluation <endpoint> to override the LLM endpoint used for the final relevance + compliance calls. The endpoint can be any key from llm_endpoints in the template's config.yml (e.g. claude, local, mock). Repeat --evaluation to benchmark several models against the same retrieved parts in a single run; retrieval executes once and the report adds a side-by-side model-comparison table (aggregate scores plus per-requirement predictions). See Review Benchmark for the golden-assessment schema, scoring semantics, and the full CLI surface.
- agreement measures how often human reviewers agree with AI suggestions in the live review database.
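The two helpers below are a minimal sketch for the recursive and multi-model options described above, not part of the CLI. list_golden_assessments approximates the **/*.golden-assessment.yml expansion with find, so you can preview which files a -r run is likely to pick up; the CLI's own expansion may treat hidden directories or symlinks differently. benchmark_models turns a list of endpoint names into repeated --evaluation flags. Both function names and the VIBE_DEV override variable are hypothetical and exist only for illustration.

```shell
# Preview the files a recursive benchmark run would probably select.
# Assumption: find's matching is close to, but not guaranteed identical to,
# the CLI's **/*.golden-assessment.yml expansion.
list_golden_assessments() {
    find "$1" -type f -name '*.golden-assessment.yml' | sort
}

# Benchmark one golden file against several endpoints in a single run.
# Usage: benchmark_models <golden-file> <endpoint> [<endpoint> ...]
# VIBE_DEV is an illustrative override (not a CLI feature); set it to
# "echo" to print the arguments instead of running the benchmark.
benchmark_models() {
    golden=$1
    shift
    n=$#
    # Rotate through the endpoint names, appending "--evaluation <name>"
    # for each one and dropping the bare name from the front.
    while [ "$n" -gt 0 ]; do
        set -- "$@" --evaluation "$1"
        shift
        n=$((n - 1))
    done
    ${VIBE_DEV:-vibe-dev} review eval benchmark "$golden" "$@"
}
```

Calling benchmark_models case.golden-assessment.yml claude local would issue one benchmark invocation with both endpoints, so retrieval still executes only once.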
DB¶
Golden¶
vibe-dev review golden lint-parse path/to/corpus
vibe-dev review golden lint-assessment path/to/corpus
- lint-parse checks parsing golden.yml files.
- lint-assessment checks *.golden-assessment.yml files used by the benchmark harness.
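If both linters should gate a corpus change, a small wrapper can run them together. This is a sketch under two assumptions: that each lint command exits non-zero when it finds a problem, which this page does not state, and that running both before failing is the behaviour you want. The lint_goldens name and the VIBE_DEV override are hypothetical.

```shell
# Run both golden linters over one corpus directory and return non-zero
# if either reports a failure. Both linters always run, so a single pass
# surfaces parse and assessment problems together.
# Assumption: the lint commands signal failure through their exit status.
# VIBE_DEV is an illustrative override (not a CLI feature) for dry runs.
lint_goldens() {
    corpus=$1
    rc=0
    ${VIBE_DEV:-vibe-dev} review golden lint-parse "$corpus" || rc=1
    ${VIBE_DEV:-vibe-dev} review golden lint-assessment "$corpus" || rc=1
    return "$rc"
}
```

For example, lint_goldens path/to/corpus could serve as a pre-commit or CI step for the corpus repository.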
Lab¶
Common Workflows¶
Debug a Parsing Problem¶
- Dump the parser output:
  vibe-dev review parse doc contract.pdf --level parts
- Compare against the expected artifact:
  vibe-dev review parse validate contract.pdf
- If needed, inspect the low-level text layer:
  vibe-dev pdf dump-text contract.pdf
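The steps above can be chained in a small helper. This sketch reads "if needed" as "when validation fails" and assumes parse validate exits non-zero on a mismatch; neither is confirmed by this page, so adjust the condition to your situation. The debug_parse name and the VIBE_DEV override are hypothetical.

```shell
# Walk the parsing-debug workflow for one PDF.
# Assumption: "parse validate" exits non-zero when the output does not
# match the expected artifact; only then is the raw text layer dumped.
# VIBE_DEV is an illustrative override (not a CLI feature) for dry runs.
debug_parse() {
    pdf=$1
    # Step 1: dump the parser output at the parts level.
    ${VIBE_DEV:-vibe-dev} review parse doc "$pdf" --level parts
    # Step 2: compare against the expected artifact.
    if ! ${VIBE_DEV:-vibe-dev} review parse validate "$pdf"; then
        # Step 3: fall back to the low-level text layer.
        ${VIBE_DEV:-vibe-dev} pdf dump-text "$pdf"
    fi
}
```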
Debug a Bad Assessment¶
- Summarize the structured logs:
  vibe-dev review logs summarize --run <id>
- Inspect the stored parts:
  vibe-dev review docs inspect <doc-id>
- Re-run benchmark cases if the regression should be covered by goldens:
  vibe-dev review eval benchmark ...