Reproducibility Index · opus-test-1994

opus-test-1994

Independent reproduction of this paper. Validated on 2026-04-11 via a lab build. Passed on first attempt without repair.

Quality score: 100%
Tests passed: 35/35
Repair rounds: 0
Status: reproduced

Checks

Syntax

syntax_parse (blocking): All 15 .py files parse cleanly
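
A check of the kind reported by syntax_parse can be approximated locally. The sketch below is an assumption about how such a check might work, not the index's actual implementation; the function name `find_syntax_errors` is hypothetical. It walks a directory, parses every .py file with the standard-library `ast` module, and collects any failures.

```python
import ast
from pathlib import Path


def find_syntax_errors(root):
    """Return {path: message} for each .py file under root that fails to parse.

    Hypothetical helper for illustration; an empty dict means every file parsed.
    """
    failures = {}
    for path in sorted(Path(root).rglob("*.py")):
        try:
            # ast.parse only checks syntax; it does not import or execute the file.
            ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except SyntaxError as exc:
            failures[str(path)] = f"line {exc.lineno}: {exc.msg}"
    return failures
```

Calling `find_syntax_errors(".")` from the repository root and getting an empty dict would correspond to a clean result like the one reported here.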

Imports

import_llm_causal_prober (blocking): import llm_causal_prober -- OK

Dependencies

dep_metadata (major): Found pyproject.toml

Tests

pytest_run (major): pytest: 35 passed, 0 failed, 0 errors (exit 0)

Packaging

pip_installable (major): pip install -e . --dry-run succeeded

Git State

worktrees_merged (major): No .worktrees directory
on_main_branch (minor): Current branch: master

Cleanup

cleanup_ok (info): No cleanup issues found

Whole-Repo Review

claude_review info Claude review: 6/10 — The codebase implements a functional causal probing pipeline for black-box LLM interpretation using minimal pairs, influ

Citation & embed

Badge

Embeddable SVG, no auth required.

Research Radar reproduction badge
<img src="https://api.research-radar.com/v1/validations/opus-test-1994/badge.svg">
