opus-test-1994
Independent reproduction of this paper. Validated via a lab build. Passed on the first attempt without repair.
Quality score: 100%
Tests passed: 35/35
Repair rounds: 0
Status: reproduced
Checks
Syntax
| ✓ | syntax_parse | blocking | All 15 .py files parse cleanly |
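A check like `syntax_parse` can be approximated locally. The sketch below is not the validator's actual implementation; it assumes the check simply parses every `.py` file under the repository root with Python's standard `ast` module, and the helper name `find_syntax_errors` is hypothetical.

```python
import ast
from pathlib import Path


def find_syntax_errors(root):
    """Return (path, message) pairs for each .py file under root that fails to parse.

    Hypothetical helper: assumes the check is a plain ast.parse over every file.
    """
    failures = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            # ast.parse compiles the source to a syntax tree without executing it.
            ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except SyntaxError as exc:
            failures.append((str(path), f"line {exc.lineno}: {exc.msg}"))
    return failures
```

Calling `find_syntax_errors(".")` from the repository root and getting an empty list would correspond to the passing result reported above.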
Imports
| ✓ | import_llm_causal_prober | blocking | import llm_causal_prober -- OK |
Dependencies
| ✓ | dep_metadata | major | Found pyproject.toml |
Tests
| ✓ | pytest_run | major | pytest: 35 passed, 0 failed, 0 errors (exit 0) |
Packaging
| ✓ | pip_installable | major | pip install -e . --dry-run succeeded |
Git State
| ✓ | worktrees_merged | major | No .worktrees directory |
| ✓ | on_main_branch | minor | Current branch: master |
Cleanup
| ✓ | cleanup_ok | info | No cleanup issues found |
Whole-Repo Review
| ✓ | claude_review | info | Claude review: 6/10 — The codebase implements a functional causal probing pipeline for black-box LLM interpretation using minimal pairs, influ |
Citation & embed
Badge
<img src="https://api.research-radar.com/v1/validations/opus-test-1994/badge.svg">