One Turn Too Late: Response-Aware Defense Against Hidden Malicious Intent in Multi-Turn Dialogue

Independent reproduction of this paper. Validated on 2026-05-13 via a lab build. Passed on first attempt without repair.

Quality score

75%

Tests passed

19/19

Repair rounds

0

Status

reproduced

Checks

Syntax

✓ syntax_parse blocking All 10 .py files parse cleanly
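
The report does not show how the syntax_parse check is implemented. A minimal sketch of such a check, assuming it simply tries to parse every .py file with Python's standard `ast` module, might look like this. The function name `find_syntax_errors` and the choice to skip `.venv` are illustrative assumptions, not part of the validator.

```python
import ast
from pathlib import Path


def find_syntax_errors(root: str) -> dict[str, str]:
    """Return {path: message} for each .py file under root that fails to parse.

    Hypothetical helper: the real validator's logic is not shown in the report.
    """
    errors: dict[str, str] = {}
    for path in sorted(Path(root).rglob("*.py")):
        # Assumption: files inside a virtual environment are not project code.
        if ".venv" in path.parts:
            continue
        try:
            ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except SyntaxError as exc:
            errors[str(path)] = f"line {exc.lineno}: {exc.msg}"
    return errors
```

An empty result corresponds to a passing check; any entry would make a blocking check fail.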

Imports

✓ import_dialogue_rollback blocking import dialogue_rollback -- OK

Dependencies

✓ dep_metadata major Found pyproject.toml

Tests

✓ pytest_run major pytest: 19 passed, 0 failed, 0 errors (exit 0)

Packaging

✓ pip_installable major pip install -e . --dry-run succeeded

Git State

✓ worktrees_merged major No .worktrees directory

✓ on_main_branch minor Current branch: master

Cleanup

✗ no_venv_in_tree major .venv/ present (4977 MB) -- should be in .gitignore
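
To act on this finding locally, one option is to confirm that the environment directory is listed in .gitignore and to measure how much space it takes. The sketch below is an assumption about how one might do that, not the validator's own code; `venv_is_ignored` and `dir_size_mb` are hypothetical helper names, and the .gitignore matching only covers the plain literal patterns, not globs or negations.

```python
from pathlib import Path


def venv_is_ignored(repo_root: str, venv_name: str = ".venv") -> bool:
    """True if .gitignore contains a literal line that ignores the venv directory."""
    gitignore = Path(repo_root) / ".gitignore"
    if not gitignore.is_file():
        return False
    lines = {ln.strip() for ln in gitignore.read_text(encoding="utf-8").splitlines()}
    # Only the common literal spellings are recognised here (simplification).
    candidates = {venv_name, f"{venv_name}/", f"/{venv_name}", f"/{venv_name}/"}
    return bool(lines & candidates)


def dir_size_mb(path: str) -> float:
    """Total size of regular files under path, in megabytes (10**6 bytes)."""
    total = sum(f.stat().st_size for f in Path(path).rglob("*") if f.is_file())
    return total / 1_000_000
```

If `venv_is_ignored(".")` returns False, adding a `.venv/` line to .gitignore is the usual fix.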

Whole-Repo Review

✓ claude_review info Claude review: 5/10 — The codebase presents a coherent architecture for detecting and rolling back harmful dialogue turns, but contains critic

Citation & embed

Badge

Embeddable SVG, no auth required.

<img src="https://api.research-radar.com/v1/validations/b69e4551/badge.svg">
