Live · Devnet alpha

The execution layer of public research.

Papers say what could work. Research Radar shows what does. Every paper here has been built, tested, and validated by an autonomous pipeline. Each reproduction is registered permanently on Solana — citable, verifiable, and impossible to fake.

Browse reproductions → · Read the thesis · API

Reproductions

Passing: 14 / 26

Avg score: 61%

Tests run: 711

How it works

An autonomous pipeline runs daily. Each paper that earns a spot in the index has been through every step below.

Radar

Ingest from arXiv, GitHub, Hacker News, Papers With Code, Reddit, and open-source feeds. Triage on novelty, relevance, and impact. Embed everything.
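
The triage step can be pictured with a small sketch. The source does not say how novelty, relevance, and impact are combined, so the equal weights, the 0.7 threshold, the `Item` type, and the example IDs below are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Item:
    """A candidate item from any feed. Scores are assumed to lie in [0, 1]."""
    source_id: str
    novelty: float
    relevance: float
    impact: float


def triage_score(item: Item, weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3)) -> float:
    # Weighted sum of the three triage axes; equal weights are an assumption.
    w_novelty, w_relevance, w_impact = weights
    return w_novelty * item.novelty + w_relevance * item.relevance + w_impact * item.impact


def high_signal(items: Iterable[Item], threshold: float = 0.7) -> List[Item]:
    # Keep items at or above the (hypothetical) threshold, best first.
    kept = [item for item in items if triage_score(item) >= threshold]
    return sorted(kept, key=triage_score, reverse=True)


candidates = [
    Item("example:1", novelty=0.9, relevance=0.8, impact=0.7),
    Item("example:2", novelty=0.2, relevance=0.3, impact=0.1),
]
selected = high_signal(candidates)
```

A weighted sum keeps the ranking easy to audit; a real pipeline might well use a learned ranker instead.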

Lab

High-signal items become explicit build goals with acceptance tests. Frontier coding models construct working packages with pytest suites.

Synthesis

The semantic graph surfaces pairs of completed builds that have never been combined. Each combination becomes a higher-order build that no single paper anticipated.
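
One way to picture the pairing step is the sketch below. It stands in cosine similarity over build embeddings for whatever the semantic graph actually uses; the function names, the `already_combined` set, and the `top_k` cutoff are hypothetical and do not come from the source.

```python
import math
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    # Cosine similarity; returns 0.0 if either vector has zero length.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def uncombined_pairs(
    embeddings: Dict[str, List[float]],
    already_combined: Set[FrozenSet[str]],
    top_k: int = 5,
) -> List[Tuple[str, str, float]]:
    """Rank pairs of completed builds that have not been combined yet."""
    candidates = []
    for a, b in combinations(sorted(embeddings), 2):
        if frozenset((a, b)) in already_combined:
            continue  # this pair already produced a higher-order build
        candidates.append((a, b, cosine(embeddings[a], embeddings[b])))
    # Most similar pairs first (an assumed ranking rule).
    candidates.sort(key=lambda pair: pair[2], reverse=True)
    return candidates[:top_k]


builds = {"a": [1.0, 0.0], "b": [1.0, 0.1], "c": [0.0, 1.0]}
done = {frozenset(("a", "b"))}
ranked = uncombined_pairs(builds, done)
```

Storing combined pairs as frozensets makes the lookup independent of pair order.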

Validation

Seven mechanical checks plus a whole-repository review. Auto-fix common issues. Run a repair loop for up to four rounds. Report an honest score, never hidden.
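
The repair loop can be sketched roughly as follows. The `run_checks` and `apply_fixes` callables are hypothetical placeholders for the real checks and auto-fixes, which the source does not describe; only the four-round cap comes from the text above.

```python
from dataclasses import dataclass
from typing import Callable, List

MAX_REPAIR_ROUNDS = 4  # "up to four rounds" per the description above


@dataclass
class ValidationResult:
    passed: bool
    rounds_used: int
    failures: List[str]


def validate_with_repair(
    build: dict,
    run_checks: Callable[[dict], List[str]],
    apply_fixes: Callable[[dict, List[str]], dict],
    max_rounds: int = MAX_REPAIR_ROUNDS,
) -> ValidationResult:
    """Run checks, attempt repairs, and stop at the first clean pass or the cap."""
    failures = run_checks(build)
    rounds = 0
    while failures and rounds < max_rounds:
        build = apply_fixes(build, failures)  # try to repair what failed
        rounds += 1
        failures = run_checks(build)  # re-validate after each repair
    return ValidationResult(passed=not failures, rounds_used=rounds, failures=failures)


# Toy example: each round fixes one missing flag.
def toy_checks(build: dict) -> List[str]:
    return [name for name in ("tests", "lint") if not build.get(name)]


def toy_fixes(build: dict, failures: List[str]) -> dict:
    repaired = dict(build)
    repaired[failures[0]] = True
    return repaired


result = validate_with_repair({}, toy_checks, toy_fixes)
```

Whatever failures remain after the final round are returned, so a failing build still carries its result rather than being dropped.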

Attestation

Passing reproductions write a permanent record to Solana — paper ID, build hash, score, test results, validator. Citable forever.
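
A minimal sketch of what such a record might look like before it is written on-chain. The field names, the SHA-256 build hash, the canonical JSON encoding, and the validator name are assumptions for illustration; the source lists the fields but not their layout, and the Solana write itself is not shown.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Attestation:
    paper_id: str
    build_hash: str
    score: int          # percent, 0-100
    tests_passed: int
    tests_total: int
    validator: str


def hash_build(artifact: bytes) -> str:
    # SHA-256 hex digest of the build artifact (the hash choice is an assumption).
    return hashlib.sha256(artifact).hexdigest()


def to_payload(record: Attestation) -> bytes:
    # Sorted keys and fixed separators give the same bytes for the same record,
    # so anyone can re-serialize a record and compare it with the stored one.
    return json.dumps(asdict(record), sort_keys=True, separators=(",", ":")).encode("utf-8")


record = Attestation(
    paper_id="osp:81325814",
    build_hash=hash_build(b"placeholder build bytes"),
    score=100,
    tests_passed=135,
    tests_total=135,
    validator="example-validator",  # hypothetical name
)
payload = to_payload(record)
```

The paper ID, score, and test counts in the usage example are taken from the featured reproduction on this page; everything else is placeholder data.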

Featured reproduction

osp:81325814

Turn your terminal into a collaborative workspace for AI agents

135/135 tests · 2026-04-11 · lab build

100% reproduced

View full report →

Recent reproductions

The freshest results from the pipeline. Click any row for the full validation report, embeddable badge, and BibTeX citation.

Paper	Score	Tests	Status	Date
Show HN: Needle: We Distilled Gemini Tool Calling into a 26M Model hn:48111896	75%	41/41	reproduced	2026-05-13
KV-Fold: One-Step KV-Cache Recurrence for Long-Context Inference arxiv:2605.12471v1	75%	21/21	reproduced	2026-05-13
One Turn Too Late: Response-Aware Defense Against Hidden Malicious Intent in Multi-Turn Dialogue hfpaper:2605.05630	75%	19/19	reproduced	2026-05-13
Created a free tool to check what PII your LLM prompts are leaking before they hit the provider reddit:1tbhvlq	75%	36/36	reproduced	2026-05-13
Gym-Anything: Turn any Software into an Agent Environment arxiv:2604.06126v1	45%	0/0	failed	2026-04-18
Learn to construct intelligent agents from the ground up with Python osp:35283048	50%	22/22	failed	2026-04-18
Turn any design into a fully functional web application automatically osp:94193807	45%	163/164	failed	2026-04-18
opus-test-1994	100%	35/35	reproduced	2026-04-11
The open-source toolkit to study and modify advanced AI systems osp:69779686	90%	121/121	reproduced	2026-04-11
Turn your terminal into a collaborative workspace for AI agents osp:81325814	100%	135/135	reproduced	2026-04-11
Train neural networks on Apple Neural Engine via reverse-engineered APIs osp:76992267	70%	0/0	reproduced	2026-04-11
Turn any design into a fully functional web application automatically osp:94193807	40%	0/0	failed	2026-04-11
Stop building custom bridges Use this agent protocol instead osp:31869445	40%	3/4	failed	2026-04-11
Stop building custom bridges Use this agent protocol instead osp:31869445	40%	3/4	failed	2026-04-11
Universal YOCO for Efficient Depth Scaling hfpaper:2604.01220	40%	54/54	failed	2026-04-11

Showing 15 of 26. Query the full corpus via the API →