Live · Devnet alpha

The execution layer of public research.

Papers say what could work. Research Radar shows what does. Every paper here has been built, tested, and validated by an autonomous pipeline. Each passing reproduction is registered permanently on Solana, where the record is citable, verifiable, and tamper-evident.

Reproductions: 26
Passing: 14 / 26
Avg score: 61%
Tests run: 711

How it works

An autonomous pipeline runs daily. Each paper that earns a spot in the index has been through every step below.

01 · Radar
Ingest from arXiv, GitHub, Hacker News, Papers With Code, Reddit, and open-source feeds. Triage on novelty, relevance, and impact. Embed everything.
02 · Lab
High-signal items become explicit build goals with acceptance tests. Frontier coding models construct working packages with pytest suites.
03 · Synthesis
The semantic graph surfaces pairs of completed builds that have never been combined. Each combination becomes a higher-order build that no single paper anticipated.
04 · Validation
Seven mechanical checks plus a whole-repository review. Auto-fix for common issues. Repair loop of up to four rounds. Honest score, never hidden.
05 · Attestation
Passing reproductions write a permanent record to Solana: paper ID, build hash, score, test results, validator. Citable forever.
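
The Validation and Attestation steps can be pictured with a minimal Python sketch. It is illustrative only and not the pipeline's actual code: the two toy checks stand in for the seven mechanical checks, `toy_repair` stands in for the auto-fix step, and every name here (`validate`, `build_hash`, `AttestationRecord`, `attestation_payload`, the `example:0000` ID, the `toy-validator` label) is a hypothetical introduced for the example. The sketch only assembles a record with the fields listed above; it does not write anything to Solana.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List, Tuple

Files = Dict[str, str]                        # path -> source text (toy stand-in for a repo)
Check = Callable[[Files], bool]               # returns True when the check passes
Repair = Callable[[Files, List[str]], Files]  # gets the failing check names, returns new files

MAX_REPAIR_ROUNDS = 4  # the page says the repair loop runs "up to four rounds"


def failing(files: Files, checks: Dict[str, Check]) -> List[str]:
    """Names of the checks that do not pass on this build."""
    return [name for name, check in checks.items() if not check(files)]


def validate(files: Files, checks: Dict[str, Check],
             repair: Repair) -> Tuple[Files, List[str], int]:
    """Run all checks, repairing failures for at most MAX_REPAIR_ROUNDS rounds."""
    failed = failing(files, checks)
    rounds = 0
    while failed and rounds < MAX_REPAIR_ROUNDS:
        files = repair(files, failed)
        rounds += 1
        failed = failing(files, checks)
    return files, failed, rounds


def build_hash(files: Files) -> str:
    """SHA-256 over paths and contents in sorted order, so the hash is deterministic."""
    digest = hashlib.sha256()
    for path in sorted(files):
        digest.update(path.encode("utf-8") + b"\0")
        digest.update(files[path].encode("utf-8") + b"\0")
    return digest.hexdigest()


@dataclass(frozen=True)
class AttestationRecord:
    # Field list follows the Attestation step; types and names are assumptions.
    paper_id: str
    build_hash: str
    score: int          # percent
    tests_passed: int
    tests_total: int
    validator: str


def attestation_payload(record: AttestationRecord) -> str:
    """Canonical JSON (sorted keys, no extra whitespace) for the record."""
    return json.dumps(asdict(record), sort_keys=True, separators=(",", ":"))


# Toy checks: placeholders for the real mechanical checks.
checks: Dict[str, Check] = {
    "has_tests": lambda files: any(p.startswith("tests/") for p in files),
    "no_todo": lambda files: all("TODO" not in src for src in files.values()),
}


def toy_repair(files: Files, failed: List[str]) -> Files:
    """Placeholder auto-fix: strip TODO markers and add a trivial test file."""
    fixed = dict(files)
    if "no_todo" in failed:
        fixed = {p: s.replace("TODO", "") for p, s in fixed.items()}
    if "has_tests" in failed:
        fixed["tests/test_core.py"] = "def test_f():\n    assert True\n"
    return fixed


draft = {"pkg/core.py": "def f():\n    return 1  # TODO\n"}
final, failed, rounds = validate(draft, checks, toy_repair)

if not failed:  # only passing builds get a record
    record = AttestationRecord(
        paper_id="example:0000",            # hypothetical ID
        build_hash=build_hash(final),
        score=100,                          # supplied by the caller; scoring is not modeled here
        tests_passed=1,
        tests_total=1,
        validator="toy-validator",          # hypothetical validator label
    )
    print(attestation_payload(record))
```

Hashing the files in sorted path order keeps the build hash independent of how the files happen to be enumerated, which is what would let a third party recompute it from the same build. The round cap guarantees the loop terminates even when a repair never succeeds.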

Featured reproduction

osp:81325814

Turn your terminal into a collaborative workspace for AI agents

100% reproduced · 135/135 tests · 2026-04-11 · lab build

Recent reproductions

The freshest results from the pipeline. Each row links to the full validation report, an embeddable badge, and a BibTeX citation.

| Paper | ID | Score | Tests | Status | Date |
|---|---|---|---|---|---|
| Show HN: Needle: We Distilled Gemini Tool Calling into a 26M Model | hn:48111896 | 75% | 41/41 | reproduced | 2026-05-13 |
| KV-Fold: One-Step KV-Cache Recurrence for Long-Context Inference | arxiv:2605.12471v1 | 75% | 21/21 | reproduced | 2026-05-13 |
| One Turn Too Late: Response-Aware Defense Against Hidden Malicious Intent in Multi-Turn Dialogue | hfpaper:2605.05630 | 75% | 19/19 | reproduced | 2026-05-13 |
| Created a free tool to check what PII your LLM prompts are leaking before they hit the provider | reddit:1tbhvlq | 75% | 36/36 | reproduced | 2026-05-13 |
| Gym-Anything: Turn any Software into an Agent Environment | arxiv:2604.06126v1 | 45% | 0/0 | failed | 2026-04-18 |
| Learn to construct intelligent agents from the ground up with Python | osp:35283048 | 50% | 22/22 | failed | 2026-04-18 |
| Turn any design into a fully functional web application automatically | osp:94193807 | 45% | 163/164 | failed | 2026-04-18 |
| opus-test-1994 | | 100% | 35/35 | reproduced | 2026-04-11 |
| The open-source toolkit to study and modify advanced AI systems | osp:69779686 | 90% | 121/121 | reproduced | 2026-04-11 |
| Turn your terminal into a collaborative workspace for AI agents | osp:81325814 | 100% | 135/135 | reproduced | 2026-04-11 |
| Train neural networks on Apple Neural Engine via reverse-engineered APIs | osp:76992267 | 70% | 0/0 | reproduced | 2026-04-11 |
| Turn any design into a fully functional web application automatically | osp:94193807 | 40% | 0/0 | failed | 2026-04-11 |
| Stop building custom bridges Use this agent protocol instead | osp:31869445 | 40% | 3/4 | failed | 2026-04-11 |
| Stop building custom bridges Use this agent protocol instead | osp:31869445 | 40% | 3/4 | failed | 2026-04-11 |
| Universal YOCO for Efficient Depth Scaling | hfpaper:2604.01220 | 40% | 54/54 | failed | 2026-04-11 |

Showing 15 of 26. Query the full corpus via the API.