"In theory, there is no difference between theory and practice. In practice, there is."
This quote, attributed to various figures including Einstein, Feynman, and Yogi Berra, is a clever way to say you don't know until you try. This sentiment is especially true in the empirical discipline of AI; the arXiv overflows with breakthroughs, though few will translate to your exact setting.
Across the software industry, most experiments don't pan out on the first attempt. Many promising ideas end up buried in experiment backlogs. They’re abandoned not for lack of potential, but because they take real time and engineering effort to test.
No more letting good ideas gather dust. I built a way to break the bottleneck: AI agents that scan the arXiv, adapt research ideas to my codebase, and open a GitHub pull request with a minimal experiment. I call it PapersWithPRs.
Last month, I described an experiment: tasking my agent with opening a PR that implements ideas from arXiv papers in a branch of my target repo, VQASynth.
Under the hood, I prompt an LLM with code from both the source and target repositories. Then I task the ExperimentOps agent to generate a draft PR for my target repo, including minimal changes to test the idea and documentation to guide reviewers.
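To make the prompting step concrete, here is a minimal sketch of how code from two repositories might be packed into a single prompt. This is not the actual PapersWithPRs implementation: the function names (`collect_sources`, `build_prompt`), the `.py`-only filter, the character budget, and the prompt wording are all assumptions for illustration.

```python
from pathlib import Path


def collect_sources(repo_root, suffixes=(".py",), max_chars=20_000):
    """Concatenate source files under repo_root, each under a path header.

    Stops adding files once the (hypothetical) character budget is reached,
    so the result stays within a rough context limit.
    """
    root = Path(repo_root)
    parts, used = [], 0
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        block = f"### {path.relative_to(root).as_posix()}\n{text}\n"
        if used + len(block) > max_chars:
            break
        parts.append(block)
        used += len(block)
    return "".join(parts)


def build_prompt(source_repo, target_repo, idea_summary):
    """Assemble one prompt from the paper's idea and both codebases."""
    return (
        "You are adapting a research idea to an existing codebase.\n\n"
        f"Idea:\n{idea_summary}\n\n"
        f"Source repository (reference implementation):\n"
        f"{collect_sources(source_repo)}\n"
        f"Target repository:\n{collect_sources(target_repo)}\n"
        "Propose a minimal diff to the target repository that tests the idea, "
        "plus short notes for reviewers."
    )
```

The returned string would then go to whichever LLM client you use. A real setup would likely need smarter file selection than a flat character budget, since most repositories won't fit in one prompt.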
I've reused some of the same automation I built for publishing Docker images, but here it's in the service of adapting research to my application.
The ExperimentOps agent proposes dozens of potential improvements for VQASynth, each packaged as a ready-to-test GitHub PR.
To further improve PR quality, I can filter recommendations with LLM-as-a-Judge, run agentic test loops inside Docker containers, prioritize my issues, or enforce CONTRIBUTING.md standards.
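As a rough illustration of the LLM-as-a-Judge filter, the sketch below keeps only the recommendations that score above a threshold. The `judge` argument is a placeholder: in practice it would wrap an LLM call that returns a score, but here it is any callable, and both `filter_recommendations` and the 0.7 threshold are hypothetical choices rather than what PapersWithPRs actually uses.

```python
def filter_recommendations(recommendations, judge, threshold=0.7):
    """Return recommendations whose judge score meets the threshold, best first.

    `judge` is any callable mapping a recommendation to a float score;
    a real judge would prompt an LLM and parse its rating.
    """
    scored = [(judge(rec), rec) for rec in recommendations]
    kept = [(score, rec) for score, rec in scored if score >= threshold]
    # Sort on the score only, so ties keep their original order.
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [rec for _, rec in kept]
```

Passing the judge in as a parameter keeps the filter testable with a stub scorer before wiring in a real model.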
Check out the documentation for the example draft PR above.
Early results are encouraging: the agents can stub out testable branches in minutes — work that might otherwise sit untouched for weeks.
This is how I’m learning from hundreds of papers and turning the most promising ideas into real, working code.
PapersWithPRs turns research papers into tested code — automatically.
If you’re tired of great ideas sitting idle, let’s change that together.
Visit us at the RemyxAI booth during the AI Agent Builder Summit. Then join us at Experiment 2025 to share and learn from expert experimenters.