karpathy/autoresearch
AI agents running research on single-GPU nanochat training automatically
autoresearch is Andrej Karpathy's framework for letting an AI agent run autonomous ML research overnight on a single GPU. Point Claude Code or Codex at the three-file nanochat-style repo, prompt to start experiments, and wake up to a log of iterations and a better model.
Our Review
autoresearch is Andrej Karpathy's experimental framework for letting an AI agent run autonomous ML research overnight on a single GPU -- 81,700 stars and 11,800 forks in under 3 months as of May 2026. Released March 2026 by karpathy (Tesla AI Director alumnus, OpenAI founding member, creator of nanoGPT and nanochat), it sets up a deliberately tiny three-file project that an agent can iterate on while you sleep.
What autoresearch does:
- •Autonomous overnight research loop the agent modifies train.py, runs a fixed 5-minute training experiment, checks val_bpb (validation bits per byte), keeps or discards, repeats indefinitely.
- •Three-file structure prepare.py (fixed, do not modify), train.py (agent edits the model and training loop), program.md (human edits the agent instructions).
- •nanochat foundation the training code is a simplified single-GPU version of karpathy's nanochat with a real GPT, Muon + AdamW optimizer, and BPE tokenizer.
- •Vocab-size-independent metric val_bpb makes architectural and vocab-size changes fairly comparable across experiments.
- •Agent-agnostic works with Claude Code, Codex, or any agent that can edit files and run shell commands.
- •Markdown-driven "research org code" program.md is essentially a lightweight skill that defines how the agent thinks; iterate on the markdown to improve research velocity.
- •5-minute time budget per experiment wall-clock budget keeps the agent honest and makes comparisons fair regardless of compute setup.
autoresearch ecosystem:
- •nanochat karpathy's full single-GPU ChatGPT training repo that autoresearch is built on top of.
- •nanoGPT karpathy's classic minimal GPT implementation, the conceptual ancestor.
- •uv the Python project manager used for dependency setup and runtime.
- •Claude Code / Codex the agents most commonly used as the autonomous researcher.
Getting started:
Requires a single NVIDIA GPU (tested on H100), Python 3.10+, and uv. Run curl -LsSf https://astral.sh/uv/install.sh | sh, then uv sync, then uv run prepare.py (one-time data + tokenizer prep, ~2 min). Manually run uv run train.py once to verify your setup, then spin up your agent in the repo and prompt: "Have a look at program.md and let's kick off a new experiment."
Limitations:
Requires a real NVIDIA GPU (H100 recommended); a CPU or weak GPU will not run experiments in the 5-minute budget. Repo intentionally minimal -- no fancy logging, no experiment-tracking dashboard, no built-in safety rails. The agent will happily eat compute overnight; budget your spend accordingly. License is not specified at the top level -- review before commercial use. This is research code, not a productized framework.
Our Verdict
autoresearch in 2026 is the most popular demonstration of "have an AI agent run real ML research overnight" on a single GPU. 81,700 stars in 3 months, 11,800 forks, and built by karpathy himself -- which alone makes this required reading for anyone serious about agent-driven AI research.
For Vibe Builders, autoresearch is on the harder end of accessibility. You will need an H100 (or comparable) GPU, comfort with command line, and willingness to run an autonomous agent overnight unsupervised. If those check out, the payoff is real research velocity: your agent runs hundreds of experiments while you sleep.
For Developers and ML researchers, this is the canonical "agent + minimal training repo" pattern. The three-file structure (prepare/train/program) is intentionally small enough to fully understand in an hour and offers a template you can adapt to your own research domain.
Skip autoresearch if you do not have GPU access or a budget for cloud GPU rentals. Skip if you want a productized research platform with tracking, comparisons, and team features -- Weights & Biases, MLflow, or Comet are designed for that. For the underlying training mechanics without the agent layer, study karpathy's nanochat or nanoGPT directly.
Frequently Asked Questions
What is autoresearch?
autoresearch is Andrej Karpathy's experimental framework for letting an AI agent run autonomous ML research overnight on a single GPU. The agent modifies the training code in train.py, runs a 5-minute experiment, evaluates the result, and decides whether to keep or revert. As of May 2026 it has 81,700 GitHub stars and 11,800 forks, all earned in under three months.
How does autoresearch work?
You set up a single-GPU nanochat-style training environment in three files: prepare.py (fixed), train.py (agent modifies), and program.md (human edits to instruct the agent). Point Claude Code or Codex at the repo, prompt it to start experiments, and it iterates autonomously, optimizing for val_bpb (validation bits per byte, lower is better). You wake up to a log of experiments and a better model.
What GPU do I need for autoresearch?
karpathy tested autoresearch on an H100, which is the recommended setup. Any modern NVIDIA GPU with sufficient VRAM (24 GB+ recommended) can work, but the fixed 5-minute experiment budget means weaker GPUs will produce less progress per cycle. CPUs and low-end laptop GPUs are not viable. Cloud GPU rentals on RunPod or Lambda Labs are the cheapest path if you do not own one.
Which AI agents work with autoresearch?
karpathy specifically mentions Claude Code and Codex as compatible agents. Any agent that can edit files, run shell commands, and read the program.md instructions can drive autoresearch. You spin up the agent in the repo (with permissions enabled), point it at program.md, and prompt 'kick off a new experiment.' The agent handles everything from there.
Who is Andrej Karpathy?
Andrej Karpathy is a founding member of OpenAI, former Senior Director of AI at Tesla, and one of the most respected ML educators in the field. He created nanoGPT and nanochat (the conceptual ancestors of autoresearch), publishes the popular Zero to Hero neural networks course on YouTube, and writes deep technical posts at karpathy.ai and on X.
How do I install autoresearch?
Visit the GitHub repository at https://github.com/karpathy/autoresearch for installation instructions.
What license does autoresearch use?
autoresearch uses the unspecified license.
What are alternatives to autoresearch?
Explore related tools and alternatives on My AI Guide.
Open source & community-verified
unspecified licensed: free to use in any project, no strings attached. 82,309 developers have starred this, meaning the community has reviewed and trusted it.
Reviewed by My AI Guide for relevance, quality, and active maintenance before listing.