Karpathy's Autoresearch On My AI Polymarket Trading Bot

A developer adapts Andrej Karpathy's open-source autoresearch project into an autonomous loop that iteratively improves a Polymarket arbitrage trading bot, then demonstrates the bot running live and profiting from Bitcoin up/down markets.

Karpathy's Autoresearch Concept

Andrej Karpathy published an autoresearch project on GitHub designed to autonomously train and refine small models. The creator of this video saw the potential to apply the same self-improving loop to a completely different domain: a Polymarket trading bot that hunts for arbitrage on 5-minute Bitcoin up/down prediction markets.

The Trading Autoresearch Loop

The system is built around a continuous agentic loop with several key components:

  • GitHub as evolution engine: The agent works inside a repo and commits the code changes for each experiment iteration
  • Training program (markdown file): A research playbook that defines how experiments are chosen, run, evaluated, and either kept or discarded
  • Strategy code: Adjustable parameters and filters that the agent modifies between runs
  • Polymarket bot: The test environment, where each strategy variant runs in dry mode against the market for a 1-hour window
  • Score evaluator: Compares each experiment against the current best, keeping improvements and discarding regressions
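As a rough illustration of the "GitHub as evolution engine" idea, the sketch below shows how an agent could commit each experiment and undo a rejected one. The helper names (commit_cmd, rollback_cmd, run) are hypothetical, and the choice of `git revert` over other rollback methods is an assumption; the video does not show the exact commands used.

```python
import subprocess
from typing import List


def commit_cmd(message: str) -> List[str]:
    """Build a command that stages tracked changes and commits them."""
    return ["git", "commit", "-a", "-m", message]


def rollback_cmd() -> List[str]:
    """Build a command that undoes the latest commit with a new commit.

    Reverting (rather than rewriting history) keeps the rejected
    experiment visible in the log as part of the audit trail.
    """
    return ["git", "revert", "--no-edit", "HEAD"]


def run(cmd: List[str], repo_dir: str) -> None:
    """Execute a git command inside the strategy repo, raising on failure."""
    subprocess.run(cmd, cwd=repo_dir, check=True)
```

A caller would run `run(commit_cmd("experiment: asymmetry filter"), repo_dir)` before a dry run and `run(rollback_cmd(), repo_dir)` if the score evaluator rejects the result.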

Experiment Lifecycle

  1. Agent reads the training program and proposes a new experiment (e.g., asymmetry filters, spread-relative-to-edge filters)
  2. Strategy code is updated and committed
  3. The bot runs in dry mode (no real orders placed) for 1 hour
  4. Results are scored. If the experiment beats the current best, it is kept; if the result is unusually strong, a confirmation run is triggered to guard against a lucky outcome in noisy data
  5. Learnings are appended to the experiment history, giving the agent richer context for the next iteration
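The keep-or-discard step, including the confirmation run, can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's code: the names ResearchState, evaluate_experiment, run_dry, and surprise_ratio are hypothetical, the 1.5x threshold for "unusually strong" is invented, and taking the lower of the two scores after a confirmation run is one possible policy among several.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ResearchState:
    best_score: float
    history: List[str] = field(default_factory=list)


def evaluate_experiment(
    state: ResearchState,
    name: str,
    run_dry: Callable[[], float],
    surprise_ratio: float = 1.5,  # assumed threshold for "unusually strong"
) -> bool:
    """Run one experiment and decide whether to keep it.

    run_dry stands in for a 1-hour dry run and returns a score
    (higher is better).
    """
    score = run_dry()
    # An unusually strong result triggers a confirmation run; the more
    # conservative of the two scores is used (an assumed policy).
    if state.best_score > 0 and score > state.best_score * surprise_ratio:
        score = min(score, run_dry())
    keep = score > state.best_score
    if keep:
        state.best_score = score
    # Learnings are appended whether or not the experiment is kept.
    state.history.append(f"{name}: score={score:.2f}, kept={keep}")
    return keep


# Made-up scores for illustration only; they are not results from the video.
state = ResearchState(best_score=10.0)
fake_scores = iter([25.0, 11.0, 9.0])
kept_first = evaluate_experiment(state, "asymmetry filter", lambda: next(fake_scores))
kept_second = evaluate_experiment(state, "spread filter", lambda: next(fake_scores))
```

In this toy run the first experiment scores 25.0, which exceeds 1.5 times the current best, so it is rerun; the confirmation scores 11.0, and the experiment is kept at that lower value. The second experiment scores 9.0 and is discarded.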

Claude Code as the Orchestrator

The entire loop runs inside Claude Code, with Codex used to assist with setup. The agent starts new experiments, evaluates results, updates code, and commits on its own, with no human intervention needed between cycles.

The Arbitrage Strategy

The bot targets the 5-minute Bitcoin up/down market on Polymarket:

  • Buy both "up" and "down" positions when their combined cost is below $1.00 (e.g., $0.49 + $0.50 = $0.99)
  • Since one side always wins and pays $1.00, the $0.01 difference is locked in as profit on each pair, provided both sides fill at those prices
  • Win rate is effectively 100% because both sides of the market are covered
  • The autoresearch loop optimizes parameters like fill rate, entry timing, and edge detection to maximize how often profitable trades are executed
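The arithmetic behind the strategy can be written out as a short sketch. The function names are hypothetical, fees are assumed to be zero, and both sides are assumed to fill completely. Reading a "$5 package" as 5 paired shares is also an assumption, though it is consistent with the $0.15 profit reported for the 97-cent trade.

```python
from decimal import Decimal
from typing import Optional

PAYOUT = Decimal("1.00")  # the winning side pays $1.00 per share


def pair_edge(up_price: Decimal, down_price: Decimal) -> Optional[Decimal]:
    """Return the locked-in profit per pair, or None if there is no edge."""
    edge = PAYOUT - (up_price + down_price)
    return edge if edge > 0 else None


def package_profit(combined_cost: Decimal, pairs: int) -> Decimal:
    """Profit from buying `pairs` shares of both sides at a combined cost."""
    return (PAYOUT - combined_cost) * pairs


# The example from the text: $0.49 + $0.50 = $0.99, leaving $0.01 per pair.
example_edge = pair_edge(Decimal("0.49"), Decimal("0.50"))

# No trade when the two sides cost $1.00 or more together.
no_edge = pair_edge(Decimal("0.51"), Decimal("0.50"))

# A 97-cent combined cost across 5 pairs gives 5 * $0.03 = $0.15.
best_trade = package_profit(Decimal("0.97"), 5)
```

Decimal is used instead of float so that cent-level differences such as 1.00 - 0.99 come out exact.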

Live Trading Results

The best strategy found in the dry-mode experiments was then run live with $5 package sizes:

  • Starting balance: $150
  • Completed 5 out of 5 trades successfully
  • All trades resolved as wins (arbitrage guarantees this)
  • One trade filled at a combined cost of 97 cents, yielding $0.15 profit on that trade alone
  • Total profit over ~20 minutes: approximately $2
  • Ending balance: $152

"We made $2 in like 20 minutes. Five out of five trades. Everything was pretty good and all was arbitrage."

Key Takeaways

  • Autoresearch is domain-agnostic: Karpathy built it for model training, but the same loop was adapted here to trading strategy optimization
  • Noisy environments need confirmation runs: The extra validation step guards against overfitting to lucky outcomes in volatile markets
  • Git-based experiment tracking provides a natural audit trail and rollback mechanism for strategy evolution
  • The approach can be adapted to other Polymarket categories or entirely different optimization problems