openai/human-eval

Official

Code for the paper "Evaluating Large Language Models Trained on Code"

3,205 stars · 441 forks · Python · Updated January 2025

Best for

Developer
✅ Reviewed by My AI Guide — vetted for vibe builders

Our Review

  • Standard benchmark for code models
  • Execution disabled by default for safety
  • Easy integration for model eval
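
The "easy integration" point boils down to a simple flow: write one model completion per task to a JSONL file, then score it with the repository's `evaluate_functional_correctness` command. A minimal sketch using only the standard library — the `task_id` format and CLI call follow the repository README, and the completion text here is a placeholder, not real model output:

```python
import json

# Each record pairs a HumanEval task_id with a model-generated completion.
samples = [
    {"task_id": "HumanEval/0", "completion": "    return True\n"},
]

# Write one JSON object per line (the JSONL format the harness expects).
with open("samples.jsonl", "w") as f:
    for sample in samples:
        f.write(json.dumps(sample) + "\n")

# Scoring (per the repository README, after installing human-eval):
#   evaluate_functional_correctness samples.jsonl
# This writes per-sample results and prints pass@k estimates.
```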

Cons

  • Code execution must be enabled manually, and generated code should run in a sandbox
  • Limited to 164 hand-written problems
  • Python 3.7+ only

Our Verdict

Essential tool for evaluating code LLMs; straightforward to use, but run generated code only in a sandboxed environment.
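
The evaluation the verdict refers to centers on the paper's pass@k metric: generate n samples per problem, count the c that pass the unit tests, and estimate the probability that at least one of k random draws passes. A self-contained sketch of the unbiased estimator from the paper (the function name here is mine):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator from the HumanEval paper:
    1 - C(n - c, k) / C(n, k), i.e. one minus the probability
    that all k samples drawn from n (with c correct) fail."""
    if n - c < k:
        return 1.0  # fewer than k failures exist, so some draw must pass
    return 1.0 - comb(n - c, k) / comb(n, k)

print(round(pass_at_k(10, 3, 1), 6))  # → 0.3, the plain fraction correct
```

For k = 1 this reduces to c / n; for larger k it corrects the bias of naively averaging per-sample success rates.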

Frequently Asked Questions

What is human-eval?

human-eval is OpenAI's official code for the paper "Evaluating Large Language Models Trained on Code". It provides the HumanEval benchmark of 164 hand-written programming problems plus a harness for checking generated solutions and computing pass@k.

How do I install human-eval?

Visit the GitHub repository at https://github.com/openai/human-eval for installation instructions.
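
Per the repository README, installation is a clone plus an editable pip install (Python 3.7+ required). Note that execution of model-generated code ships disabled; the README directs you to uncomment the execution call in the source after reading its safety warning:

```shell
# Clone and install in editable mode (per the repository README)
git clone https://github.com/openai/human-eval
pip install -e human-eval
```

Before scoring any samples, you must also enable code execution by uncommenting the relevant line in the package's execution module, ideally inside a sandboxed environment.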

What license does human-eval use?

human-eval uses the MIT license.

What are alternatives to human-eval?

Explore related tools and alternatives on My AI Guide.

Great for: Pro Vibe Builders

Skip if: You need something more beginner-friendly or guided

🔒

Open source & community-verified

MIT licensed — free to use in any project, no strings attached. 3,205 developers have starred this repository, a strong signal of community adoption and trust.

Reviewed by My AI Guide for relevance, quality, and active maintenance before listing.