| Rank | Task | Model | Score | Prompt | Author | Date |
|---|---|---|---|---|---|---|
| 1 | GPQA Diamond | Claude 3.5 / GPT-4o | 100% | | Prax | Dec 6 |
| 2 | MATH Hard | Gemini 1.5 / Llama 405B | 96.1% | | Prax | Dec 5 |
| 3 | Codeforces 3500+ | Grok-4 / GPT-4o | 91% | | Prax | Dec 5 |
| 4 | MMLU-Pro | Claude 3.5 / Gemini 1.5 | 92.8% | | Prax | Dec 5 |
| 5 | WebArena Agent | o1 / Auto-GPT | 100% | | Prax | Dec 5 |
| 6 | GPQA Diamond v2 | Claude 3.5 Opus | 91.1% | | Prax | Dec 5 |
| 7 | MATH Olympiad Gold | o1-preview | 98.4% | | Prax | Dec 5 |
| 8 | HumanEval+ Fix | GPT-4o | 99.4% | | Prax | Dec 5 |
| 9 | Long Context 128k | Gemini 1.5 Pro | 100% | | Prax | Dec 5 |
| 10 | Legal Reasoning | Claude 3.5 | 97% | | Prax | Dec 5 |

Unlock Pro Vault • 500+ Private Prompts • $299 Lifetime or $49/mo (5-day free trial)

First 100 lifetime spots • New prompts daily

Submit a better prompt → get featured + free lifetime Pro Vault