Loved seeing Moatless Tools used to set a new SoTA on SWE-Bench Lite via multi-shot sampling (active search).
Read the paper
From a related article on Medium:
“Impressively, when running DeepSeek-V2-coder, a small language model with multiple sampling, the model outperformed state-of-the-art models like GPT-4o or Claude 3.5 Sonnet, achieving a new state-of-the-art 56% in SWE-Bench Lite (a benchmark that evaluates a model’s capacity to solve GitHub issues), while these two models, combined, achieved 43% (Mixed models).”
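For readers unfamiliar with the approach, "multi-shot" here means sampling several independent candidate solutions from the model and keeping one that a verifier accepts, rather than trusting a single greedy completion. A minimal sketch of that idea follows; the helper names (`generate_patch`, `patch_passes_tests`) are hypothetical stand-ins, not the Moatless Tools API:

```python
# Sketch of multi-shot (best-of-N) sampling. All helpers below are
# hypothetical placeholders for the real model call and test runner.
import random


def generate_patch(issue: str, temperature: float) -> str:
    """Stand-in for one sampled model completion (a candidate patch)."""
    return f"patch for {issue!r} (t={temperature}, seed={random.random():.3f})"


def patch_passes_tests(patch: str) -> bool:
    """Stand-in verifier: in practice, apply the patch and run the test suite."""
    return random.random() > 0.5


def solve_multi_shot(issue: str, n_samples: int = 8) -> str | None:
    # Sample several independent candidates at nonzero temperature so the
    # model explores different solutions, then keep the first one that the
    # verifier accepts.
    for _ in range(n_samples):
        candidate = generate_patch(issue, temperature=0.8)
        if patch_passes_tests(candidate):
            return candidate
    return None  # no sample passed; the issue counts as unresolved


print(solve_multi_shot("example GitHub issue"))
```

The trade-off is straightforward: each extra sample costs another model call, but a cheaper model sampled N times with a reliable verifier can outperform a single pass from a stronger model, which is the effect the quoted result describes.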