☕︎ DomainEval ☕️

An Auto-Constructed Benchmark for Multi-Domain Code Generation

GitHub · Paper · Space · Data
🏆 Leaderboard (# | Model | Pass)

📝 Submission

Thank you for your interest in DomainEval. We warmly welcome researchers to submit additional benchmark results, as collaborative efforts can significantly advance the study of Large Language Models and software engineering. For submission guidelines, please refer to our GitHub repository.

🤗 Acknowledgement

Thanks to EvalPlus for sharing the leaderboard template. In addition to the DomainEval leaderboard, we recommend evaluating LLM coding ability comprehensively through a diverse set of benchmarks and leaderboards, such as: