DyCodeEval

DyCodeEval (ICML 2025) enables dynamic benchmarking for code LLMs. This collection features dynamic HumanEval and MBPP sets generated with Claude 3.5.

Datasets:
- CM/Dynamic_HumanEvalZero (updated Jun 23)
- CM/Dynamic_MBPP_sanitized (updated Jun 23)

Paper: Dynamic Benchmarking of Reasoning Capabilities in Code Large Language Models Under Data Contamination (arXiv:2503.04149, published Mar 6)