OpenEvals

community

AI & ML interests

LLM evaluation

Recent Activity

SaylorTwift new activity 1 day ago

OpenEvals/MuSR: [bot] Conversion to Parquet

SaylorTwift updated a dataset 5 days ago

OpenEvals/MuSR

SaylorTwift updated a dataset 5 days ago

OpenEvals/aime_24

OpenEvals's Spaces (9)

Evaluation Guidebook

📝

Display benchmark evaluation data for LLMs

Benchmark Finder

📚

A space to view and inspect all the tasks in lighteval

Find a leaderboard

🔍

Explore and discover all leaderboards from the HF community

README

⚖

Aa Omniscience

🐠

Display and inspect log files

InferenceProviderTestingBackend

📈

Launch and monitor model evaluation jobs

Evals

🐨

Run your LLM evaluations on the hub

🐢

Generate a command to run model evaluations

Tokenizers Languages

🐠

Compare tokenization lengths across languages