
Asaf Yehudai

Asaf-Yehudai

AI & ML interests

None yet

Recent Activity

liked a dataset 6 days ago

Exgentic/agent-llm-traces

upvoted a paper 16 days ago

Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows

upvoted a paper 16 days ago

ClawMark: A Living-World Benchmark for Multi-Turn, Multi-Day, Multimodal Coworker Agents


Organizations

New activity in evaleval/EEE_datastore 21 days ago

[Submission] HAL Leaderboard - 9 agentic benchmarks (246 entries)

#80 opened 25 days ago by Asaf-Yehudai

New activity in lmarena-ai/arena-human-preference-140k 8 months ago

Missing models compared to the Arena-Hard-v2.0-Preview

#2 opened 8 months ago by Asaf-Yehudai

New activity in gaia-benchmark/leaderboard 9 months ago

Access to the submission and evaluation data

#69 opened 9 months ago by Asaf-Yehudai

commented a paper 10 months ago

CLEAR: Error Analysis via LLM-as-a-Judge Made Easy

Paper • 2507.18392 • Published Jul 24, 2025 • 20

commented 5 papers about 1 year ago

commented a paper over 1 year ago

Selective Self-to-Supervised Fine-Tuning for Generalization in Large Language Models

Paper • 2502.08130 • Published Feb 12, 2025 • 9

New activity in RLHFlow/ArmoRM-Llama3-8B-v0.1 almost 2 years ago

Problem running the model

#1 opened almost 2 years ago by Asaf-Yehudai

New activity in mistralai/Mixtral-8x22B-Instruct-v0.1 about 2 years ago

MT-Bench Results

👍 5

#8 opened about 2 years ago by 0-hero

New activity in Nexusflow/Starling-RM-34B about 2 years ago

Bug with example code:

#1 opened about 2 years ago by Asaf-Yehudai

New activity in microsoft/phi-2 over 2 years ago

How to Train model with AutoModelForSequenceClassification?

👍 2

#20 opened over 2 years ago by jerife

New activity in dfurman/Falcon-40B-Chat-v0.1 almost 3 years ago

qlora - need to be applied and few more places

#4 opened almost 3 years ago by Asaf-Yehudai

New activity in timdettmers/openassistant-guanaco almost 3 years ago

Guanaco?

#1 opened almost 3 years ago by edensn

New activity in eachadea/vicuna-13b-1.1 about 3 years ago

running the model in Python

#3 opened about 3 years ago by Asaf-Yehudai
