RLHF-And-Friends
community
models 27
RLHF-And-Friends/RM-TLDR-TLDR-Qwen2-0.5B-SmallSFT-lr-1e-5
Text Classification • 0.5B • Updated • 1
RLHF-And-Friends/RM-TLDR-TLDR-Qwen2-0.5B-SmallSFT
Text Classification • 0.5B • Updated • 3
RLHF-And-Friends/TLDR-Qwen2-0.5B-SmallSFT
Text Generation • 0.5B • Updated • 7
RLHF-And-Friends/TLDR-Llama-3.2-1B-SmallSFT-RM
Text Classification • 1B • Updated • 1
RLHF-And-Friends/TLDR-Llama-3.2-1B-SmallSFT
Text Generation • 1B • Updated • 8
RLHF-And-Friends/Wiki-Lingua-Llama-3.2-3B-RM
Text Classification • 3B • Updated • 1
RLHF-And-Friends/TLDR-Llama-3.2-3B-SmallSFT-RM
Text Classification • 3B • Updated
RLHF-And-Friends/TLDR-Llama-3.2-3B-SmallSFT-RM-lr-1e-5
Text Classification • 3B • Updated • 1
RLHF-And-Friends/TLDR-Llama-3.2-3B-SmallSFT-lr-1e-5
Text Generation • 3B • Updated • 2
RLHF-And-Friends/TLDR-Llama-3.2-3B-SmallSFT
Text Generation • 3B • Updated • 2
datasets 13
RLHF-And-Friends/alpaca-cleaned
Viewer • Updated • 51.8k • 3
RLHF-And-Friends/tldr-thematic
Viewer • Updated • 130k • 29
RLHF-And-Friends/wiki-lingua-ppo
Viewer • Updated • 493k • 3
RLHF-And-Friends/wiki-lingua-reward
Viewer • Updated • 77k • 5
RLHF-And-Friends/wiki-lingua-preference
Viewer • Updated • 77k • 15
RLHF-And-Friends/wiki-lingua-paired
Viewer • Updated • 77k • 15
RLHF-And-Friends/wiki-lingua
Viewer • Updated • 742k • 14
RLHF-And-Friends/helpsteer3-multilingual
Viewer • Updated • 8.06k • 32
RLHF-And-Friends/helpsteer3-code
Viewer • Updated • 8.86k • 22 • 2
RLHF-And-Friends/tldr-ppo
Viewer • Updated • 113k • 4