Xiaoyang Cao
Sean13
AI & ML interests
RLHF, Deep Reinforcement Learning
Recent Activity
updated a model about 1 month ago
Sean13/llama-8b-instruct-v0.2-cpo-full-label_smoothing-0.1
published a model about 1 month ago
Sean13/llama-8b-instruct-v0.2-cpo-full-label_smoothing-0.1
updated a model about 1 month ago
Sean13/mistral-7b-instruct-v0.2-cpo-full-label_smoothing-0.1
Organizations
None yet