PKU-Alignment

university

https://github.com/PKU-Alignment

AI & ML interests

Reinforcement Learning, Large Language Models, Value Alignment

Recent Activity

muchvo published a dataset 9 days ago

PKU-Alignment/VLA-Arena-Scenes

XuehaiPan authored a paper 20 days ago

AI Alignment: A Comprehensive Survey

XuehaiPan authored a paper 20 days ago

Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction

View all activity

muchvo

published a dataset 9 days ago

PKU-Alignment/VLA-Arena-Scenes

Updated 9 days ago • 12 • 1

XuehaiPan

authored 5 papers 20 days ago

AI Alignment: A Comprehensive Survey

Paper • 2310.19852 • Published Oct 30, 2023

Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction

Paper • 2402.02416 • Published Feb 4, 2024 • 4

Reward Generalization in RLHF: A Topological Perspective

Paper • 2402.10184 • Published Feb 15, 2024

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published Jan 22 • 127

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published 22 days ago • 228

alignmentforever

updated a dataset 2 months ago

PKU-Alignment/InterMT

Preview • Updated Oct 23 • 193

sitong-fang

updated a dataset 5 months ago

PKU-Alignment/TruthfulVQA-image

Viewer • Updated Jul 25 • 5.1k • 91 • 1

sitong-fang

published a dataset 5 months ago

PKU-Alignment/TruthfulVQA-image

Viewer • Updated Jul 25 • 5.1k • 91 • 1

sitong-fang

updated a dataset 5 months ago

PKU-Alignment/TruthfulVQA-text

Viewer • Updated Jul 24 • 15.3k • 18

sitong-fang

published a dataset 5 months ago

PKU-Alignment/TruthfulVQA-text

Viewer • Updated Jul 24 • 15.3k • 18

dayone3nder

updated a dataset 6 months ago

PKU-Alignment/self-monitor

Viewer • Updated Jul 9 • 11.3k • 79

dayone3nder

published a dataset 6 months ago

PKU-Alignment/self-monitor

Viewer • Updated Jul 9 • 11.3k • 79

Gaie

updated a collection 7 months ago

Language Model Resist Alignment

This repository hosts open-sourced models of "Language Model Resist Alignment" (ACL 2025 Main). • 302 items • Updated Jun 11 • 1