Haon Park

redteamhacker

Recent Activity

upvoted a paper 3 days ago

COMPASS: A Framework for Evaluating Organization-Specific Policy Alignment in LLMs

Paper • 2601.01836 • Published 12 days ago • 7

liked a dataset 4 days ago

HAERAE-HUB/HAERAE-VISION

Viewer • Updated 4 days ago • 165 • 122 • 9

upvoted a paper 4 days ago

What Users Leave Unsaid: Under-Specified Queries Limit Vision-Language Models

Paper • 2601.06165 • Published 10 days ago • 15

liked 2 models 11 days ago

AIM-Intelligence/COMPASS_gemma-3-4b-it_LoRA

Image-to-Text • 4B • Updated 11 days ago • 9 • 2

AIM-Intelligence/COMPASS_Qwen2.5-7B-Instruct_LoRA

Text Generation • 8B • Updated 11 days ago • 19 • 2

liked 2 datasets 11 days ago

AIM-Intelligence/COMPASS-Policy-aware-SFT-Dataset

Viewer • Updated 11 days ago • 4.12k • 22 • 2

AIM-Intelligence/COMPASS-Policy-Alignment-Testbed-Dataset

Viewer • Updated 11 days ago • 5.92k • 165 • 10

upvoted 5 papers about 1 month ago

X-Teaming Evolutionary M2S: Automated Discovery of Multi-turn to Single-turn Jailbreak Templates

Paper • 2509.08729 • Published Sep 10, 2025 • 1

ObjexMT: Objective Extraction and Metacognitive Calibration for LLM-as-a-Judge under Multi-Turn Jailbreaks

Paper • 2508.16889 • Published Aug 23, 2025 • 2

One-Shot is Enough: Consolidating Multi-Turn Attacks into Efficient Single-Turn Prompts for LLMs

Paper • 2503.04856 • Published Mar 6, 2025 • 2

ELITE: Enhanced Language-Image Toxicity Evaluation for Safety

Paper • 2502.04757 • Published Feb 7, 2025 • 2

sudo rm -rf agentic_security

Paper • 2503.20279 • Published Mar 26, 2025 • 1

authored 8 papers about 1 month ago

Humanity's Last Exam

Paper • 2501.14249 • Published Jan 24, 2025 • 77