Yohan Na PRO

nayohan

AI & ML interests

NLP, Dialogue systems

Recent Activity

liked a dataset about 23 hours ago

nebius/SWE-rebench-openhands-trajectories

upvoted a paper 4 days ago

Masking Teacher and Reinforcing Student for Distilling Vision-Language Models

liked a dataset 5 days ago

allenai/IF_multi_constraints_upto5

upvoted a paper 4 days ago

Masking Teacher and Reinforcing Student for Distilling Vision-Language Models

Paper • 2512.22238 • Published 11 days ago • 17

upvoted a collection 12 days ago

Nemotron-Post-Training-v3

Collection of datasets used in the post-training phase of Nemotron Nano v3. • 7 items • Updated 11 days ago • 54

upvoted an article 27 days ago

We Got Claude to Fine-Tune an Open Source LLM

Article • Published about 1 month ago • 555

upvoted an article about 1 month ago

A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons

Article • Published Feb 4, 2025 • 28

upvoted a paper about 2 months ago

Scaling Latent Reasoning via Looped Language Models

Paper • 2510.25741 • Published Oct 29, 2025 • 221

upvoted an article about 2 months ago

The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix

Article • Published Nov 3, 2025 • 53

upvoted a paper 2 months ago

Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published Oct 30, 2025 • 119

upvoted a collection 3 months ago

KORMo pretraining datasets

The pretraining datasets for KORMo-10B were collected from diverse, publicly available sources. • 14 items • Updated Oct 13, 2025 • 20

upvoted 2 papers 3 months ago

KORMo: Korean Open Reasoning Model for Everyone

Paper • 2510.09426 • Published Oct 10, 2025 • 83

Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought

Paper • 2510.04230 • Published Oct 5, 2025 • 26

upvoted a collection 3 months ago

Qwen3-VL

37 items • Updated 3 days ago • 553

upvoted 2 articles 3 months ago

Introducing RTEB: A New Standard for Retrieval Evaluation

Article • Published Oct 1, 2025 • 132

mmBERT: ModernBERT goes Multilingual

Article • Published Sep 9, 2025 • 132

upvoted a collection 3 months ago

[Dataset] FineWeb2 Edu Korean

5 items • Updated Jul 24, 2025 • 2

upvoted an article 4 months ago

Accelerate ND-Parallel: A guide to Efficient Multi-GPU Training

Article • Published Aug 8, 2025 • 90

upvoted a collection 4 months ago

AI2 Safety Toolkit

Safety data, moderation tools and safe LLMs. • 6 items • Updated 11 days ago • 8

upvoted 2 papers 7 months ago

Essential-Web v1.0: 24T tokens of organized web data

Paper • 2506.14111 • Published Jun 17, 2025 • 46

Enabling Chatbots with Eyes and Ears: An Immersive Multimodal Conversation System for Dynamic Interactions

Paper • 2506.00421 • Published May 31, 2025 • 5

upvoted a collection 8 months ago

Qwen3

84 items • Updated 3 days ago • 1.53k

upvoted a paper 9 months ago

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published Apr 18, 2025 • 139