26 4

Zhe Zhou

zhezhou1106

AI & ML interests

None yet

Recent Activity

updated a model 18 days ago

zhezhou1106/political-leaning-classifier-v2

published a model 18 days ago

zhezhou1106/political-leaning-classifier-v2

liked a model 29 days ago

deepseek-ai/DeepSeek-V4-Flash

View all activity

Organizations

upvoted an article about 1 month ago

Article

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

NormalUhr

•

Feb 7, 2025

• 293

upvoted 8 papers about 1 month ago

Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability

Paper • 2604.06628 • Published Apr 8 • 326

Adam's Law: Textual Frequency Law on Large Language Models

Paper • 2604.02176 • Published Apr 2 • 503

upvoted 2 papers about 2 months ago

DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models

Paper • 2603.26164 • Published Mar 27 • 364

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published Mar 20 • 351

upvoted an article 2 months ago

Article

KV Caching Explained: Optimizing Transformer Inference Efficiency

not-lain

•

Jan 30, 2025

• 336

upvoted a collection 2 months ago

Deepseek Papers

Collection

Deepseek papers collection • 31 items • Updated about 7 hours ago • 345

upvoted 3 papers 3 months ago

Beyond Language Modeling: An Exploration of Multimodal Pretraining

Paper • 2603.03276 • Published Mar 3 • 105

What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models

Paper • 2503.24235 • Published Mar 31, 2025 • 55

TermiGen: High-Fidelity Environment and Robust Trajectory Synthesis for Terminal Agents

Paper • 2602.07274 • Published Feb 6 • 210

upvoted a paper 5 months ago

mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published Dec 31, 2025 • 328

upvoted 3 papers 6 months ago

How Far Are We from Genuinely Useful Deep Research Agents?

Paper • 2512.01948 • Published Dec 1, 2025 • 58

DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning

Paper • 2511.22570 • Published Nov 27, 2025 • 95

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published Dec 1, 2025 • 106