AI & ML interests

The AI community building the future.

Recent Activity

Articles

allenai/Molmo2-8B

#6872 opened about 2 hours ago by davanstrien

XiaomiMiMo/MiMo-V2-Flash

#6868 opened about 3 hours ago by davanstrien
AdinaY posted an update about 8 hours ago
Finch 💰 is an enterprise-grade benchmark that measures whether AI agents can truly handle real-world finance & accounting work.

FinWorkBench/Finch

✨ Built from real enterprise data (Enron + financial institutions), not synthetic tasks
✨ Tests end-to-end finance workflows
✨ Multimodal & cross-file reasoning
✨ Expert-annotated (700+ hours) and genuinely challenging
sergiopaniego posted an update 4 days ago
🎄 last talk of the year about open AI and HF today at Universidad Rey Juan Carlos for undergrad students

always a pleasure to be back at my alma mater

🎅 slides: https://github.com/sergiopaniego/talks
sergiopaniego posted an update 5 days ago
TRL now includes agent training support for GRPO‼️

Train 🕵️ agents with 🔧 tools, enabling interaction with external functions and APIs.

And of course, a new notebook and scripts to get you up to speed

📘 notebook tutorial: https://github.com/huggingface/trl/blob/main/examples/notebooks/grpo_agent.ipynb

📂 script examples: https://github.com/huggingface/trl/blob/main/examples/scripts/grpo_agent.py

📦 TRL v0.26.0 release: https://github.com/huggingface/trl/releases/tag/v0.26.0
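
For a rough idea of the two pieces you typically write yourself for this kind of training, here is a minimal sketch. It assumes a tool can be a plain Python function with type hints and a docstring, and that a reward function receives the completions plus dataset columns as keyword arguments and returns one float per completion. The names `multiply`, `correctness_reward`, and the `answer` column are hypothetical and not taken from the TRL examples; the actual trainer wiring is in the notebook and script linked above and is not reproduced here.

```python
import re


def multiply(a: float, b: float) -> float:
    """Multiply two numbers.

    Args:
        a: The first factor.
        b: The second factor.

    Returns:
        The product of a and b.
    """
    # Hypothetical tool: a plain function the agent could call during a rollout.
    return a * b


def correctness_reward(completions, answer, **kwargs):
    """Give 1.0 when the last number in a completion equals the expected answer, else 0.0.

    `answer` is an assumed dataset column holding the expected numeric result.
    """
    rewards = []
    for completion, expected in zip(completions, answer):
        # Conversational completions are lists of message dicts; plain ones are strings.
        if isinstance(completion, list):
            text = completion[-1].get("content") or ""
        else:
            text = completion
        numbers = re.findall(r"-?\d+(?:\.\d+)?", text)
        matched = bool(numbers) and float(numbers[-1]) == float(expected)
        rewards.append(1.0 if matched else 0.0)
    return rewards
```

Scoring only the last number in the final message keeps the reward simple; a real setup would likely want stricter answer parsing.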
sergiopaniego posted an update 6 days ago
ICYMI, you can fine-tune open LLMs using Claude Code

just tell it:
“Fine-tune Qwen3-0.6B on open-r1/codeforces-cots”

and Claude submits a real training job on HF GPUs using TRL.

it handles everything:
> dataset validation
> GPU selection
> training + Trackio monitoring
> job submission + cost estimation
when it’s done, your model is on the Hub, ready to use

read more about the process: https://huggingface.co/blog/hf-skills-training
angt posted an update 6 days ago
installama.sh at the TigerBeetle 1000x World Tour!

Last week I had the chance to give a short talk during the TigerBeetle 1000x World Tour (organized by @jedisct1 👏), a fantastic event celebrating high-performance engineering and the people who love pushing systems to their limits!

In the talk, I focused on the CPU and Linux side of things, with a simple goal in mind: making the installation of llama.cpp instant, automatic, and optimal, no matter your OS or hardware setup.

For the curious, here are the links worth checking out:
Event page: https://tigerbeetle.com/event/1000x
GitHub repo: https://github.com/angt/installama.sh
Talk: https://youtu.be/pg5NOeJZf0o?si=9Dkcfi2TqjnT_30e

More improvements are coming soon. Stay tuned!
sergiopaniego posted an update 6 days ago
We just released TRL v0.26.0!

It comes packed with updates:
> Agent training with tools in GRPO
> New CISPO & SAPO losses + reasoning rewards
> vLLM quantization in colocate mode
> Dataset shuffling in SFT
> Lots of NEW examples
> Tons of fixes and documentation improvements

sergiopaniego posted an update 11 days ago
Want to get started with fine-tuning but don’t know where to begin? 🤓☝️

We’re expanding our collection of beginner-friendly free Colab notebooks so you can learn and fine-tune models using TRL at no cost

🔬 Check out the full list of free notebooks: https://huggingface.co/docs/trl/main/en/example_overview#notebooks

🔬 If you want more advanced content, we also have a lot to cover in the community tutorials: https://huggingface.co/docs/trl/community_tutorials

And now the obvious question: what would you like us to add next?
angt posted an update 12 days ago
I'm excited to share that https://installama.sh is up and running! 🚀

On Linux / macOS / FreeBSD it is easier than ever:
curl https://installama.sh | sh


And Windows just joined the party 🥳
irm https://installama.sh | iex

Stay tuned for new backends on Windows!
sergiopaniego posted an update 13 days ago
NEW: @mistralai released a fantastic family of multimodal models, Ministral 3.

You can fine-tune them for free on Colab using TRL ⚡️, supporting both SFT and GRPO

Links to the notebooks:
- SFT: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/sft_ministral3_vl.ipynb
- GRPO: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/grpo_ministral3_vl.ipynb
- TRL and more examples: https://huggingface.co/docs/trl/index