AI & ML interests

The AI community building the future.

Recent Activity

Articles

qgallouedec 
posted an update 3 days ago

TRL v1.3 ships day-one training support for Qwen 3.6 🚀

The new Qwen 3.6 family (Qwen/Qwen3.6-27B, Qwen/Qwen3.6-35B-A3B) reuses the Qwen3.5-MoE architecture but ships a slightly different chat template, so we updated the stack end-to-end: a new training template with {% generation %} markers, tool-call response schema routing, and tiny test models for the VLM matrix.

SFT with assistant-only loss works out of the box:

from trl import SFTConfig, SFTTrainer

trainer = SFTTrainer(
    model="Qwen/Qwen3.6-27B",
    args=SFTConfig(assistant_only_loss=True),
    train_dataset=dataset,
)
trainer.train()
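
Under the hood, assistant-only loss boils down to label masking. Here is a minimal conceptual sketch (illustrative, not TRL's actual implementation; `IGNORE_INDEX` and `mask_non_assistant` are made-up names): tokens outside assistant turns get the label -100, the index cross-entropy losses conventionally skip.

```python
# Conceptual sketch of assistant-only loss masking (not TRL internals).
# Non-assistant tokens get label -100, which cross-entropy ignores.
IGNORE_INDEX = -100

def mask_non_assistant(labels, assistant_mask):
    return [tok if is_assistant else IGNORE_INDEX
            for tok, is_assistant in zip(labels, assistant_mask)]

labels = [11, 12, 13, 14, 15]
assistant_mask = [False, False, True, True, True]  # only the reply tokens count
masked = mask_non_assistant(labels, assistant_mask)
# masked == [-100, -100, 13, 14, 15]
```

Only the assistant's reply tokens then contribute to the loss; prompt and system tokens are free context.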


So does GRPO tool-calling: just pass tools=[...] to GRPOTrainer.
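
To picture what handing over tools=[...] involves: a hypothetical sketch of how a plain Python function can be turned into a model-facing tool description (the schema shape below is illustrative, not TRL's actual format; `get_weather` is a made-up tool).

```python
import inspect

def get_weather(city: str) -> str:
    """Return a short weather report for a city."""
    return f"Sunny in {city}"

# A trainer accepting tools=[...] can derive the model-facing description
# from the function's name, docstring, and annotated parameters:
sig = inspect.signature(get_weather)
schema = {
    "name": get_weather.__name__,
    "description": inspect.getdoc(get_weather),
    "parameters": {name: param.annotation.__name__
                   for name, param in sig.parameters.items()},
}
# schema == {"name": "get_weather",
#            "description": "Return a short weather report for a city.",
#            "parameters": {"city": "str"}}
```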

v1.3 also brings:

- a new experimental TPO trainer (Triple Preference Optimization)
- speculative decoding in trl vllm-serve (Qwen3 MTP / Eagle3 drafts)
- 12 more KTO ↔ DPO alignment PRs (KTO promotion to stable is now in reach)
- three more {% generation %} chat templates (Gemma/Gemma 2, Phi-3, GLM-4-MoE)
- a chunky SFT entropy bug fix

Full release notes: https://github.com/huggingface/trl/releases/tag/v1.3.0
qgallouedec 
posted an update 12 days ago
TRL v1.2 introduces the SSDTrainer 🚀

Simple Self-Distillation (SSD) from Apple's paper "Embarrassingly Simple Self-Distillation Improves Code Generation" is now available as an experimental trainer in TRL.

The recipe is as minimal as the name suggests: sample completions from the model itself at a training-time temperature, then fine-tune on those raw, unverified samples with plain cross-entropy. No reward model. No verifier. No teacher model. No reinforcement learning. Just prompts and the model.

from trl.experimental.ssd import SSDConfig, SSDTrainer

trainer = SSDTrainer(
    model="Qwen/Qwen3-4B-Instruct",
    args=SSDConfig(temperature=0.6, top_k=20, top_p=0.95),
    train_dataset=dataset,
)
trainer.train()
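
The whole recipe fits in a toy numerical sketch (a single-step categorical "model" here; everything below is illustrative, not SSDTrainer internals): sample from the model's own temperature-scaled distribution, then take plain cross-entropy on those raw samples.

```python
import numpy as np

rng = np.random.default_rng(0)
logits = np.array([2.0, 1.0, 0.5, 0.1])  # toy "model": one categorical over 4 tokens

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# 1) sample completions from the model itself at a training-time temperature
samples = rng.choice(len(logits), size=64, p=softmax(logits / 0.6))

# 2) train on the raw, unverified samples with plain cross-entropy
#    (no reward model, no verifier, no teacher)
log_probs = np.log(softmax(logits))
loss = -log_probs[samples].mean()
```

In the real trainer the distribution is a full autoregressive model, but the loop is the same: generate, then fit.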


v1.2 also ships:

- expanded tool-calling support (LLaMA 3.1 / 3.2, DeepSeek-V3)
- another round of KTO ↔ DPO alignment, getting us closer to promoting KTO to stable
- a big GRPO simplification for overlong tool results
- deprecation of use_transformers_paged
- key fixes for VLM response parsing

Full release notes: https://github.com/huggingface/trl/releases/tag/v1.2.0
sergiopaniego 
posted an update 14 days ago
Earlier this month, Apple introduced Simple Self-Distillation: a fine-tuning method that improves models on coding tasks just by sampling from the model and training on its own outputs with plain cross-entropy.

And… it's already supported in TRL, built by Kashif Rasul. You can really feel the pace of development in the team 🐎

Paper by Ruixiang ZHANG, He Bai, Huangjie Zheng, Navdeep Jaitly, Ronan Collobert, Yizhe Zhang at Apple 🍎

How it works: the model generates completions at a training-time temperature (T_train) with top_k/top_p truncation, then fine-tunes on them with plain cross-entropy. No labels or verifier needed.
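
The truncation step can be sketched in a few lines; this is a generic top-k/top-p filter for illustration, not TRL's implementation (and real samplers may order the two cutoffs differently):

```python
import numpy as np

def top_k_top_p_filter(logits, top_k=20, top_p=0.95):
    """Mask logits outside the top_k set and outside the top_p nucleus."""
    order = np.argsort(logits)[::-1]          # indices, largest logit first
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    cum = np.cumsum(probs[order])
    # keep the smallest prefix reaching top_p mass, capped at top_k entries
    cutoff = min(int(np.searchsorted(cum, top_p)) + 1, top_k)
    keep = np.zeros(len(logits), dtype=bool)
    keep[order[:cutoff]] = True
    return np.where(keep, logits, -np.inf)

filtered = top_k_top_p_filter(np.array([3.0, 1.0, 0.5, -2.0]), top_k=2, top_p=0.95)
# only the two largest logits survive; the rest become -inf
```

Sampling then proceeds from a softmax over the filtered logits, so masked tokens get probability zero.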

You can try it right away with this ready-to-run example (Qwen3-4B on rStar-Coder):
https://github.com/huggingface/trl/blob/main/trl/experimental/ssd/ssd.py
or benchmark a checkpoint with the eval script:
https://github.com/huggingface/trl/blob/main/trl/experimental/ssd/ssd_eval.py

One neat insight from the paper: T_train and T_eval compose into an effective T_eff = T_train × T_eval, so a broad band of configs works well. Even very noisy samples still help.
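
One way to picture the composition: dividing logits by T_train and then by T_eval is the same as dividing once by T_train × T_eval. A minimal numerical check of that logit-scaling identity (illustrative; the paper's claim is about the full training/eval pipeline, but this is the underlying algebra):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

logits = np.array([1.5, 0.2, -0.7, 2.1])
t_train, t_eval = 0.6, 1.3

p_two_step = softmax((logits / t_train) / t_eval)   # scale in two stages
p_one_step = softmax(logits / (t_train * t_eval))   # scale once by T_eff

assert np.allclose(p_two_step, p_one_step)
```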

Want to dig deeper?

Paper: Embarrassingly Simple Self-Distillation Improves Code Generation (2604.01193)
Trainer docs: https://huggingface.co/docs/trl/main/en/ssd_trainer
victor 
posted an update 15 days ago
Want to share my enthusiasm for zai-org/GLM-5.1 here too 🔥

I think we have it: our open-source Claude Code = GLM-5.1 + Pi (https://pi.dev/). I built a Three.js racing game to evaluate it, and it's extremely impressive. Thoughts:

- One-shot car physics with real drift mechanics (this is hard)

- My fav part: awesome at self-iterating (with no vision!). It created 20+ Bun.WebView debugging tools to drive the car programmatically and read game state, and proved a winding-order bug with vector math without ever seeing the screen

- 531-line racing AI in a single write: 4 personalities, curvature map, racing lines, tactical drifting. Built telemetry tools to compare player vs AI speed curves and data-tuned parameters

- All assets from scratch: 3D models, procedural textures, sky shader, engine sounds, spatial AI audio!

- Can do hard math: proved road normals pointed DOWN via vector cross products, computed track curvature normalized by arc length to tune AI cornering speed
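
The normals check described above can be sketched with a cross product (a hypothetical minimal version, not the model's actual code; the vertex values are made up, with y as the up axis):

```python
import numpy as np

def triangle_normal(a, b, c):
    """Unit normal of triangle (a, b, c); direction follows the winding order."""
    n = np.cross(b - a, c - a)
    return n / np.linalg.norm(n)

a = np.array([0.0, 0.0, 0.0])
b = np.array([1.0, 0.0, 0.0])
c = np.array([0.0, 0.0, -1.0])

up = triangle_normal(a, b, c)    # counter-clockwise winding: normal points up (+y)
down = triangle_normal(a, c, b)  # reversed winding: normal points down (-y)
# A road mesh whose normals come out negative in y has its triangles
# wound the wrong way.
```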

You are going to hear about this model a lot in the coming months. Open source, let's go, and thanks z-ai 🚀🚀
sergiopaniego 
posted an update 20 days ago
sergiopaniego 
posted an update 27 days ago
sergiopaniego 
posted an update 29 days ago
TRL is officially an adult 🥳

Excited to announce TRL v1.0❗️

Head to the blog to see how we got here and what’s next for this post-training library, designed to keep pace with the field

https://huggingface.co/blog/trl-v1