hanzlajavaid
PRO
hanzla
50 followers · 20 following
AI & ML interests
Direct Preference Optimization, Supervised Finetuning, Stable Diffusion
Recent Activity
Posted an update 3 days ago
Reinforcement learning can sometimes produce emergent behavior with much simpler training setups than large-scale pre-training. I explored this idea by running a small GRPO experiment on Qwen3.5 4B, and the results were pretty exciting. Hypothesis: improving visual mathematical reasoning may also improve the model’s ability to transcribe LaTeX from images. I wrote a short breakdown of the experiment here: https://hanzlajavaid.github.io/blog/grpo-experiment-exploring-emergent-properties/
Updated a model 10 days ago: hanzla/Qwen3.5-4B-mathvista-GRPO
Published a model 10 days ago: hanzla/Qwen3.5-4B-mathvista-GRPO
hanzla's models (32), sorted by most recently updated
hanzla/Qwen3.5-4B-mathvista-GRPO • 5B • Updated 10 days ago • 17
hanzla/Qwen3.5-4B-mathvista-GRPO-adapter • Updated 10 days ago • 12
hanzla/Llama-3.1-1b-finetuned • Updated Dec 5, 2025
hanzla/Qwen2-0.5B-SFT-summary • Updated Oct 7, 2025
hanzla/Qwen2-0.5B-GRPO-summary_test_v5 • Updated Sep 30, 2025
hanzla/qwen25-0p5b-grpo-exp1 • Text Generation • Updated Sep 28, 2025 • 3
hanzla/qwen25_0p5b_grpo_ds • Text Generation • Updated Sep 24, 2025 • 2
hanzla/Llama-3.1-1b-gsm8k-finetuned-labtest • Updated Apr 22, 2025
hanzla/Llama-3.1-1b-gsm8k-finetuned-new • Updated Apr 22, 2025
hanzla/Gemma3-1B-GRPO-summary_test_v4 • Updated Apr 21, 2025
hanzla/Qwen2-0.5B-GRPO-summary_test_v4 • Updated Apr 20, 2025
hanzla/Qwen2-0.5B-GRPO-summary_test_v3 • Updated Apr 20, 2025
hanzla/Qwen2-0.5B-GRPO-summary_test_v2 • Updated Apr 18, 2025
hanzla/Qwen2-0.5B-GRPO-summary_test • Updated Apr 17, 2025
hanzla/Qwen2-0.5B-GRPO-test • Updated Apr 17, 2025
hanzla/Falcon3-Mamba-R1-v0-4bit • Text Generation • 7B • Updated Mar 23, 2025 • 3
hanzla/Falcon3-Mamba-R1-v0 • Text Generation • 7B • Updated Mar 22, 2025 • 8 • 11
hanzla/mamba-finetuned-deepspeed-s1-deepseek • Updated Mar 17, 2025
hanzla/mamba-finetuned-deepspeed-openthoughts-r1-scorebased • Updated Mar 17, 2025
hanzla/falcon3-mamba-finetuned-multigpu-openthoughts-r1-scorebased • Updated Mar 6, 2025
hanzla/mamba-finetuned-s1 • Updated Mar 4, 2025
hanzla/mamba-finetuned-thinktoken • Updated Mar 4, 2025
hanzla/mamba_essay_classifier • Updated Dec 12, 2024
hanzla/bert-essay-classifier • Text Classification • 0.1B • Updated Dec 12, 2024 • 2
hanzla/Moondream-ocr-enhanced • Text Generation • 2B • Updated May 8, 2024 • 7 • 2
hanzla/gemma-2b-datascience-instruct-v5 • Text Generation • Updated Mar 31, 2024 • 7
hanzla/gemma-2b-datascience-instruct-v4.5 • Text Generation • Updated Mar 30, 2024 • 12 • 1
hanzla/gemma-2b-datascience-instruct-v4 • Text Generation • Updated Mar 30, 2024 • 7
hanzla/gemma-2b-datascience-instruct-v3.5 • Text Generation • Updated Mar 30, 2024 • 10
hanzla/gemma-2b-datascience-instruct-v3 • Text Generation • 3B • Updated Mar 26, 2024 • 5