qwen3-0.6-finetuned

This model is a fine-tuned version of Qwen/Qwen3-0.6B on the sh0416/ag_news dataset. It achieves an F1 score of 0.911 on the evaluation set.

If you would like to test the fine-tuned adapter yourself, you can load it with AutoModelForSequenceClassification.from_pretrained(), passing cli08/qwen3-0.6-finetuned as the model ID, as in the sketch below.
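A minimal loading sketch. The four-class label count and the label order in the final comment are assumptions based on the AG News dataset, and `peft` must be installed for the adapter to attach automatically:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Loading the adapter repo directly makes Transformers pull in the
# Qwen/Qwen3-0.6B base weights and attach the adapter (requires `peft`).
# num_labels=4 is an assumption from the four AG News classes.
model = AutoModelForSequenceClassification.from_pretrained(
    "cli08/qwen3-0.6-finetuned", num_labels=4
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")
model.config.pad_token_id = tokenizer.pad_token_id  # needed for batched, padded inputs

# Classify a sample AG News headline.
inputs = tokenizer("Wall St. Bears Claw Back Into the Black", return_tensors="pt")
with torch.no_grad():
    predicted_class = model(**inputs).logits.argmax(dim=-1).item()
print(predicted_class)  # assumed mapping: 0=World, 1=Sports, 2=Business, 3=Sci/Tech
```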

Fine-tuning Results

Initial F1    Fine-tuned F1
0.133         0.911
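The card does not state how the F1 score is averaged over the four AG News classes; a macro-averaged F1 via scikit-learn (an assumed metric setup, shown with toy predictions) would be computed like this:

```python
from sklearn.metrics import f1_score

# Toy labels and predictions over the four AG News classes (0..3).
y_true = [0, 1, 2, 3, 1, 0]
y_pred = [0, 1, 2, 2, 1, 0]

# Macro averaging is an assumption; it is a common choice for the
# four roughly balanced AG News classes.
print(f1_score(y_true, y_pred, average="macro"))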

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.001
  • num_train_epochs: 2
  • lr_scheduler_type: linear
  • gradient_accumulation_steps: 4
  • weight_decay: 0.01
  • per_device_train_batch_size: 8
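For reference, here is a minimal sketch of how these values map onto a Transformers TrainingArguments object; the output directory, and anything not listed above, are assumptions:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen3-0.6-finetuned",  # assumed; not stated on the card
    learning_rate=1e-3,
    num_train_epochs=2,
    lr_scheduler_type="linear",
    gradient_accumulation_steps=4,
    weight_decay=0.01,
    per_device_train_batch_size=8,
)
```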

Framework versions

  • PEFT 0.17.1
  • Transformers 4.57.1
  • Pytorch 2.8.0+cu126
  • Datasets 4.4.2
  • Tokenizers 0.22.1

Environment

Kaggle notebook with two NVIDIA T4 GPUs

Source Code

Training code is hosted on GitHub
