Instructions to use 36kevin/SCAU-IGEM-Model with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use 36kevin/SCAU-IGEM-Model with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="36kevin/SCAU-IGEM-Model")
```

```python
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("36kevin/SCAU-IGEM-Model", dtype="auto")
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use 36kevin/SCAU-IGEM-Model with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "36kevin/SCAU-IGEM-Model"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "36kevin/SCAU-IGEM-Model",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker:

```shell
docker model run hf.co/36kevin/SCAU-IGEM-Model
```
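The same OpenAI-compatible endpoint can also be called from Python. A minimal sketch using only the standard library, mirroring the curl call above (the actual request is commented out so the snippet runs without a live server):

```python
import json
import urllib.request

# Payload mirroring the curl example above.
payload = {
    "model": "36kevin/SCAU-IGEM-Model",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5,
}

# Build the request against the default vLLM port (8000).
req = urllib.request.Request(
    "http://localhost:8000/v1/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# With a vLLM server running locally, send it like this:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["text"])
print(req.get_full_url())
```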
- SGLang
How to use 36kevin/SCAU-IGEM-Model with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "36kevin/SCAU-IGEM-Model" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "36kevin/SCAU-IGEM-Model",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker images:

```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
  --model-path "36kevin/SCAU-IGEM-Model" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "36kevin/SCAU-IGEM-Model",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

- Docker Model Runner
How to use 36kevin/SCAU-IGEM-Model with Docker Model Runner:
```shell
docker model run hf.co/36kevin/SCAU-IGEM-Model
```
🤗 Qwen3 Fine-tuned Model Series (QLoRA)
This repository contains multiple variants of Qwen3-based models fine-tuned via QLoRA, including base generative models, Thinking models, and RAG companion models (Embedding + Reranker). All models are developed to support iGEM teams and synthetic biology research groups with functionalities such as experimental protocol assistance, iGEM rule explanations, and competition strategy guidance. They are suitable for dialogue, reasoning, and Retrieval-Augmented Generation (RAG) scenarios.
✅ Overall Evaluation Conclusion: After balancing performance, reasoning quality, and resource consumption, the 4B-parameter base model offers the best overall trade-off and is recommended as the default choice.
📦 Model Overview
| Model Type | Parameters | Description |
|---|---|---|
| Base Models | 0.6B, 1.7B, 4B, 8B, 14B | Standard text generation models for general dialogue and instruction following |
| Thinking Model | 14B | Enables "Chain-of-Thought" capability, suitable for complex reasoning tasks |
| Embedding Model | 0.6B | Used for vector retrieval in RAG (sentence embedding) |
| Reranker Model | 0.6B | Used for re-ranking in RAG (cross-encoder style reranking) |
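The retrieve-then-rerank flow that the Embedding and Reranker models support can be sketched as follows. This is a toy illustration: the hand-written vectors and rerank scores stand in for real Embedding-model and Reranker-model outputs.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy document embeddings (stand-ins for Embedding-model outputs).
docs = {
    "protocol": [0.9, 0.1, 0.0],
    "rules":    [0.1, 0.9, 0.0],
    "strategy": [0.2, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # stand-in for an embedded user query

# Stage 1: vector retrieval — take top-k documents by cosine similarity.
top_k = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)[:2]

# Stage 2: reranking — a cross-encoder scores each (query, document) pair
# jointly; here a hypothetical score table stands in for Reranker outputs.
rerank_scores = {"protocol": 0.95, "rules": 0.40, "strategy": 0.10}
ranked = sorted(top_k, key=lambda d: rerank_scores[d], reverse=True)
print(ranked[0])  # "protocol"
```

The two-stage design is standard for RAG: the cheap embedding similarity narrows the candidate set, and the more expensive cross-encoder reorders only those candidates.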
All models are fine-tuned from the original Qwen3 base weights.
⚙️ Fine-tuning Configuration (QLoRA)
- Quantization: 4-bit (NF4)
- Training Epochs: 4
- Per-device Batch Size: 2
- Gradient Accumulation Steps: 8 (effective batch size = 16)
- Learning Rate Warmup Steps: 4
- LoRA Configuration:
  - rank (r): 8
  - alpha: 256
  - target_modules: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
- Training Framework: transformers + peft + bitsandbytes
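The hyperparameters above map onto the transformers + peft + bitsandbytes stack roughly as follows. This is a sketch, not the authors' training script: only the values listed in this section are taken from the source, and any other arguments (e.g. `task_type`, output paths) are assumptions.

```python
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit NF4 quantization, as stated above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
)

# LoRA configuration from this section.
lora_config = LoraConfig(
    r=8,
    lora_alpha=256,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",  # assumption: causal-LM fine-tuning
)

training_args = TrainingArguments(
    num_train_epochs=4,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,  # effective batch size = 2 * 8 = 16
    warmup_steps=4,
)
```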