Instructions to use TensorMind/TensorMind-0.5B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use TensorMind/TensorMind-0.5B with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="TensorMind/TensorMind-0.5B", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("TensorMind/TensorMind-0.5B", trust_remote_code=True, dtype="auto")
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use TensorMind/TensorMind-0.5B with vLLM:
Install with pip and serve the model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "TensorMind/TensorMind-0.5B"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "TensorMind/TensorMind-0.5B",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```
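The curl request above can also be issued from Python. A minimal sketch using only the standard library, assuming the vLLM server above is running on localhost:8000; the helper name `build_chat_request` is ours, not part of vLLM:

```python
import json
import urllib.request

def build_chat_request(model, user_message, base_url="http://localhost:8000"):
    """Build an OpenAI-compatible chat-completions request for a local vLLM server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("TensorMind/TensorMind-0.5B", "What is the capital of France?")
print(req.full_url)
# With the server running, send it with:
# json.load(urllib.request.urlopen(req))["choices"][0]["message"]["content"]
```

Because the server speaks the OpenAI chat-completions protocol, any OpenAI-compatible client works the same way.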
- SGLang
How to use TensorMind/TensorMind-0.5B with SGLang:
Install with pip and serve the model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "TensorMind/TensorMind-0.5B" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "TensorMind/TensorMind-0.5B",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "TensorMind/TensorMind-0.5B" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "TensorMind/TensorMind-0.5B",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

- Docker Model Runner
How to use TensorMind/TensorMind-0.5B with Docker Model Runner:
```shell
docker model run hf.co/TensorMind/TensorMind-0.5B
```
TensorMind (0.5B)
TensorMind is a 536.9M-parameter causal language model for lightweight Chinese/English text generation.
Model Details
- Architecture: Decoder-only Transformer (`TensorMindForCausalLM`)
- Layers: 32
- Hidden size: 1024
- Heads / KV heads: 16 / 8 (GQA)
- Context length: 32,768
- Vocab size: 32,768
- Positional encoding: RoPE
- Activation: SiLU
- Parameters: 536,941,568 (~0.5B)
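The attention geometry implied by the numbers above can be checked with a little arithmetic. A minimal sketch (variable names are ours; the per-head dimension and GQA projection widths follow from the listed hidden size and head counts):

```python
hidden_size = 1024
num_heads = 16
num_kv_heads = 8  # grouped-query attention: KV heads are shared across query heads

head_dim = hidden_size // num_heads          # dimension of each attention head
q_proj_out = num_heads * head_dim            # query projection output width
kv_proj_out = num_kv_heads * head_dim        # key (and value) projection output width
queries_per_kv = num_heads // num_kv_heads   # query heads served by each KV head

print(head_dim, q_proj_out, kv_proj_out, queries_per_kv)  # 64 1024 512 2
```

Halving the KV heads halves the KV cache per token, which is what makes GQA attractive at the 32,768-token context length listed above.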
Quick Start
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "TensorMind/TensorMind-0.5B"
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    trust_remote_code=True,
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
)

prompt = "请用三句话介绍一下你自己。"  # "Please introduce yourself in three sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
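The `temperature=0.7` in the snippet above sharpens the next-token distribution before sampling: logits are divided by the temperature, so values below 1 concentrate probability on the most likely tokens. A minimal sketch of that scaling, using illustrative logits rather than real model output:

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax over temperature-scaled logits (numerically stabilized)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
p_default = softmax(logits, temperature=1.0)
p_sharp = softmax(logits, temperature=0.7)
# The top token's probability grows as the temperature drops below 1.
print(p_default[0], p_sharp[0])
```

A temperature of 1.0 leaves the distribution unchanged; values approaching 0 approach greedy decoding.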
Benchmark Snapshot
Evaluated 2026-03-07 00:40 (UTC+8); all results are zero-shot (n-shot = 0).
| Model | Params | C-Eval | CMMLU | A-CLUE | TMMLU+ | AGIEval |
|---|---|---|---|---|---|---|
| TensorMind | 0.5B | 27.27 | 25.26 | 25.43 | 24.96 | 33.56 |
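For scale, most of these benchmarks are four-option multiple choice (C-Eval, CMMLU, and TMMLU+ are; AGIEval mixes question formats), so random guessing lands near 25. A minimal sketch comparing the reported scores against that chance baseline:

```python
chance = 100 / 4  # four answer options -> ~25% accuracy from random guessing

scores = {"C-Eval": 27.27, "CMMLU": 25.26, "A-CLUE": 25.43, "TMMLU+": 24.96, "AGIEval": 33.56}
for name, score in scores.items():
    print(f"{name}: {score:.2f} ({score - chance:+.2f} vs. chance)")
```

Apart from AGIEval, the scores sit within a couple of points of chance, which is typical for a 0.5B base model on these suites.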
Intended Use
- Lightweight chat and text generation
- Local experimentation and teaching
- Baseline model for research and fine-tuning
Limitations
- This is a small model and can produce factual errors.
- Benchmark numbers above are from multiple-choice style evaluations and do not fully represent open-ended generation quality.
- Outputs may contain bias or unsafe content; apply filtering for production use.
License
MIT License.

