Man Cub
mancub
·
AI & ML interests
None yet
Recent Activity
- new activity about 16 hours ago — Minachist/Qwen3.6-35B-A3B-INT8-AutoRound: Crashes with newest vllm version (v0.20.1)
- new activity 1 day ago — froggeric/Qwen-Fixed-Chat-Templates: v13 stops dead after the first response
- new activity 3 days ago — froggeric/Qwen-Fixed-Chat-Templates: v11/v12 performance considerations with Claude Code?
Organizations
None yet
Crashes with newest vllm version (v0.20.1)
14 replies · #1 opened 9 days ago by Neiko2002
v13 stops dead after the first response
2 reactions · 2 replies · #14 opened 1 day ago by mancub
v11/v12 performance considerations with Claude Code?
3 replies · #11 opened 3 days ago by mancub
When using Claude Code, tool calls end up broken with this chat template in Qwen3.6-27B
6 replies · #6 opened 5 days ago by mancub
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: float != c10::Half
#10 opened 6 days ago by mancub
Good quant!
12 replies · #1 opened 11 days ago by qenme
INT8 version for TP=2 / dual Ampere GPUs?
1 reaction · #6 opened 7 days ago by mancub
Does not appear to work with the new google drafter MTP model
#2 opened 8 days ago by mancub
Is it supposed to work in vllm?
1 reply · #2 opened 8 days ago by mancub
Avg Draft acceptance rate is low.
17 replies · #2 opened 19 days ago by fouvy
OOM and context limits reached too soon
1 reply · #5 opened 22 days ago by mancub
Unable to run on 3090
1 reply · #1 opened 21 days ago by mancub
How to split this model between 2 (3) GPUs and CPU/RAM ?
30 replies · #12 opened about 2 months ago by mancub
My personal vLLM launch cmd on my old personal 2x3090 workstation
7 replies · #1 opened 2 months ago by tclf90
What was just updated and why?
1 reaction · 2 replies · #1 opened about 1 month ago by mancub
How to use it with llama-server ?
1 reaction · 3 replies · #1 opened about 2 months ago by mancub
Poor performance and pretty lobotomized
2 replies · #1 opened about 2 months ago by mancub
Love the license, confused by some of the decisions.
16 reactions · 15 replies · #15 opened about 2 months ago by CyborgPaloma
It's really good.
1 reaction · 26 replies · #3 opened 3 months ago by Shuasimodo