An open-source version of the AI models built by North ML Open Source (all except some Spark and some Ignite variants).
Nova Devs / Arthur
arthu1
AI & ML interests
11 years old
Built custom architecture (MoE + YaRN)
Trained on $1 GPU budget
50 downloads in 24 hours
Recent Activity
updated a model about 5 hours ago
arthu1/wind-arc-1-5-preview
replied to Crownelius's post about 7 hours ago
[DAY TWO] PROJECT CROWFEATHER - 5/1/2026
Que sera, what will he be?
Step 47,500 of 100,000. Loss hovering around 2.76 on 6.2B tokens. Throughput steady at 87k per second on the A100. Not a GH200, but she gets it done.
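For a rough sense of scale, the numbers above can be turned into a back-of-the-envelope estimate. This is only a sketch: it assumes the 6.2B figure is the number of tokens seen so far, that the throughput figure counts tokens, and that throughput stays constant. The variable names are made up for illustration.

```python
# Figures quoted in the post
steps_done = 47_500
total_steps = 100_000
tokens_seen = 6.2e9          # assumption: tokens processed so far, not dataset size
tokens_per_second = 87_000   # assumption: the "87k per second" figure counts tokens

# Average tokens consumed per optimizer step so far
tokens_per_step = tokens_seen / steps_done

# Tokens and wall-clock time still to go if nothing changes
remaining_tokens = tokens_per_step * (total_steps - steps_done)
remaining_hours = remaining_tokens / tokens_per_second / 3600

print(f"{tokens_per_step:,.0f} tokens per step")
print(f"{remaining_tokens / 1e9:.2f}B tokens to go")
print(f"{remaining_hours:.1f} hours left at constant throughput")
```

Under those assumptions each step covers roughly 130k tokens, and the second half of the run is a little under a day of A100 time.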
Still haven't named him. Scamp has a rascally charm. Quentin sounds like he'd wear a bow tie and think hard before speaking. Taking votes.
Phase two is what's keeping me up. Datasets everywhere and I can't pick. I'm fusing Google and DeepSeek's ideas: Gemma 4's alternating sliding and global attention, DeepSeek V4's Muon optimizer and WSD scheduler, Gemma 2's logit soft cap, and PaLM's z-loss. Sounds like peanut butter on a hamburger, but the loss curve says it works.
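Three of those ingredients are small enough to sketch in plain Python: the logit soft cap, the z-loss, and a warmup-stable-decay (WSD) learning-rate schedule. This is an illustration of the general techniques, not the code behind this run. Every function name and default here (cap=30.0, z_coef=1e-4, peak_lr, warmup_steps, decay_steps, linear decay) is an assumption chosen for the example. Muon and the alternating sliding/global attention pattern are not shown.

```python
import math

def soft_cap(logits, cap=30.0):
    """Squash logits smoothly into [-cap, cap] using cap * tanh(x / cap)."""
    return [cap * math.tanh(x / cap) for x in logits]

def log_partition(logits):
    """Numerically stable log(sum(exp(logits)))."""
    m = max(logits)
    return m + math.log(sum(math.exp(x - m) for x in logits))

def loss_with_z_penalty(logits, target, cap=30.0, z_coef=1e-4):
    """Cross-entropy on soft-capped logits plus the z-loss term z_coef * log(Z)^2."""
    capped = soft_cap(logits, cap)
    log_z = log_partition(capped)
    cross_entropy = log_z - capped[target]
    # The penalty nudges log(Z) toward 0 so the softmax normalizer does not drift
    return cross_entropy + z_coef * log_z ** 2

def wsd_lr(step, total_steps=100_000, peak_lr=3e-4,
           warmup_steps=2_000, decay_steps=10_000, min_lr=0.0):
    """Warmup-stable-decay: linear warmup, flat plateau, linear decay at the end."""
    decay_start = total_steps - decay_steps
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    if step < decay_start:
        return peak_lr
    progress = min(1.0, (step - decay_start) / decay_steps)
    return peak_lr + (min_lr - peak_lr) * progress

# Hypothetical usage: a 4-token vocabulary and the step count from the post
print(loss_with_z_penalty([2.0, 0.5, -1.0, 0.0], target=0))
print(wsd_lr(47_500))
```

With these illustrative defaults, step 47,500 sits on the flat plateau, so the learning rate is still at its peak. The soft cap keeps any single logit from running away, while the z-loss keeps the overall normalizer in check, so the two guard against different kinds of instability.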
Tribe_v2 has real potential but needs more scaffolding than a barn raising before I throw it in. One thing's certain though. This model's gonna be a thinker. Not a Wikipedia parrot. Something that chews before it answers.
Finally got a use for my less popular datasets too. Some Opus-4.5-Writing-Style for polish. A few rows of Human-Archtypes-25k to see what personality bubbles up. Could be a poet, could be a grump. Either beats a flimsy fine-tune.
The bank's after my credit card. Until then, full steam.
Next model gets graphs. I swear.
-Shane
published a model 1 day ago
arthu1/wind-edge-1.6-sft