LumenSyntax/instrument-trap-benchmark
Cross-family replication of the Logos epistemological classifier on NVIDIA's Nemotron Mini 4B architecture. The results support the claim that epistemological fine-tuning replicates across model families.
| Metric | Score |
|---|---|
| Behavioral accuracy | 95.7% (CI: 92.7% to 97.5%) |
| Identity collapse | 0% |
| Fabrication | 0% |
| False approval | 1.3% |

Scores across model families:

| Model | Family | Score |
|---|---|---|
| logos-auditor (9B) | Google Gemma 2 | 97.3% |
| logos14 (4B) | NVIDIA Nemotron | 95.7% |
| logos16v2 (1.6B) | Stability AI StableLM 2 | 93.0% |
No statistically significant difference between Nemotron and StableLM: χ² = 1.88, p = 0.170.
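The reported statistics can be sanity-checked with a few lines of standard-library Python. The sketch below is illustrative only and is not the authors' evaluation code. It assumes the chi-square test has 1 degree of freedom (two models, correct vs. incorrect), and it uses a Wilson score interval as one common way to put a confidence interval on an accuracy. The card does not state the sample size or the interval method, so the counts `287` and `300` are hypothetical placeholders.

```python
import math


def chi2_p_value_df1(chi2: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # With df = 1 the statistic is a squared standard normal, so
    # P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z ~ 1.96 for ~95%)."""
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# chi2 = 1.88 is the reported statistic; df = 1 is an assumption.
p = chi2_p_value_df1(1.88)
print(f"p = {p:.3f}")

# HYPOTHETICAL counts: the card does not give the number of test items.
lo, hi = wilson_interval(successes=287, n=300)
print(f"accuracy = {287 / 300:.3f}, interval = [{lo:.4f}, {hi:.4f}]")
```

Under the 1-degree-of-freedom assumption the first line prints a p-value of about 0.170, in line with the figure quoted above. The interval depends entirely on the placeholder counts and should be recomputed with the real number of benchmark items.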
Logos is an epistemological classifier, not a chatbot. It evaluates whether claims cross epistemological boundaries. The model is fine-tuned, not prompted: its behavioral constraints emerge from training.
This model requires approved access. Request access using the form above and describe your intended use case.
This model is part of the evidence for "The Instrument Trap" (DOI: 10.5281/zenodo.18716474).
License: Apache 2.0 (inherited from the base model, nvidia/Nemotron-Mini-4B-Instruct)
Base model: nvidia/Nemotron-Mini-4B-Instruct