Active filters: modelopt
nvidia/Llama-3.1-8B-Instruct-NVFP4
5B • Updated • 232k
• 10
Text Generation
• 5B • Updated • 78.6k
• 17
Text Generation
• 8B • Updated • 22.9k
• 5
Text Generation
• 15B • Updated • 2.22k
• 5
Text Generation
• 17B • Updated • 59.2k
• 16
nvidia/Qwen2.5-VL-7B-Instruct-FP8
Text Generation
• 8B • Updated • 1.54k
• 8
nvidia/Qwen2.5-VL-7B-Instruct-NVFP4
Text Generation
• 5B • Updated • 262k
• 15
nuphoto-ian/Qwen3-8B-QAT-NVFP4
5B • Updated
txn545/Qwen3-Coder-30B-A3B-Instruct-NVFP4
16B • Updated • 7
• 1
shanjiaz/gpt-oss-120b-nvfp4-modelopt
59B • Updated • 48
• 4
shanjiaz/gpt-oss-20b-nvfp4-modelopt
11B • Updated • 284
• 1
nvidia/Llama-3.1-Nemotron-Nano-VL-8B-V1-FP4-QAD
Image-Text-to-Text
• 6B • Updated • 534
• 14
baseten-admin/glm-4.6-fp4
177B • Updated • 1
baseten-admin/glm-4.6-fp8
353B • Updated • 1
baseten-admin/glm-4.6-fp4-mlp
183B • Updated • 17
shinedays1993/Qwen3-30B-A3B-nvfp4
16B • Updated
shinedays1993/Qwen3-32B-nvfp4
17B • Updated
Beambutbetter/Deepseek-V2-Lite-16B-NVFP4
Text Generation
• 8B • Updated • 1
• 3
ramblingpolymath/Qwen3-4B-Instruct-2507
2B • Updated • 2
literid/Qwen3-Coder-480B-A35B-Instruct_nvfp4_kv_fp8
241B • Updated • 2
DevQuasar/DeepSeek-R1-Distill-Llama-8B_nvfp4
Text Generation
• 5B • Updated • 48
DevQuasar/Qwen.Qwen3-4B-Thinking-2507_nvfp4
Text Generation
• 2B • Updated • 187
177B • Updated • 7
• 6
JeiganS/ML2-123B-Magnum-Diamond_fp8
Text Generation
• 123B • Updated • 1
guerilla7/Foundation-Sec-8B-Instruct-NVFP4-quantized
5B • Updated
jiangchengchengNLP/L3.3-MS-Nevoria-70b-NVFP4-ONLY-MLP
42B • Updated
jiangchengchengNLP/L3.3-MS-Nevoria-70b-NVFP4_A8
36B • Updated
johnnyeric/Huihui-Qwen3-30B-A3B-Instruct-2507-abliterated-fp4
16B • Updated • 1
jiangchengchengNLP/L3.3-MS-Nevoria-70b-NVFP4_AWQ
36B • Updated
Ex0bit/Qwen3-VLTO-32B-Instruct-NVFP4
Text Generation
• 17B • Updated • 75
• 1