SketchVLM: Vision language models can annotate images to explain thoughts and guide users Paper • 2604.22875 • Published 20 days ago • 35
FinMCP-Bench: Benchmarking LLM Agents for Real-World Financial Tool Use under the Model Context Protocol Paper • 2603.24943 • Published Mar 26 • 12
RealMaster: Lifting Rendered Scenes into Photorealistic Video Paper • 2603.23462 • Published Mar 24 • 33
RealRestorer: Towards Generalizable Real-World Image Restoration with Large-Scale Image Editing Models Paper • 2603.25502 • Published Mar 26 • 57
SIMART: Decomposing Monolithic Meshes into Sim-ready Articulated Assets via MLLM Paper • 2603.23386 • Published Mar 24 • 40
SpecEyes: Accelerating Agentic Multimodal LLMs via Speculative Perception and Planning Paper • 2603.23483 • Published Mar 24 • 62
Qianfan-OCR: A Unified End-to-End Model for Document Intelligence Paper • 2603.13398 • Published Mar 11 • 154
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8 Text Generation • 124B • Updated 13 days ago • 372k • 244
Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2-GGUF Image-Text-to-Text • 9B • Updated Apr 6 • 149k • 343
FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance Paper • 2305.05176 • Published May 9, 2023 • 7
meta-llama/Llama-3.1-8B-Instruct Text Generation • 8B • Updated Sep 25, 2024 • 9.55M • • 5.82k