Hanxun Yu

JonnyYu828

https://hanxunyu.github.io/

AI & ML interests

Multimodal LLMs, Spatial Intelligence, Embodied AI

authored 3 papers 3 months ago

N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models

Paper • 2512.16561 • Published Dec 18, 2025 • 20

StreamingAssistant: Efficient Visual Token Pruning for Accelerating Online Video Understanding

Paper • 2512.12560 • Published Dec 14, 2025

Physical Adversarial Attack meets Computer Vision: A Decade Survey

Paper • 2209.15179 • Published Sep 30, 2022

submitted a paper to Daily Papers 3 months ago

VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration

Paper • 2601.22674 • Published Jan 30, 2026 • 5

authored 2 papers 3 months ago

Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuning

Paper • 2503.00513 • Published Mar 1, 2025 • 2

VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration

Paper • 2601.22674 • Published Jan 30, 2026 • 5