
Hanxun Yu

JonnyYu828
https://hanxunyu.github.io/

AI & ML interests

Multimodal LLMs, Spatial Intelligence, Embodied AI

Organizations

Zhejiang University

authored 3 papers 3 months ago

N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models

Paper • 2512.16561 • Published Dec 18, 2025 • 20

StreamingAssistant: Efficient Visual Token Pruning for Accelerating Online Video Understanding

Paper • 2512.12560 • Published Dec 14, 2025

Physical Adversarial Attack meets Computer Vision: A Decade Survey

Paper • 2209.15179 • Published Sep 30, 2022
submitted a paper to Daily Papers 3 months ago

VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration

Paper • 2601.22674 • Published Jan 30 • 5
authored 2 papers 3 months ago

Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuning

Paper • 2503.00513 • Published Mar 1, 2025 • 2

VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration

Paper • 2601.22674 • Published Jan 30 • 5