Models
- facebook/dinov3-vit7b16-pretrain-lvd1689m
  Image Feature Extraction • 7B params • 11.3k downloads • 226 likes
- facebook/dinov3-vits16-pretrain-lvd1689m
  Image Feature Extraction • 21.6M params • 149k downloads • 88 likes
- facebook/dinov3-convnext-small-pretrain-lvd1689m
  Image Feature Extraction • 49.5M params • 18.7k downloads • 26 likes
- facebook/dinov3-vitb16-pretrain-lvd1689m
  Image Feature Extraction • 85.7M params • 1.04M downloads • 123 likes
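The checkpoints above are feature extractors and can be used directly. A minimal sketch, assuming the checkpoints load through the standard transformers Auto classes; the model ID is taken from the list above, and the image path is illustrative:

```python
# Minimal sketch: image feature extraction with one of the listed checkpoints.
# Assumes the checkpoint loads via the standard transformers Auto classes;
# "example.jpg" is an illustrative placeholder.
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

model_id = "facebook/dinov3-vits16-pretrain-lvd1689m"
processor = AutoImageProcessor.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

image = Image.open("example.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state holds one token per image patch plus special tokens;
# the first (CLS) token is commonly used as a global image descriptor.
features = outputs.last_hidden_state[:, 0]
print(features.shape)  # e.g. torch.Size([1, 384]) for a ViT-S/16 backbone
```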
Collections including paper arXiv:2508.10104
Collection 1
- LocalMamba: Visual State Space Model with Windowed Selective Scan (arXiv:2403.09338, 8 upvotes)
- GiT: Towards Generalist Vision Transformer through Universal Language Interface (arXiv:2403.09394, 26 upvotes)
- Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers (arXiv:2402.19479, 35 upvotes)
- Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection (arXiv:2405.10300, 31 upvotes)
Collection 2
- ViViT: A Video Vision Transformer (arXiv:2103.15691, 4 upvotes)
- DINO-Foresight: Looking into the Future with DINO (arXiv:2412.11673, 1 upvote)
- Scaling Behavior Cloning Improves Causal Reasoning: An Open Model for Real-Time Video Game Playing (arXiv:2601.04575, 12 upvotes)
- Learning Long-Context Diffusion Policies via Past-Token Prediction (arXiv:2505.09561)
Collection 3
- Attention Is All You Need (arXiv:1706.03762, 122 upvotes)
- Scaling Laws for Neural Language Models (arXiv:2001.08361, 10 upvotes)
- Training Compute-Optimal Large Language Models (arXiv:2203.15556, 11 upvotes)
- Analogy Generation by Prompting Large Language Models: A Case Study of InstructGPT (arXiv:2210.04186)
Collection 4
- EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters (arXiv:2402.04252, 30 upvotes)
- Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models (arXiv:2402.03749, 15 upvotes)
- ScreenAI: A Vision-Language Model for UI and Infographics Understanding (arXiv:2402.04615, 44 upvotes)
- EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss (arXiv:2402.05008, 23 upvotes)
Collection 5
- CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data (arXiv:2404.15653, 29 upvotes)
- MoDE: CLIP Data Experts via Clustering (arXiv:2404.16030, 15 upvotes)
- MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning (arXiv:2405.12130, 50 upvotes)
- Reducing Transformer Key-Value Cache Size with Cross-Layer Attention (arXiv:2405.12981, 33 upvotes)
Collection 6
- A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale (arXiv:2309.06497, 7 upvotes)
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits (arXiv:2402.17764, 629 upvotes)
- Llama 2: Open Foundation and Fine-Tuned Chat Models (arXiv:2307.09288, 251 upvotes)