Performance vs the original architecture on approximate original data sizes (BooksCorpus/Wikipedia)
#54
by tollefj - opened
There's a tremendous difference between the amount of data used to pre-train ModernBERT and the amount used for the original BERT models (1.7T tokens vs. 3.3B words). How much of the performance gain comes from the larger, more comprehensive data sources rather than from the architecture itself? Or have I missed some details about this in the paper?
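To put the gap quoted above in rough numbers, here is a minimal back-of-the-envelope sketch. It uses only the two figures from the post; the `tokens_per_word` conversion factor is a hypothetical assumption (words and tokens are not the same unit, and the real ratio depends on the tokenizer), so treat the result as an order-of-magnitude estimate only.

```python
# Figures quoted in the post above.
MODERNBERT_TOKENS = 1.7e12  # 1.7T tokens
BERT_WORDS = 3.3e9          # 3.3B words


def scale_ratio(tokens_per_word: float = 1.0) -> float:
    """Return how many times larger the ModernBERT figure is.

    tokens_per_word is a hypothetical conversion factor (an assumption,
    not a measured value); 1.0 treats every word as exactly one token.
    """
    return MODERNBERT_TOKENS / (BERT_WORDS * tokens_per_word)


if __name__ == "__main__":
    # Try a few assumed conversion factors to see how sensitive the ratio is.
    for factor in (1.0, 1.3, 1.5):
        print(f"tokens/word={factor}: ~{scale_ratio(factor):.0f}x more data")
```

Under any of these assumed factors the difference stays in the hundreds, which is why separating the effect of data scale from the effect of the architecture seems worth asking about.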