BYOL Māori 4B

This model was produced by the BYOL framework for extending LLMs to low-resource languages.

Model Description

This is a merged language model for Māori (mri) that combines the language knowledge acquired during continual pre-training with the instruction-following capabilities from supervised fine-tuning. It was produced with the BYOL framework by merging the BYOL Māori 4B CPT and BYOL Māori 4B IT checkpoints back into the original Gemma 3 instruction-tuned model.
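The exact merging recipe is described in the paper; as a rough illustration only, checkpoint merging of this kind is often done with linear task arithmetic, where the weight deltas of each fine-tuned checkpoint relative to the base model are scaled and added back onto the base. The sketch below uses plain Python floats standing in for model tensors, with hypothetical mixing coefficients `alpha` and `beta`; it is not the BYOL implementation.

```python
# Hedged sketch of linear task-arithmetic merging (illustrative only; the
# actual BYOL merging procedure is described in the paper). Each "state dict"
# here is a plain dict of floats standing in for model weight tensors.

def task_arithmetic_merge(base, cpt, it, alpha=0.5, beta=0.5):
    """Merge two fine-tuned checkpoints back into a base model:

        merged = base + alpha * (cpt - base) + beta * (it - base)

    alpha/beta are hypothetical mixing coefficients, not values from the paper.
    """
    merged = {}
    for name, w in base.items():
        merged[name] = w + alpha * (cpt[name] - w) + beta * (it[name] - w)
    return merged

# Toy example with scalar "weights":
base = {"layer.weight": 1.0}
cpt = {"layer.weight": 1.4}   # hypothetical continual-pretraining checkpoint
it = {"layer.weight": 0.8}    # hypothetical instruction-tuned checkpoint
merged = task_arithmetic_merge(base, cpt, it)
print(merged)  # the CPT delta (+0.4) and IT delta (-0.2) are each halved
```

With equal coefficients, the merged weight keeps half of each checkpoint's deviation from the base, which is the intuition behind combining language knowledge (CPT) with instruction-following ability (IT) in a single model.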

This is the recommended model for most users. It supports chat/instruction-following and has the strongest overall performance on Māori benchmarks (see the paper for evaluation results).

Usage

First install the latest Transformers release:

pip install -U transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "ai-for-good-lab/byol-mri-4b-merged"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", dtype=torch.bfloat16)

# Chat inference
messages = [{"role": "user", "content": "Kōrerotia mai mō Aotearoa."}]  # "Tell me about Aotearoa (New Zealand)."
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True, return_dict=True).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Citation

@article{zamir2026byolbringlanguagellms,
    title={BYOL: Bring Your Own Language Into LLMs},
    author={Syed Waqas Zamir and Wassim Hamidouche and Boulbaba Ben Amor and Luana Marotti and Inbal Becker-Reshef and Juan Lavista Ferres},
    year={2026},
    journal={arXiv:2601.10804},
    url={https://arxiv.org/abs/2601.10804},
}