BYOL Māori 4B

This model was produced by the BYOL framework for extending LLMs to low-resource languages.

Model Description

This is a merged language model for Māori (mri) that combines the language knowledge acquired during continual pre-training with the instruction-following capabilities from supervised fine-tuning. It was produced with the BYOL framework by merging the BYOL Māori 4B CPT and BYOL Māori 4B IT checkpoints back into the original Gemma 3 instruction-tuned model.
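The exact merging recipe is described in the paper; as a rough illustration only, checkpoint merging of this kind is often done with linear task arithmetic, where the weight deltas of each fine-tuned checkpoint relative to the base model are scaled and added back onto the base. The sketch below uses plain Python floats standing in for model tensors, with hypothetical mixing coefficients `alpha` and `beta`; it is not the BYOL implementation.

```python
# Hedged sketch of linear task-arithmetic merging (illustrative only; the
# actual BYOL merging procedure is described in the paper). Each "state dict"
# here is a plain dict of floats standing in for model weight tensors.

def task_arithmetic_merge(base, cpt, it, alpha=0.5, beta=0.5):
    """Merge two fine-tuned checkpoints back into a base model:

        merged = base + alpha * (cpt - base) + beta * (it - base)

    alpha/beta are hypothetical mixing coefficients, not values from the paper.
    """
    merged = {}
    for name, w in base.items():
        merged[name] = w + alpha * (cpt[name] - w) + beta * (it[name] - w)
    return merged

# Toy example with scalar "weights":
base = {"layer.weight": 1.0}
cpt = {"layer.weight": 1.4}   # hypothetical continual-pretraining checkpoint
it = {"layer.weight": 0.8}    # hypothetical instruction-tuned checkpoint
merged = task_arithmetic_merge(base, cpt, it)
print(merged)  # the CPT delta (+0.4) and IT delta (-0.2) are each halved
```

With equal coefficients, the merged weight keeps half of each checkpoint's deviation from the base, which is the intuition behind combining language knowledge (CPT) with instruction-following ability (IT) in a single model.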

This is the recommended model for most users. It supports chat/instruction-following and has the strongest overall performance on Māori benchmarks (see the paper for evaluation results).

Usage

First install the latest Transformers release:

pip install -U transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "ai-for-good-lab/byol-mri-4b-merged"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", dtype=torch.bfloat16)

# Chat inference
messages = [{"role": "user", "content": "Kōrerotia mai mō Aotearoa."}]  # "Tell me about Aotearoa (New Zealand)."
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True, return_dict=True).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Citation

@article{zamir2026byolbringlanguagellms,
    title={BYOL: Bring Your Own Language Into LLMs},
    author={Syed Waqas Zamir and Wassim Hamidouche and Boulbaba Ben Amor and Luana Marotti and Inbal Becker-Reshef and Juan Lavista Ferres},
    year={2026},
    journal={arXiv:2601.10804},
    url={https://arxiv.org/abs/2601.10804},
}