BYOL
Part of the BYOL collection: LLMs for Chichewa and Māori built by following the BYOL recipe (https://arxiv.org/abs/2601.10804).
This model was produced by the BYOL framework for extending LLMs to low-resource languages.

This is an instruction-tuned (SFT) language model for Chichewa (nya). It was created by applying supervised fine-tuning on top of the BYOL Chichewa 4b continued pre-training (CPT) checkpoint, using instruction-following data (SmolTalk2 + AYA) translated into Chichewa via the BYOL framework.

This is an intermediate checkpoint used to produce the merged model. For best results, use the merged variant instead, which combines the language knowledge from the CPT stage with the instruction-following ability of this model.
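For intuition on what "merged" means here, a common baseline is a linear interpolation of the CPT and SFT parameters. The sketch below is illustrative only: BYOL's actual merging recipe is described in the paper and may differ, and both the CPT model id and the interpolation weight are assumptions, not confirmed values.

from transformers import AutoModelForCausalLM

# Illustrative sketch: linear weight interpolation between CPT and SFT checkpoints.
cpt = AutoModelForCausalLM.from_pretrained("ai-for-good-lab/byol-nya-4b-cpt")  # assumed id
sft = AutoModelForCausalLM.from_pretrained("ai-for-good-lab/byol-nya-4b-it")

alpha = 0.5  # assumed weight; balances CPT language knowledge vs. SFT instruction following
sft_state = sft.state_dict()
merged = {name: (1 - alpha) * p + alpha * sft_state[name] for name, p in cpt.state_dict().items()}
sft.load_state_dict(merged)
sft.save_pretrained("byol-nya-4b-merged-sketch")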
pip install -U transformers accelerate
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model_id = "ai-for-good-lab/byol-nya-4b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" places the model on available GPU(s); this requires accelerate
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", dtype=torch.bfloat16)
# Chat inference; the prompt means "Tell me about the country of Malawi."
messages = [{"role": "user", "content": "Tandiuzeni za dziko la Malawi."}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True, return_dict=True).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
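Note that decoding outputs[0] in full echoes the prompt before the reply. To print only the model's continuation, slice off the prompt tokens; the sampling settings below are illustrative defaults, not values recommended by the BYOL authors.

prompt_len = inputs["input_ids"].shape[-1]
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.9)
# Decode only the newly generated tokens, skipping the echoed prompt
print(tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True))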
@article{zamir2026byolbringlanguagellms,
  title={BYOL: Bring Your Own Language Into LLMs},
  author={Syed Waqas Zamir and Wassim Hamidouche and Boulbaba Ben Amor and Luana Marotti and Inbal Becker-Reshef and Juan Lavista Ferres},
  journal={arXiv preprint arXiv:2601.10804},
  year={2026},
  url={https://arxiv.org/abs/2601.10804},
}
Base model: google/gemma-3-4b-pt