BYOL Chichewa 4B IT

This model was produced by the BYOL framework for extending LLMs to low-resource languages.

Model Description

This is an instruction-tuned (SFT) language model for Chichewa (nya). It was created by applying supervised fine-tuning on top of the BYOL Chichewa 4B CPT checkpoint, using translated instruction-following data (SmolTalk2 + AYA) generated via the BYOL framework.

This is an intermediate checkpoint used to produce the merged model. For best results, use the merged variant instead, which combines the language knowledge from CPT with the instruction-following ability from this model.
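"Merging" here refers to combining the CPT and SFT checkpoints parameter-wise. As a minimal sketch, one common recipe is linear interpolation of the two state dicts; note this is an illustrative assumption, not necessarily the exact merge method BYOL uses:

```python
def merge_state_dicts(cpt, sft, alpha=0.5):
    """Linearly interpolate two state dicts with matching keys:
    merged = (1 - alpha) * cpt + alpha * sft.
    (Illustrative recipe; the actual BYOL merge may differ.)"""
    assert cpt.keys() == sft.keys(), "checkpoints must share parameter names"
    return {name: (1 - alpha) * cpt[name] + alpha * sft[name] for name in cpt}

# Toy example with scalar "parameters" standing in for weight tensors:
cpt = {"layer.weight": 0.2}   # hypothetical CPT weight
sft = {"layer.weight": 0.8}   # hypothetical SFT weight
merged = merge_state_dicts(cpt, sft, alpha=0.5)
```

In practice the same loop runs over real `torch` tensors loaded from both checkpoints, with `alpha` trading off language knowledge (CPT) against instruction following (SFT).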

Usage

Install the latest transformers:

```shell
pip install -U transformers
```

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "ai-for-good-lab/byol-nya-4b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", dtype=torch.bfloat16)

# Chat inference ("Tandiuzeni za dziko la Malawi." = "Tell me about the country of Malawi.")
messages = [{"role": "user", "content": "Tandiuzeni za dziko la Malawi."}]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True, return_dict=True
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Citation

```bibtex
@article{zamir2026byolbringlanguagellms,
    title={BYOL: Bring Your Own Language Into LLMs},
    author={Syed Waqas Zamir and Wassim Hamidouche and Boulbaba Ben Amor and Luana Marotti and Inbal Becker-Reshef and Juan Lavista Ferres},
    year={2026},
    journal={arXiv preprint arXiv:2601.10804},
    url={https://arxiv.org/abs/2601.10804},
}
```