A newer version of this model is available: pvlabs/Chytrej1.5-90M-Base

Chytrej1-90M-Base

The first model in the Chytrej series. A fully custom pretrained language model built from scratch on the LLaMA architecture.

Chytrej (Czech slang for "clever" or "smart") is a long-term model series by PingVortex Labs. Every model in the series is pretrained from scratch; a model may then be instruction fine-tuned on top of its own custom base. The ongoing goal: every release must at least know the capital of France.

Built by PingVortex Labs.

Model Details

Parameters: 90M
Context length: 8,192 tokens
Language: English only
Format: base model
Architecture: LLaMA
License: Apache 2.0

Benchmarks

Evaluated with lm-eval-harness, 0-shot:

Task	Metric	Score
ARC-Easy	acc	39.73%
ARC-Easy	acc_norm	34.47%
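
The 0-shot ARC-Easy run can likely be reproduced through the harness's Python API. The sketch below is an assumption about how the evaluation was invoked, not the authors' actual command: it assumes lm-evaluation-harness v0.4 or later (`pip install lm-eval`) and network access to the Hub. `build_eval_kwargs` and `run_eval` are hypothetical helper names.

```python
# Sketch: 0-shot ARC-Easy evaluation with lm-evaluation-harness (v0.4+ assumed).
# The helper names below are illustrative, not part of any library.

MODEL_ID = "pvlabs/Chytrej1-90M-Base"


def build_eval_kwargs(model_id: str = MODEL_ID) -> dict:
    """Collect the arguments passed to lm_eval.simple_evaluate."""
    return {
        "model": "hf",                          # Hugging Face transformers backend
        "model_args": f"pretrained={model_id}",
        "tasks": ["arc_easy"],
        "num_fewshot": 0,                       # 0-shot, matching the table
    }


def run_eval(model_id: str = MODEL_ID) -> dict:
    """Download the model, run the task, and return the ARC-Easy metrics."""
    import lm_eval  # imported lazily so build_eval_kwargs works without it

    results = lm_eval.simple_evaluate(**build_eval_kwargs(model_id))
    return results["results"]["arc_easy"]
```

Calling `run_eval()` returns the metric dictionary for the task. Scores may differ slightly from the table depending on harness version, batch size, and dtype.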

Usage

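from transformers import LlamaForCausalLM, PreTrainedTokenizerFast

# Load the pretrained weights and tokenizer from the Hugging Face Hub
model = LlamaForCausalLM.from_pretrained("pvlabs/Chytrej1-90M-Base")
tokenizer = PreTrainedTokenizerFast.from_pretrained("pvlabs/Chytrej1-90M-Base")

# This is a base model: give it text to continue, not an instruction
prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

# repetition_penalty discourages repeated tokens in the continuation
outputs = model.generate(**inputs, max_new_tokens=100, repetition_penalty=1.3)

# Example response: The capital of France is the city of Paris...
print(tokenizer.decode(outputs[0], skip_special_tokens=True))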
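
For longer prompts, the 8,192-token context length listed in Model Details has to cover both the prompt and the newly generated tokens. The following is a minimal sketch of one way to budget for that; `max_prompt_tokens` and `generate_within_context` are hypothetical helpers, and the `model` and `tokenizer` arguments are assumed to be the objects loaded in the usage example.

```python
CONTEXT_LENGTH = 8192  # context length stated in Model Details


def max_prompt_tokens(max_new_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many prompt tokens fit once room for generation is reserved."""
    if max_new_tokens < 0:
        raise ValueError("max_new_tokens must be non-negative")
    if max_new_tokens >= context_length:
        raise ValueError("max_new_tokens leaves no room for a prompt")
    return context_length - max_new_tokens


def generate_within_context(model, tokenizer, prompt: str, max_new_tokens: int = 100) -> str:
    """Truncate the prompt to the token budget, then generate a continuation."""
    limit = max_prompt_tokens(max_new_tokens)
    # With default settings, truncation drops tokens from the right end of the prompt.
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=limit)
    outputs = model.generate(
        **inputs, max_new_tokens=max_new_tokens, repetition_penalty=1.3
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

With `max_new_tokens=100`, the prompt budget is 8,092 tokens.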

The Chytrej Series plan:

Fully custom pretrained base models at various scales
Instruction fine-tuned variants only on top of our base models
Every release must know the capital of France, as a basic check that it has learned some factual knowledge. There may be some exceptions.
No fine-tuned versions of existing models; everything is trained from scratch
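
The "capital of France" rule could be turned into a simple automated check on generated text. The source does not describe how the check is actually performed, so the sketch below is only one possible interpretation; `knows_capital_of_france` is a hypothetical helper that looks for "Paris" in the continuation of the prompt.

```python
PROMPT = "The capital of France is"


def knows_capital_of_france(generated_text: str, prompt: str = PROMPT) -> bool:
    """Return True if the text after the prompt mentions Paris (case-insensitive)."""
    # Strip the echoed prompt so only the model's continuation is inspected.
    if generated_text.startswith(prompt):
        continuation = generated_text[len(prompt):]
    else:
        continuation = generated_text
    return "paris" in continuation.lower()
```

Passing the decoded output of `model.generate` to this function gives a pass/fail result per release. A substring match is deliberately lenient; it does not verify that the surrounding sentence is correct.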

Made by PingVortex.

Safetensors
Model size: 89.1M params
Tensor type: BF16
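
As a rough guide to the download and memory footprint, the weight size follows from the 89.1M parameter count and the BF16 tensor type (2 bytes per parameter). This is a back-of-the-envelope sketch only: it ignores file metadata, activations, and the KV cache, and `weights_size_mib` is a hypothetical helper.

```python
NUM_PARAMS = 89.1e6       # "89.1M params" from the Safetensors metadata
BYTES_PER_BF16 = 2        # bfloat16 stores each value in 16 bits


def weights_size_mib(num_params: float, bytes_per_param: int = BYTES_PER_BF16) -> float:
    """Estimate the size of the raw weights in MiB."""
    return num_params * bytes_per_param / (1024 ** 2)


print(f"{weights_size_mib(NUM_PARAMS):.1f} MiB")
```

This comes to roughly 170 MiB of weights. Loading in FP32 instead would double the estimate.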
