billm-mistral-7b-conll03-ner

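This model is a fine-tuned version of mistralai/Mistral-7B-v0.1 for token classification (named entity recognition). The auto-generated card lists the training dataset as unknown; the model name and the label set used in the inference example suggest CoNLL-2003. It achieves the following results on the evaluation set, which match the final epoch (10) in the training results table: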

Loss: 0.2046
Precision: 0.9273
Recall: 0.9393
F1: 0.9333
Accuracy: 0.9864
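
As a quick consistency check on the numbers above, the sketch below recomputes F1 from the reported precision and recall. It assumes the card's F1 is the usual harmonic mean of the two; the card itself does not state how the metrics were computed.

```python
# Values copied from the evaluation results above.
precision = 0.9273
recall = 0.9393

# F1 as the harmonic mean of precision and recall
# (assumption: this is how the reported F1 was derived).
f1 = 2 * precision * recall / (precision + recall)

print(f"F1 = {f1:.4f}")
```

The result rounds to the reported 0.9333, so the three figures are mutually consistent.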

Inference

python -m pip install -U billm==0.1.1

from transformers import AutoTokenizer, pipeline
from peft import PeftModel, PeftConfig
from billm import MistralForTokenClassification

# BIO tag set (PER, ORG, LOC, MISC), as used in CoNLL-2003
label2id = {'O': 0, 'B-PER': 1, 'I-PER': 2, 'B-ORG': 3, 'I-ORG': 4, 'B-LOC': 5, 'I-LOC': 6, 'B-MISC': 7, 'I-MISC': 8}
id2label = {v: k for k, v in label2id.items()}
model_id = 'WhereIsAI/billm-mistral-7b-conll03-ner'
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Read the adapter config to find the base model it was trained on
peft_config = PeftConfig.from_pretrained(model_id)
# Load the base model with a token-classification head
model = MistralForTokenClassification.from_pretrained(
    peft_config.base_model_name_or_path,
    num_labels=len(label2id), id2label=id2label, label2id=label2id
)
# Attach the fine-tuned PEFT adapter weights
model = PeftModel.from_pretrained(model, model_id)
# merge_and_unload is necessary for inference
model = model.merge_and_unload()

# aggregation_strategy="simple" merges sub-word tokens into whole entity spans
token_classifier = pipeline("token-classification", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
sentence = "I live in Hong Kong. I am a student at Hong Kong PolyU."
tokens = token_classifier(sentence)
print(tokens)
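
With `aggregation_strategy="simple"`, the pipeline returns a list of dicts with the keys `entity_group`, `score`, `word`, `start`, and `end`. The sketch below shows one way such output might be post-processed. The helper `group_entities`, its `min_score` parameter, and the sample predictions are all hypothetical: the sample is hand-written to match the output format and is not real output from this model.

```python
from collections import defaultdict


def group_entities(entities, min_score=0.0):
    """Group aggregated pipeline predictions by entity type.

    Hypothetical helper: keeps predictions whose score is at least
    `min_score` and collects their surface text per entity label.
    """
    grouped = defaultdict(list)
    for ent in entities:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)


# Hand-written sample in the pipeline's output format.
# The labels and scores are made up for illustration, NOT model output.
sample = [
    {"entity_group": "LOC", "score": 0.99, "word": "Hong Kong", "start": 10, "end": 19},
    {"entity_group": "ORG", "score": 0.42, "word": "Hong Kong PolyU", "start": 39, "end": 54},
]

print(group_entities(sample, min_score=0.5))
```

In practice `sample` would be replaced by the `tokens` list from the inference snippet, and the threshold chosen to suit the application.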

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0002
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 10
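
To illustrate what `lr_scheduler_type: linear` implies for these settings, here is a minimal sketch of a linear decay schedule. It assumes zero warmup steps (the card does not report any) and takes the total step count of 17560 from the training results table; `linear_lr` is a hypothetical helper, not the trainer's own code.

```python
BASE_LR = 2e-4        # learning_rate from the list above
STEPS_PER_EPOCH = 1756
TOTAL_STEPS = 17560   # 10 epochs x 1756 steps, per the training results table
WARMUP_STEPS = 0      # assumption: warmup is not reported in the card


def linear_lr(step, base_lr=BASE_LR, total=TOTAL_STEPS, warmup=WARMUP_STEPS):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


for epoch in (0, 5, 10):
    print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the rate starts at 0.0002, falls to half that value by the end of epoch 5, and reaches zero at the final step.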

Training results

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Accuracy
0.0499	1.0	1756	0.1085	0.9196	0.9287	0.9241	0.9845
0.0233	2.0	3512	0.0997	0.9249	0.9226	0.9237	0.9845
0.0097	3.0	5268	0.1343	0.9292	0.9386	0.9339	0.9870
0.0036	4.0	7024	0.1651	0.9245	0.9386	0.9315	0.9864
0.0012	5.0	8780	0.1839	0.9257	0.9373	0.9315	0.9863
0.0005	6.0	10536	0.2027	0.9258	0.9386	0.9321	0.9864
0.0002	7.0	12292	0.2022	0.9276	0.9384	0.9330	0.9864
0.0002	8.0	14048	0.2040	0.9274	0.9388	0.9331	0.9864
0.0001	9.0	15804	0.2048	0.9270	0.9393	0.9331	0.9864
0.0001	10.0	17560	0.2046	0.9273	0.9393	0.9333	0.9864
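
The table can be scanned programmatically to compare checkpoints. The sketch below copies the epoch, validation loss, and F1 columns from the table above and finds the epoch with the highest F1 and the one with the lowest validation loss; it is an illustrative analysis, not part of the original training script.

```python
# (epoch, validation_loss, f1) copied from the training results table.
rows = [
    (1, 0.1085, 0.9241),
    (2, 0.0997, 0.9237),
    (3, 0.1343, 0.9339),
    (4, 0.1651, 0.9315),
    (5, 0.1839, 0.9315),
    (6, 0.2027, 0.9321),
    (7, 0.2022, 0.9330),
    (8, 0.2040, 0.9331),
    (9, 0.2048, 0.9331),
    (10, 0.2046, 0.9333),
]

best_f1_epoch = max(rows, key=lambda r: r[2])[0]
lowest_loss_epoch = min(rows, key=lambda r: r[1])[0]

print("best F1 at epoch", best_f1_epoch)
print("lowest validation loss at epoch", lowest_loss_epoch)
```

F1 peaks at epoch 3 (0.9339) while validation loss is lowest at epoch 2 (0.0997) and roughly doubles by epoch 10, even though F1 stays within about 0.001 of its peak. The headline results reported at the top of the card are those of epoch 10.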

Framework versions

PEFT 0.9.0
Transformers 4.38.2
Pytorch 2.0.1
Datasets 2.16.0
Tokenizers 0.15.0

Citation

@inproceedings{li2024bellm,
    title = "BeLLM: Backward Dependency Enhanced Large Language Model for Sentence Embeddings",
    author = "Li, Xianming and Li, Jing",
    booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics",
    year = "2024",
    publisher = "Association for Computational Linguistics"
}

@article{li2023label,
  title={Label supervised llama finetuning},
  author={Li, Zongxi and Li, Xianming and Liu, Yuzhang and Xie, Haoran and Li, Jing and Wang, Fu-lee and Li, Qing and Zhong, Xiaoqin},
  journal={arXiv preprint arXiv:2310.01208},
  year={2023}
}
