How to use Mit1208/phi-2-universal-NER with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM
base_model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")
model = PeftModel.from_pretrained(base_model, "Mit1208/phi-2-universal-NER")

This model is a fine-tuned version of microsoft/phi-2 on the Universal-NER/Pile-NER-type dataset.
This model shows the power of small language models. Phi-2 can be fine-tuned on the free version of Google Colab, and the process is simple. The whole model could not be fine-tuned on free Colab, so I used PEFT.
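To illustrate why PEFT fits where full fine-tuning does not, here is a minimal sketch assuming a LoRA-style adapter. The source does not state which PEFT method or settings were used, so the rank below is hypothetical, and the hidden size of 2560 is taken from Phi-2's published configuration rather than from this card.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters a LoRA adapter adds to one d_out x d_in weight matrix."""
    # LoRA learns two low-rank factors: A (rank x d_in) and B (d_out x rank).
    return rank * (d_in + d_out)


def full_param_count(d_in: int, d_out: int) -> int:
    """Trainable parameters when the same weight matrix is fine-tuned directly."""
    return d_in * d_out


HIDDEN_SIZE = 2560  # assumption: Phi-2 hidden size, not stated in this card
RANK = 8            # hypothetical LoRA rank, chosen only for illustration

lora = lora_param_count(HIDDEN_SIZE, HIDDEN_SIZE, RANK)
full = full_param_count(HIDDEN_SIZE, HIDDEN_SIZE)
print(f"LoRA: {lora:,} trainable parameters vs. full: {full:,} per square projection")
```

Under these assumptions the adapter trains well under 1% of the parameters of each adapted matrix, which is what keeps gradient and optimizer memory small enough for a free Colab GPU.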
Because this model is fine-tuned from Phi-2 on the UniversalNER dataset, both licenses apply. The Phi-2 license has changed to MIT, but UniversalNER is still under a research license, so this model can be used for research purposes only.
I fine-tuned for only 5 epochs. The fine-tuning and inference notebook is available at:
https://github.com/mit1280/fined-tuning/blob/main/phi_2_fine_tune_using_PEFT%2Binference.ipynb
The following hyperparameters were used during training:
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, StoppingCriteria
import torch

# Load the Phi-2 base model, then attach the fine-tuned PEFT adapter.
base_model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2", device_map="auto", trust_remote_code=True)
model = PeftModel.from_pretrained(base_model, "Mit1208/phi-2-universal-NER", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("Mit1208/phi-2-universal-NER", trust_remote_code=True)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# The passage goes in the first turn; the entity question goes in the last turn.
conversations = [
    {"from": "human", "value": "Text: Mit Patel here from India"},
    {"from": "gpt", "value": "I've read this text."},
    {"from": "human", "value": "what is a name of the person in the text?"},
]
# Render the chat template, then open the model's turn so generation continues from it.
inference_text = tokenizer.apply_chat_template(conversations, tokenize=False) + '<|im_start|>gpt:\n'
inputs = tokenizer(inference_text, return_tensors="pt", return_attention_mask=False).to(device)

class EosListStoppingCriteria(StoppingCriteria):
    """Stop generation once the output ends with the <|im_end|> token sequence."""
    def __init__(self, eos_sequence=tokenizer.encode("<|im_end|>")):
        self.eos_sequence = eos_sequence

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        # Compare the tail of each generated sequence against the end-of-turn tokens.
        last_ids = input_ids[:, -len(self.eos_sequence):].tolist()
        return self.eos_sequence in last_ids

outputs = model.generate(**inputs, max_length=512, pad_token_id=tokenizer.eos_token_id,
                         stopping_criteria=[EosListStoppingCriteria()])
text = tokenizer.batch_decode(outputs)[0]
print(text)
# Output
'''
<|im_start|>human
Text: Mit Patel here from India<|im_end|>
<|im_start|>gpt
I've read this text.<|im_end|>
<|im_start|>human
what is a name of the person in the text?<|im_end|>
<|im_start|>gpt:
["Mit Patel"]<|im_end|>
'''
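The decoded text above contains the whole conversation, so the entity list still has to be pulled out of the final turn. The following is a minimal sketch of one way to do that, assuming the output always follows the `<|im_start|>role ... <|im_end|>` layout shown above and that the answer is a JSON list. The helper name `extract_entities` is hypothetical and not part of the model or any library.

```python
import json
import re

# One chat turn as it appears in the decoded output: <|im_start|>role\ncontent<|im_end|>
TURN_RE = re.compile(r"<\|im_start\|>([^\n]*)\n(.*?)<\|im_end\|>", re.DOTALL)


def extract_entities(decoded: str) -> list:
    """Return the entity list from the last completed gpt turn, or [] if unavailable."""
    turns = TURN_RE.findall(decoded)
    if not turns:
        return []
    role, content = turns[-1]
    # If generation was cut off, the last completed turn is the human question.
    if not role.startswith("gpt"):
        return []
    try:
        parsed = json.loads(content.strip())
    except json.JSONDecodeError:
        return []
    return parsed if isinstance(parsed, list) else []
```

With the `text` variable from the inference code, `extract_entities(text)` would return the parsed list. Falling back to an empty list on truncated or malformed output is a design choice for this sketch; raising an error may suit stricter pipelines better.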