Model description
This is a baseline model for cross-lingual named entity linking, trained on the single-out topic split of the SlavicNER corpus.
Resources and Technical Documentation
- Paper: Cross-lingual Named Entity Corpus for Slavic Languages, published at LREC-COLING 2024 (https://aclanthology.org/2024.lrec-main.369)
- Annotation guidelines: https://arxiv.org/pdf/2404.00482
- SlavicNER Corpus: https://github.com/SlavicNLP/SlavicNER
Evaluation
Per-language results of the seq2seq model (Support = number of evaluation instances):
| Language | Seq2seq | Support |
|---|---|---|
| PL | 75.13 | 2 549 |
| CS | 77.92 | 1 137 |
| RU | 67.56 | 18 018 |
| BG | 63.60 | 6 085 |
| SL | 76.81 | 7 082 |
| UK | 58.94 | 3 085 |
| All | 68.75 | 37 956 |
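As a sanity check, the "All" row can be approximately reproduced from the per-language rows: the supports sum to 37,956, and the support-weighted mean of the per-language scores lands close to the reported overall score (the exact aggregation in the paper may differ slightly, e.g. micro-averaging over individual instances):

```python
# Per-language (score, support) pairs copied from the table above.
rows = {
    "PL": (75.13, 2549),
    "CS": (77.92, 1137),
    "RU": (67.56, 18018),
    "BG": (63.60, 6085),
    "SL": (76.81, 7082),
    "UK": (58.94, 3085),
}

total_support = sum(support for _, support in rows.values())
weighted = sum(score * support for score, support in rows.values()) / total_support

print(total_support)       # 37956, matching the "All" row
print(round(weighted, 2))  # close to the reported overall score of 68.75
```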
Usage
You can use this model directly with a pipeline for text2text generation:
```python
from transformers import pipeline

model_name = "SlavicNLP/slavicner-linking-single-out-large"
pipe = pipeline("text2text-generation", model_name)

texts = ["pl:Polsce", "cs:Velké Británii", "bg:българите", "ru:Великобританию",
         "sl:evropske komisije", "uk:Європейського агентства лікарських засобів"]

outputs = pipe(texts)
ids = [o["generated_text"] for o in outputs]
print(ids)
# ['GPE-Poland', 'GPE-Great-Britain', 'GPE-Bulgaria', 'GPE-Great-Britain',
#  'ORG-European-Commission', 'ORG-EMA-European-Medicines-Agency']
```
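The generated identifiers are structured: following the SlavicNER annotation scheme, each one starts with an entity category prefix (e.g. GPE, ORG) followed by a cross-lingual entity ID, so mentions of the same entity in different languages map to the same string. A minimal post-processing sketch over the example outputs above (the grouping helper is illustrative, not part of the model):

```python
from collections import defaultdict

# Example inputs ("<lang>:<mention>") and the identifiers returned for them.
texts = ["pl:Polsce", "cs:Velké Británii", "bg:българите", "ru:Великобританию",
         "sl:evropske komisije", "uk:Європейського агентства лікарських засобів"]
ids = ["GPE-Poland", "GPE-Great-Britain", "GPE-Bulgaria", "GPE-Great-Britain",
       "ORG-European-Commission", "ORG-EMA-European-Medicines-Agency"]

# Group mentions by identifier; the prefix before the first "-" is the category.
linked = defaultdict(list)
for text, ent_id in zip(texts, ids):
    lang, mention = text.split(":", 1)
    linked[ent_id].append((lang, mention))

# Mentions of the same entity across languages share one identifier:
print(linked["GPE-Great-Britain"])
# [('cs', 'Velké Británii'), ('ru', 'Великобританию')]
```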
Citation
@inproceedings{piskorski-etal-2024-cross-lingual,
title = "Cross-lingual Named Entity Corpus for {S}lavic Languages",
author = "Piskorski, Jakub and
Marci{\'n}czuk, Micha{\l} and
Yangarber, Roman",
editor = "Calzolari, Nicoletta and
Kan, Min-Yen and
Hoste, Veronique and
Lenci, Alessandro and
Sakti, Sakriani and
Xue, Nianwen",
booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
month = may,
year = "2024",
address = "Torino, Italy",
publisher = "ELRA and ICCL",
url = "https://aclanthology.org/2024.lrec-main.369",
pages = "4143--4157",
abstract = "This paper presents a corpus manually annotated with named entities for six Slavic languages {---} Bulgarian, Czech, Polish, Slovenian, Russian,
and Ukrainian. This work is the result of a series of shared tasks, conducted in 2017{--}2023 as a part of the Workshops on Slavic Natural
Language Processing. The corpus consists of 5,017 documents on seven topics. The documents are annotated with five classes of named entities.
Each entity is described by a category, a lemma, and a unique cross-lingual identifier. We provide two train-tune dataset splits
{---} single topic out and cross topics. For each split, we set benchmarks using a transformer-based neural network architecture
with the pre-trained multilingual models {---} XLM-RoBERTa-large for named entity mention recognition and categorization,
and mT5-large for named entity lemmatization and linking.",
}
Contact
Michał Marcińczuk (marcinczuk@gmail.com)