StyleECU
StyleECU is a style embedding model for Spanish, obtained by fine-tuning mStyleDistance on SynthSTEL-ES, a purpose-built Spanish contrastive dataset of 51,400 triplets covering 71 stylistic dimensions.
Model Description
StyleECU specializes the mStyleDistance embedding space toward stylistic phenomena most relevant to Spanish, including dialectal variation (voseo/tuteo), expressive morphology, syntactic complexity, and digital style.
Training
- Base model: StyleDistance/mstyledistance
- Training objective: TripletLoss (contrastive learning)
- Dataset: cespinr/SynthSTEL-ES
- Training size: 51,400 triplets
- Epochs: 2
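As a rough illustration of the training objective listed above, the sketch below computes a triplet loss for one (anchor, positive, negative) example. The card does not state the distance metric or the margin used, so the Euclidean distance and the `margin` values here are assumptions chosen for illustration, and the helper names `euclidean` and `triplet_loss` are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=5.0):
    """Triplet loss for one example: max(d(a, p) - d(a, n) + margin, 0).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is. Distance metric and margin
    are assumptions; the model card does not specify them.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(gap + margin, 0.0)


# Toy 2-dimensional vectors standing in for sentence embeddings.
# Positive (same style) is close, negative (different style) is far.
satisfied = triplet_loss([0.0, 0.0], [0.0, 1.0], [0.0, 10.0], margin=5.0)
# Positive is farther than the negative, so the triplet is penalized.
violated = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0], margin=1.0)
```

In a triplet such as those in SynthSTEL-ES, the anchor and positive would share a style while the negative differs, so minimizing this quantity pulls same-style texts together in the embedding space.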
Usage
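from sentence_transformers import SentenceTransformer
# Load the fine-tuned model from the Hugging Face Hub
model = SentenceTransformer("cespinr/StyleECU")
# Encode a list of Spanish texts; returns one embedding vector per input text
embeddings = model.encode(["Tu texto aquí"])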
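Once texts are encoded, a common way to compare them is cosine similarity between their embeddings. The sketch below assumes cosine similarity is an appropriate comparison for this model, which the card does not state; it uses toy vectors in place of real `model.encode(...)` output, and the helper names `cosine_similarity` and `rank_by_similarity` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, candidates):
    """Return (index, similarity) pairs sorted from most to least similar to `query`."""
    scored = [(i, cosine_similarity(query, c)) for i, c in enumerate(candidates)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors standing in for real style embeddings.
query = [1.0, 0.0, 0.0]
candidates = [[0.0, 1.0, 0.0], [2.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
ranking = rank_by_similarity(query, candidates)
```

With real data, `query` and `candidates` would be rows of the array returned by `model.encode`, converted to lists or passed directly as sequences of floats.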
Evaluation
Evaluated on PAN author profiling tasks (Spanish):
| Task | Base (mStyleDistance) | StyleECU | Δ |
|---|---|---|---|
| PAN 2018 – Gender prediction | baseline (score not reported) | baseline + 3 pp | +3 pp |
| PAN 2021 – Hate speech spreaders | 0.70 | 0.81 | +11 pp |
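The Δ column is expressed in percentage points (pp), which is an absolute difference and not a relative improvement. The short sketch below shows the arithmetic for the PAN 2021 row, assuming the reported scores are proportions in [0, 1]; the card does not name the metric, and the function names are hypothetical.

```python
def percentage_point_gain(base_score, new_score):
    """Absolute difference between two proportions, in percentage points."""
    return round((new_score - base_score) * 100, 1)


def relative_gain_percent(base_score, new_score):
    """Improvement relative to the base score, in percent."""
    return round((new_score - base_score) / base_score * 100, 1)


# PAN 2021 row from the table above: 0.70 (base) vs. 0.81 (StyleECU).
pp = percentage_point_gain(0.70, 0.81)
rel = relative_gain_percent(0.70, 0.81)
```

Here `pp` matches the +11 pp in the table, while `rel` gives the same gain as a fraction of the base score, which is a larger number and should not be confused with it.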
Authors
César Espín-Riofrio — Researcher, University of Guayaquil, Ecuador & SINAI, University of Jaén, Spain | Director, Research Project FCI-036-2023, University of Guayaquil, Ecuador
Arturo Montejo-Ráez — Researcher, SINAI, University of Jaén, Spain
Steven Ramírez-Gurumendi and Gabriel Delgado-Gómez, University of Guayaquil, Ecuador (Research Project FCI-036-2023)
Citation
The accompanying paper is under review, and this citation will be updated upon publication. Until then, if you use this model, please cite:
@misc{espinriofrio2026stylecu,
author = {Espín-Riofrio, César and Montejo-Ráez, Arturo and
Ramírez-Gurumendi, Steven and Delgado-Gómez, Gabriel},
title = {StyleECU: A Spanish Style Embedding Model},
year = {2026},
url = {https://huggingface.co/cespinr/StyleECU}
}