Instructions to use emanjavacas/MacBERTh with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use emanjavacas/MacBERTh with Transformers:
```python
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("emanjavacas/MacBERTh", dtype="auto")
```
- Notebooks
- Google Colab
- Kaggle
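Building on the loading call above, the following is a minimal sketch of how sentence vectors might be obtained from the model. It rests on assumptions not stated on this page: that the repository also ships tokenizer files loadable with `AutoTokenizer`, and that mean pooling over the last hidden state is an acceptable sentence representation. The pooling choice and the helper name `embed_sentences` are illustrative only and are not taken from the MacBERTh publications.

```python
MODEL_ID = "emanjavacas/MacBERTh"


def embed_sentences(sentences, model_id=MODEL_ID):
    """Return one mean-pooled vector per input sentence (hypothetical helper)."""
    # Heavy dependencies are imported lazily so that defining this helper
    # does not require torch/transformers or trigger any download.
    import torch
    from transformers import AutoModel, AutoTokenizer

    # Assumption: the model repository provides tokenizer files.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()

    # Pad to the longest sentence and truncate to the model's maximum length.
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # [n_sentences, seq_len, hidden]

    # Mean-pool over real tokens only; the attention mask zeroes out padding.
    mask = batch["attention_mask"].unsqueeze(-1).to(hidden.dtype)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)
```

Calling `embed_sentences(["Thou art a scholar; speak to it, Horatio."])` downloads the weights on first use and returns a tensor with one row per sentence. Because the architecture is described as BERT base uncased, the tokenizer is expected to lowercase its input, so casing differences in historical spellings would not be preserved.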
Commit 3019f53 · Parent(s): d709ce9
Update README.md
README.md CHANGED
@@ -1 +1,20 @@
# MacBERTh

This model is a historical language model for English from the [MacBERTh project](https://macberth.netlify.app/).

The architecture is based on BERT base uncased from the original BERT pre-training codebase.
The training material comes from several sources, including:

- EEBO
- ECCO
- COHA
- CLMET3.1
- EVANS
- Hansard Corpus

with a total of approximately 3.9B tokens.

Details and evaluation can be found in the accompanying publications:
- [MacBERTh: Development and Evaluation of a Historically Pre-trained Language Model for English (1450-1950)](https://aclanthology.org/2021.nlp4dh-1.4/)
- [Adapting vs. Pre-training Language Models for Historical Languages](https://doi.org/10.46298/jdmdh.9152)