Tags: Token Classification · MLX · openmed · openai_privacy_filter · apple-silicon · pii · de-identification · privacy-filter · multilingual

OpenMed Privacy Filter (Multilingual): MLX 8-bit

A native MLX port of OpenMed/privacy-filter-multilingual for fast, on-device fine-grained PII detection across 54 categories and 16 languages on Apple Silicon. This 8-bit affine-quantized artifact reduces download size and resident memory; for the full-precision sibling see OpenMed/privacy-filter-multilingual-mlx.

Family at a glance: same architecture and training data, three runtimes: the PyTorch base checkpoint OpenMed/privacy-filter-multilingual, the full-precision MLX port OpenMed/privacy-filter-multilingual-mlx, and this 8-bit MLX quantization.

What it does

The model is a token classifier built on the OpenAI Privacy Filter architecture (openai_privacy_filter). It tags each token with a BIOES label across 54 PII span classes, then a Viterbi pass over the BIOES grammar yields clean entity spans. Languages covered: Arabic, Bengali, Chinese, Dutch, English, French, German, Hindi, Italian, Japanese, Korean, Portuguese, Spanish, Telugu, Turkish, Vietnamese.
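As an illustration of the BIOES grouping step, a simplified greedy decoder (not the pipeline's actual Viterbi pass, which first repairs inconsistent tag sequences) can be sketched like this:

```python
def bioes_to_spans(tags):
    """Group a BIOES tag sequence into (label, start, end) token spans.

    Greedy sketch for illustration; the real pipeline runs Viterbi over
    the BIOES grammar before grouping.
    """
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = label = None
            continue
        prefix, cls = tag.split("-", 1)
        if prefix == "S":                      # single-token entity
            spans.append((cls, i, i + 1))
            start = label = None
        elif prefix == "B":                    # span begins
            start, label = i, cls
        elif prefix == "E" and label == cls:   # span ends
            spans.append((cls, start, i + 1))
            start = label = None
        # "I" with a matching open span just continues it
    return spans

tags = ["O", "B-FIRSTNAME", "E-FIRSTNAME", "O", "S-EMAIL"]
print(bioes_to_spans(tags))  # [('FIRSTNAME', 1, 3), ('EMAIL', 4, 5)]
```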

Full label schema (217 BIOES labels)

The output space is O plus B-, I-, E-, S- for each of the 54 span classes (4 × 54 + 1 = 217). The runtime PrivacyFilterMLXPipeline runs Viterbi over this BIOES grammar, so the consumer sees clean grouped entities rather than raw token tags. The full id2label mapping is shipped alongside the weights in this repo.
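The arithmetic behind the 217-label space can be written out directly (the class names here are illustrative placeholders, not the repo's id2label order):

```python
# Build a BIOES output space: one "O" label plus B-/I-/E-/S- variants
# of every span class, giving 4 * n + 1 labels (4 * 54 + 1 = 217 here).
def bioes_labels(classes):
    return ["O"] + [f"{p}-{c}" for c in classes for p in ("B", "I", "E", "S")]

demo = bioes_labels(["FIRSTNAME", "EMAIL", "SSN"])  # 3 classes -> 13 labels
print(len(demo), demo[:5])
full = bioes_labels([f"CLASS_{i}" for i in range(54)])
print(len(full))  # 217
```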

For per-label accuracy, training recipe, and dataset details, see the base PyTorch checkpoint.

Architecture

| Field | Value |
|---|---|
| Source model type | openai_privacy_filter |
| Source architecture | OpenAIPrivacyFilterForTokenClassification |
| Hidden size | 640 |
| Transformer layers | 8 |
| Attention | Grouped-Query (14 query heads / 2 KV heads, head_dim=64) with attention sinks |
| FFN | Sparse Mixture-of-Experts: 128 experts, top-4 routing, SwiGLU |
| Position encoding | YARN-scaled RoPE (rope_theta=150_000, factor=32) |
| Context length | 131,072 tokens (4,096 before scaling) |
| Tokenizer | o200k_base (tiktoken), vocab 200,064 |
| Output head | Linear(640 → 217) with bias |
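As a quick sanity check, the classifier head's parameter count follows directly from the dimensions in the table:

```python
# Linear(640 -> 217) with bias: a weight matrix plus one bias per label.
hidden_size, num_labels = 640, 217
head_params = hidden_size * num_labels + num_labels
print(head_params)  # 139097
```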

File set

| File | Size | Purpose |
|---|---|---|
| weights.safetensors | ~1.4 GB | Model weights in OpenMed-MLX layout |
| config.json | ~19 KB | Model + MLX runtime config |
| id2label.json | ~5 KB | Numeric ID → BIOES label string |
| openmed-mlx.json | ~1 KB | OpenMed MLX manifest (task, family, runtime hints) |
| tokenizer.json, tokenizer_config.json | ~28 MB | Source tokenizer files (kept for reference) |

The MLX runtime uses tiktoken o200k_base directly for tokenization; the tokenizer.json is kept so consumers can inspect or re-tokenize via transformers if desired.

Label space (54 categories)

| Category | Typical examples |
|---|---|
| Identity | FIRSTNAME, MIDDLENAME, LASTNAME, PREFIX, AGE, GENDER, SEX, EYECOLOR, HEIGHT, USERNAME, OCCUPATION, JOBTITLE, JOBDEPARTMENT, ORGANIZATION, USERAGENT |
| Contact | EMAIL, PHONE, URL |
| Address | STREET, BUILDINGNUMBER, SECONDARYADDRESS, CITY, COUNTY, STATE, ZIPCODE, GPSCOORDINATES, ORDINALDIRECTION |
| Dates & time | DATE, DATEOFBIRTH, TIME |
| Government IDs | SSN |
| Financial | ACCOUNTNAME, BANKACCOUNT, IBAN, BIC, CREDITCARD, CREDITCARDISSUER, CVV, PIN, MASKEDNUMBER, AMOUNT, CURRENCY, CURRENCYCODE, CURRENCYNAME, CURRENCYSYMBOL |
| Crypto | BITCOINADDRESS, ETHEREUMADDRESS, LITECOINADDRESS |
| Vehicle | VIN, VRM |
| Digital | IPADDRESS, MACADDRESS, IMEI |
| Auth | PASSWORD |

Quick start

With OpenMed (recommended)

OpenMed gives you a single extract_pii() / deidentify() API that auto-selects MLX on Apple Silicon and PyTorch elsewhere, so the same code runs on every host.

pip install -U "openmed[mlx]"
from openmed import extract_pii, deidentify

text = (
    "Patient Sarah Johnson (DOB 03/15/1985), phone 415-555-0123, email sarah.johnson@example.com."
)

# Extract grouped entity spans (runs on MLX here, PyTorch fallback elsewhere)
result = extract_pii(text, model_name="OpenMed/privacy-filter-multilingual-mlx-8bit")
for ent in result.entities:
    print(f"{ent.label:30s} {ent.text!r}  conf={ent.confidence:.2f}")

# De-identify
masked = deidentify(text, method="mask",
                    model_name="OpenMed/privacy-filter-multilingual-mlx-8bit")
fake   = deidentify(
    text,
    method="replace",
    model_name="OpenMed/privacy-filter-multilingual-mlx-8bit",
    consistent=True,
    seed=42,   # deterministic locale-aware Faker surrogates
)

When MLX isn't available (Linux, Windows, Intel Mac, missing mlx package), this exact same call automatically falls back to the PyTorch checkpoint OpenMed/privacy-filter-multilingual with a one-time warning. Family-aware fallback: a Multilingual MLX request never substitutes an unrelated baseline.
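A hypothetical sketch of that runtime selection (the actual openmed dispatch logic may differ; this only illustrates the platform-plus-import check described above):

```python
import importlib.util
import platform

def select_runtime():
    """Pick MLX on Apple Silicon when the mlx package is importable,
    otherwise fall back to PyTorch. Illustrative sketch, not openmed's code.
    """
    on_apple_silicon = (
        platform.system() == "Darwin" and platform.machine() == "arm64"
    )
    if on_apple_silicon and importlib.util.find_spec("mlx") is not None:
        return "mlx"
    return "pytorch"

print(select_runtime())  # "mlx" on an M-series Mac with mlx installed
```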

Direct MLX usage (lower-level)

from huggingface_hub import snapshot_download
from openmed.mlx.inference import PrivacyFilterMLXPipeline

model_path = snapshot_download("OpenMed/privacy-filter-multilingual-mlx-8bit")
pipe = PrivacyFilterMLXPipeline(model_path)

print(pipe("Email me at alice.smith@example.com after 5pm."))
# [{'entity_group': 'EMAIL',
#   'score': 0.92,
#   'word': 'alice.smith@example.com',
#   'start': 12,
#   'end': 35}]

The pipeline returns a list of dicts with entity_group, score, word, start, and end (character offsets into the input string).
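Those character offsets make masking straightforward; a minimal sketch over the dict format shown above (the helper name is ours, not part of the pipeline):

```python
def mask_entities(text, entities, template="[{label}]"):
    """Replace each detected span with a placeholder, working right-to-left
    so earlier character offsets stay valid as the string changes length."""
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = template.format(label=ent["entity_group"])
        text = text[: ent["start"]] + placeholder + text[ent["end"]:]
    return text

text = "Email me at alice.smith@example.com after 5pm."
entities = [{"entity_group": "EMAIL", "score": 0.92,
             "word": "alice.smith@example.com", "start": 12, "end": 35}]
print(mask_entities(text, entities))  # Email me at [EMAIL] after 5pm.
```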

Hardware notes

  • Designed for Apple Silicon (M-series GPUs); CPU inference works but is slower.
  • Tested on macOS with mlx>=0.18. The MLX runtime in this repo is independent of mlx_lm (token classification, not causal LM).
  • Lower latency / smaller memory than the BF16 sibling.
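The 8-bit affine scheme behind those savings stores integer codes plus a scale and zero point per group, reconstructing w ≈ scale * q + zero_point. A toy round trip (illustrative only; MLX's actual group-wise layout and rounding differ):

```python
def affine_quantize(values, bits=8):
    """Map floats to unsigned integer codes via w ~= scale * q + zero_point."""
    qmax = (1 << bits) - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / qmax or 1.0          # avoid zero scale for constants
    q = [round((v - lo) / scale) for v in values]
    return q, scale, lo                      # lo acts as the zero point

def affine_dequantize(q, scale, zero_point):
    return [scale * x + zero_point for x in q]

w = [-0.5, 0.0, 0.25, 1.0]
q, scale, zp = affine_quantize(w)
err = max(abs(a - b) for a, b in zip(w, affine_dequantize(q, scale, zp)))
print(f"max roundtrip error: {err:.5f}")     # bounded by about scale / 2
```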

Credits & Acknowledgements

This artifact wouldn't exist without two open-source releases; sincere thanks to both teams.

Additional thanks to Apple for MLX and the HuggingFace team for the model-distribution ecosystem.

License

Apache 2.0.
