Update `return_mask` param
Files changed:
- preprocessor_config.json +1 -1
- tokenizer_config.json +2 -1
preprocessor_config.json
CHANGED
@@ -16251,6 +16251,6 @@
   "padding_side": "right",
   "padding_value": 0.0,
   "processor_class": "WhisperProcessor",
-  "return_attention_mask":
+  "return_attention_mask": false,
   "sampling_rate": 16000
 }
tokenizer_config.json
CHANGED
@@ -19,9 +19,10 @@
   },
   "errors": "replace",
   "model_max_length": 1024,
-  "name_or_path": "openai/whisper-
+  "name_or_path": "openai/whisper-small",
   "pad_token": null,
   "processor_class": "WhisperProcessor",
+  "return_attention_mask": false,
   "special_tokens_map_file": null,
   "tokenizer_class": "WhisperTokenizer",
   "unk_token": {
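After this change both files carry `"return_attention_mask": false`. The following is a minimal sketch, not part of the commit, of how one might confirm that the two configs agree on the flag. It uses only the standard library and works on fragments reproduced from the added and context lines of the diff; every key not shown in the diff is omitted, and the helper names `read_flag` and `flags_agree` are hypothetical.

```python
import json

# Fragments reproduced from the post-change side of the diff.
# Keys that do not appear in the diff are omitted here (assumption:
# the real files contain many more entries).
PREPROCESSOR_FRAGMENT = """
{
  "padding_side": "right",
  "padding_value": 0.0,
  "processor_class": "WhisperProcessor",
  "return_attention_mask": false,
  "sampling_rate": 16000
}
"""

TOKENIZER_FRAGMENT = """
{
  "errors": "replace",
  "model_max_length": 1024,
  "name_or_path": "openai/whisper-small",
  "pad_token": null,
  "processor_class": "WhisperProcessor",
  "return_attention_mask": false,
  "special_tokens_map_file": null,
  "tokenizer_class": "WhisperTokenizer"
}
"""


def read_flag(config_text, key="return_attention_mask"):
    """Return the value stored under `key`, or None if the key is absent."""
    return json.loads(config_text).get(key)


def flags_agree(*config_texts, key="return_attention_mask"):
    """Return True when every config holds the same value for `key`.

    A config that lacks the key counts as None, so it disagrees with
    a config that sets the key explicitly.
    """
    values = [read_flag(text, key) for text in config_texts]
    return len(set(values)) == 1


if __name__ == "__main__":
    print("preprocessor:", read_flag(PREPROCESSOR_FRAGMENT))
    print("tokenizer:   ", read_flag(TOKENIZER_FRAGMENT))
    print("consistent:  ", flags_agree(PREPROCESSOR_FRAGMENT, TOKENIZER_FRAGMENT))
```

To run the same check against a local checkout, the fragments could be replaced with the text of the actual `preprocessor_config.json` and `tokenizer_config.json` files.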