GigaSpeechBench: A Real-World Multilingual Speech-to-Text Benchmark Paper • 2606.28884 • Published 6 days ago
GigaSpeech Series Collection Evolving, Large-Scale, and Multi-domain ASR Corpus • 6 items • Updated 7 days ago
UAT: Unified Audio-Text Diffusion for Audio Generation, Editing, and Captioning Paper • 2606.04939 • Published 30 days ago
Evaluating the Expressive Appropriateness of Speech in Rich Contexts Paper • 2605.09413 • Published May 10 • 5
WavCube: Unifying Speech Representation for Understanding and Generation via Semantic-Acoustic Joint Modeling Paper • 2605.06407 • Published May 7
SPEAR: A Unified SSL Framework for Learning Speech and Audio Representations Paper • 2510.25955 • Published Oct 29, 2025 • 1
OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models Paper • 2604.10866 • Published Apr 13 • 69
Representation-Regularized Convolutional Audio Transformer for Audio Understanding Paper • 2601.21612 • Published Jan 29 • 1