AVoCaDO: An Audiovisual Video Captioner Driven by Temporal Orchestration Paper • 2510.10395 • Published Oct 12 • 30
RealUnify: Do Unified Models Truly Benefit from Unification? A Comprehensive Benchmark Paper • 2509.24897 • Published Sep 29 • 46
Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction Paper • 2410.21169 • Published Oct 28, 2024 • 30
MRAMG-Bench: A BeyondText Benchmark for Multimodal Retrieval-Augmented Multimodal Generation Paper • 2502.04176 • Published Feb 6