BidirLM: From Text to Omnimodal Bidirectional Encoders by Adapting and Composing Causal LLMs Paper • 2604.02045 • Published 9 days ago • 28
BidirLM-Embedding Collection BidirLM is a family of 5 frontier bidirectional encoders, including an omnimodal variant at 2.5B. • 6 items • Updated 3 days ago • 1
Article How we OCR'ed 30,000 papers using Codex, open OCR models and Jobs 3 days ago • 38
Article Introducing Cohere-transcribe: state-of-the-art speech recognition 15 days ago • 36
MolmoWeb-Data Collection This is the collection of all datasets in MolmoWebMix. • 6 items • Updated 17 days ago • 23
GLiNER-relex Collection Zero-shot joint NER and relation extraction models • 4 items • Updated 23 days ago • 2
Mistral Small 4 Collection A state-of-the-art model, open-weight, with a granular Mixture-of-Experts architecture that fuses instruct, reasoning and agentic skills. • 3 items • Updated 25 days ago • 64
Recovered in Translation: Efficient Pipeline for Automated Translation of Benchmarks and Datasets Paper • 2602.22207 • Published Feb 25 • 43
Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI +4 Feb 20 • 501