Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published Jan 14, 2025
Public Domain 12M: A Highly Aesthetic Image-Text Dataset with Novel Governance Mechanisms Paper • 2410.23144 • Published Oct 30, 2024