Embedded MinerU document extraction demo
Scrape a website and download its content as Markdown
Convert PDFs to a Hugging Face dataset