BBox coordinates in PaddleOCR-VL JSON don’t match PDF crop — how to correctly map/convert coordinates?
Hi, thanks for releasing PaddleOCR-VL — the parsing quality is great.
When I parse a PDF with PaddleOCR-VL (Model A), the output JSON includes bounding boxes (bbox). However, when I try to crop the PDF using those bbox coordinates (via pdfplumber), the cropped regions do not match the actual positions of the objects (such as tables and figures).
What coordinate system does PaddleOCR-VL use for bbox in the JSON output?
This is how I call the PaddleOCR-VL pipeline:
    from paddleocr import PaddleOCRVL

    pipeline = PaddleOCRVL(
        pipeline_version="v1",
        device="gpu:0",
        use_layout_detection=True,
        use_doc_orientation_classify=True,
        use_doc_unwarping=True,
    )
Maybe use_doc_unwarping=True is causing this.
Is there an official / recommended way to convert PaddleOCR-VL bbox values to PDF page coordinates for accurate cropping when use_doc_unwarping is enabled?
I hope to hear back from you.
Thank you.
Hi @sogm1, when use_doc_unwarping is set to True, the image pixels are shifted, so the output coordinates no longer correspond to the original image. You need to set use_doc_unwarping to False.
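For anyone who still sees an offset after turning unwarping off, the remaining gap is often just a unit mismatch: pdfplumber works in PDF points, while a rendered page image is measured in pixels. Below is a minimal sketch of that rescaling, not an official PaddleOCR utility. It assumes the bbox is [x0, y0, x1, y1] in pixels of the rendered page image with a top-left origin, and that you know the size of that image; the function name and arguments are hypothetical.

```python
def pixel_bbox_to_pdf_points(bbox, image_size, page_size):
    """Scale an [x0, y0, x1, y1] pixel box to PDF points.

    Assumptions (not confirmed by the PaddleOCR-VL docs in this thread):
    - bbox is in pixels of the rendered page image, origin at top-left
    - image_size is (width, height) of that rendered image in pixels
    - page_size is (width, height) of the PDF page in points
    """
    img_w, img_h = image_size
    page_w, page_h = page_size

    # Separate scale factors per axis, in case the render was not uniform
    sx = page_w / img_w
    sy = page_h / img_h

    x0, y0, x1, y1 = bbox

    # Clamp to the page so rounding never pushes the box outside it
    left = min(max(x0 * sx, 0.0), page_w)
    top = min(max(y0 * sy, 0.0), page_h)
    right = min(max(x1 * sx, 0.0), page_w)
    bottom = min(max(y1 * sy, 0.0), page_h)

    return (left, top, right, bottom)


# Example: a 612 x 792 pt page rendered to a 1700 x 2200 px image
crop_box = pixel_bbox_to_pdf_points(
    [170, 220, 850, 1100], (1700, 2200), (612, 792)
)
```

With pdfplumber, page.width and page.height give the page size in points, and the returned tuple is in the (x0, top, x1, bottom) order that page.crop() expects. If the crops are still off, check that the image size you pass in is the size of the image the pipeline actually processed.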