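My code is similar to this demo:
import fitz  # PyMuPDF

# pdf_bytes holds the raw content of the PDF file
pdf_docs = fitz.open("pdf", pdf_bytes)
for page_id, page in enumerate(pdf_docs):
    page_imgs = page.get_images()
    for img in page_imgs:
        # raises RuntimeError for the affected page
        recs = page.get_image_rects(img, transform=True)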
This is the stack trace I get when the error occurs:
    recs = page.get_image_rects(img, transform=True)
           │    │               └ (1555, 0, 300, 28, 1, 'Indexed', '', 'Im1', 'JPXDecode')
           │    └ <function get_image_rects at 0x000001B37F685E40>
           └ page 75 of <memory, doc# 1>
    pix = Pixmap(page.parent, xref) # make pixmap of the image to compute MD5
          │      │    │       └ 1555
          │      │    └ <weakproxy at 0x000001B301178630 to Document at 0x000001B37E8D5910>
          │      └ page 75 of <memory, doc# 1>
          └ <class 'fitz.fitz.Pixmap'>
    _fitz.Pixmap_swiginit(self, _fitz.new_Pixmap(*args))
    │     │               │     │     │           └ (<weakproxy at 0x000001B301178630 to Document at 0x000001B37E8D5910>, 1555)
    │     │               │     │     └ <built-in function new_Pixmap>
    │     │               │     └ <module 'fitz._fitz' from
    │     │               └ <unprintable Pixmap object>
    │     └ <built-in function Pixmap_swiginit>
    └ <module 'fitz._fitz' from '
PyMuPDF version
1.23.7
Operating system
Windows
Python version
3.11
It looks like the image on page 75 (zero-based), image index 3 (zero-based), with xref=1555, is corrupted. But MuPDF isn't coping particularly well with this: we end up getting this image's error for all images on page 75.
I'll mark this as an upstream bug and keep it open here.
Note that so far I've been testing this with PyMuPDF built with MuPDF master. It looks like MuPDF master may have a regression where a single corrupt JPX image causes PyMuPDF to return an error for all images on the same page.
But the current release, PyMuPDF-1.23.8 (which is built with MuPDF-1.23.7), appears to be handling things OK. The image with xref=1555 is corrupt, and it returns an error for just that image. I think this is correct behaviour.
So I think this is not actually a bug in the current release after all.
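Since a corrupt image is expected to raise an error for just that image, a caller can catch the error per image and carry on with the rest of the page. The following is only a rough sketch of that idea, not an official PyMuPDF recipe. The helper name collect_image_rects is made up for illustration, and it assumes the failure surfaces as a RuntimeError, as in the report above; other PyMuPDF versions may raise a different exception type.

```python
def collect_image_rects(doc):
    """Collect image rectangles per (page_id, xref), skipping unreadable images.

    `doc` is assumed to be an opened PyMuPDF document, or any iterable of
    page objects offering get_images() and get_image_rects().
    Returns (rects, failures).
    """
    rects = {}
    failures = []
    for page_id, page in enumerate(doc):
        for img in page.get_images():
            xref = img[0]  # first item of each get_images() entry is the xref
            try:
                rects[(page_id, xref)] = page.get_image_rects(img, transform=True)
            except RuntimeError as exc:
                # Assumption: a corrupt image raises RuntimeError,
                # e.g. "Failed to read JPX header" as reported in this issue.
                failures.append((page_id, xref, str(exc)))
    return rects, failures
```

With a real file this would be called as collect_image_rects(fitz.open("pdf", pdf_bytes)). The failures list then shows which (page_id, xref) pairs could not be read, which makes it easy to see whether a single image or a whole page is affected.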
Description of the bug
When I use the API page.get_image_rects, it raises RuntimeError: Failed to read JPX header
How to reproduce the bug
JFMA_15_11.pdf
page_id: 75 (page_id starts from 0)