Images within a table not extracted

Viswa · June 16, 2026, 9:58am

Hi,

I am trying to convert a PDF to Markdown, and images that are embedded within a table are not being extracted.

PDF I am trying to extract: https://cars.tatamotors.com/content/dam/tml/pv/general/service/owners-manual/pdf/harrier/harrier-bs6-owners-manual-april-2026.pdf

Refer to pages 64 and 65.

The images in the Pictogram column are not being extracted.

I went through the forum and tried different options, setting image_size_limit=0 and ignore_graphics=False, but none of them works.

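import pymupdf4llm

FILE = "harrier-bs6-owners-manual-april-2026.pdf"

# pages expects 0-based page numbers: 63 and 64 are pages 64 and 65 of the PDF
md_text = pymupdf4llm.to_markdown(FILE, pages=[63, 64], header=False, footer=False,
                                  embed_images=True, image_size_limit=0, ignore_graphics=False)

# the with-block closes the file even if the write fails
with open("out-markdown.md", "w", encoding="utf-8") as output:
    output.write(md_text)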

HaraldLieder · June 16, 2026, 10:15am

Welcome to the forum, @Viswa!

Images, hyperlinks and vector graphics inside table cells are currently out of scope - sorry.

Viswa · June 16, 2026, 10:55am

Thanks for the quick reply. Is there any plan to include this feature in a later release that I could watch for?

Also, is there a way to detect that a table contains an image that was not extracted?

HaraldLieder · June 16, 2026, 11:36am

We do intend to support this, but there is no schedule yet: our list of planned enhancements is loooong.

But you can easily determine whether any images exist inside a given region of the page, e.g. inside the bbox of a table, a cell or anything else:

images = page.get_image_info()  # list of images on page (metadata only)
images_in_bbox = [img for img in images if img["bbox"] in pymupdf.Rect(bbox)]
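
To apply the same check to every table on a page, the snippet above could be wrapped roughly as follows. This is only a sketch: the helper names overlaps and images_in_tables are made up for illustration, the page argument is assumed to be a pymupdf.Page, and the table bboxes are assumed to come from page.find_tables(). It tests for overlap rather than strict containment, so an image that sticks out slightly over a cell border is still reported.

```python
def overlaps(a, b):
    """Return True if rectangles a and b, given as (x0, y0, x1, y1), share a non-empty area."""
    return min(a[2], b[2]) > max(a[0], b[0]) and min(a[3], b[3]) > max(a[1], b[1])


def images_in_tables(page):
    """Map table index -> bboxes of images that overlap that table.

    Hypothetical helper: `page` is assumed to be a pymupdf.Page.
    Tables without any overlapping image are left out of the result.
    """
    images = page.get_image_info()  # image metadata only, no pixel data
    found = {}
    for index, table in enumerate(page.find_tables().tables):
        hits = [img["bbox"] for img in images if overlaps(table.bbox, img["bbox"])]
        if hits:
            found[index] = hits
    return found
```

Calling images_in_tables(doc[63]) on an opened document would then list, per table, the image rectangles that the Markdown output is missing. Note that get_image_info() only reports raster images; pictograms drawn as vector graphics would not show up in this check.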
