When calling pymupdf4llm.to_markdown(doc, headers=keep_headers, footers=keep_footers, page_chunks=ret_page_chunks),
the page_chunks flag is ignored.
Looking at the source in
venv/lib/python3.11/site-packages/pymupdf4llm/__init__.py, at line 97 it is passed to
return parsed_doc.to_markdown
However, in venv/lib/python3.11/site-packages/pymupdf4llm/helpers/document_layout.py
at line 599, page_chunks is part of the discarded kwargs and is not used.
I am using
PyMuPDF~=1.26.6
pymupdf-layout~=1.26.6
pymupdf4llm~=0.2.6
on macOS and Linux.
Btw, love PyMuPDF and you folks are great: fast responses, awesome product. It's very much appreciated.
I uploaded version 0.2.7 over the weekend. It supports page_chunks again, for both the to_markdown and the new to_text methods.
Please be aware that the per-page dictionaries now contain a slightly changed set of keys:
"page_boxes" is a new list of identified and classified layout boundary boxes. They are in reading order, and the "text" follows the same sequence. Each list item is the tuple (x0, y0, x1, y1, "class") where class equals "table", "picture", "text", …
"images", "tables" and "words" are now omitted (that is, their value is always None).
"page" has been renamed to "page_number".
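For anyone adapting code to the changed keys, here is a minimal sketch of reading the per-page dictionaries. It runs on hypothetical sample data shaped like the description above, so it does not need a PDF. The helper names (page_number_of, box_classes, boxes_of_class) are made up for illustration, and looking for the page number both at the top level and inside a "metadata" dict is an assumption, not something confirmed in this thread.

```python
from collections import Counter

# Hypothetical sample shaped like the per-page dictionaries described above.
# With the real library the list would come from a call such as
#   chunks = pymupdf4llm.to_markdown(doc, page_chunks=True)
sample_chunks = [
    {
        "page_number": 1,
        "text": "# Title\n\nSome body text.",
        "page_boxes": [
            (72.0, 60.0, 540.0, 90.0, "text"),
            (72.0, 100.0, 540.0, 300.0, "table"),
            (72.0, 310.0, 540.0, 500.0, "text"),
        ],
        "images": None,
        "tables": None,
        "words": None,
    },
]


def page_number_of(chunk):
    """Return the page number, accepting the new key and the old one.

    Assumption: the key may sit at the top level or inside "metadata".
    """
    for container in (chunk, chunk.get("metadata") or {}):
        for key in ("page_number", "page"):
            if key in container:
                return container[key]
    return None


def box_classes(chunk):
    """Count layout boxes per class; a missing or None list counts as empty."""
    boxes = chunk.get("page_boxes") or []
    return Counter(box[4] for box in boxes)


def boxes_of_class(chunk, wanted):
    """Return the (x0, y0, x1, y1) rectangles of one class, in list order."""
    boxes = chunk.get("page_boxes") or []
    return [tuple(box[:4]) for box in boxes if box[4] == wanted]


for chunk in sample_chunks:
    print(page_number_of(chunk), dict(box_classes(chunk)))
```

Since "images", "tables" and "words" are now always None, code that used to iterate over them needs a guard or should switch to filtering "page_boxes" by class, as boxes_of_class does here.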
Awesome, thank you. I'm grabbing it now.