Flatten the shapes

ever4andrews · November 7, 2025, 12:11pm

I have a PDF with comments/markups such as shapes and text. I want to flatten (render) the shapes in the PDF while preserving the text as it is, without flattening it. The flattened items should also no longer appear in the comments panel.

Could anyone please help me with this?

Jamie_Lemon · November 7, 2025, 12:54pm

@ever4andrews Welcome! Did you try Document.bake()? If I have understood you correctly, it might provide what you need.
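
For reference, a minimal sketch of that suggestion. `flatten_annotations` is a hypothetical helper name and the file names are placeholders. Leaving form fields alone (`widgets=False`) is an assumption about what is wanted here. As far as I know, `bake()` only offers the `annots` and `widgets` switches, so it cannot single out individual annotations.

```python
def flatten_annotations(doc, include_widgets=False):
    """Turn the annotations of an open PDF document into page content.

    `doc` is expected to be a pymupdf.Document opened from a PDF file.
    """
    # bake() makes annotations (and optionally form fields) a permanent
    # part of the page, so they no longer appear as comments.
    doc.bake(annots=True, widgets=include_widgets)
    return doc


# Typical use (file names are placeholders):
#   import pymupdf
#   doc = pymupdf.open("input.pdf")
#   flatten_annotations(doc)
#   doc.save("flattened.pdf")
#   doc.close()
```

The helper takes an already opened document so that it can be combined with other steps before saving.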

ever4andrews · November 10, 2025, 7:22am

Excellent, it works for me. I struggled a lot over the last few days.
Thank you so much @Jamie_Lemon

ever4andrews · November 18, 2025, 5:01pm

Hi @Jamie_Lemon ,

Could you please help me with the requirements below? I am writing a PDF converter script that flattens annotations and standardizes the page size to Letter.

Flattening the annotations works with doc.bake.
The input PDF pages may or may not be “Letter” size, but I need the output to be “Letter” only, with the text preserved.
If an input page is scanned, we can place it with the insert_image() method.

So far I have tried multiple methods, but if I keep the annotations I lose the “Letter” size, and vice versa.

Kindly help me out with this, because I don’t have much time to explore it further.

Jamie_Lemon · November 18, 2025, 10:21pm

Hi @ever4andrews .

In your case, your input PDF can have any page dimensions, but you always need to fit it onto “Letter” size (presumably portrait). The best thing you can do, then, is to flatten each page to an image and scale it as well as possible onto the new Letter pages.

The script below should do this okay:


import pymupdf

def flatten_pdf_to_letter_size(input_pdf, output_pdf, dpi=150):
    """
    Flatten a PDF by converting each page to an image on Letter-sized pages.
    
    Args:
        input_pdf: Path to input PDF file
        output_pdf: Path to output PDF file
        dpi: Resolution for rendering (default 150)
    """
    # See: https://pymupdf.readthedocs.io/en/latest/functions.html#paper_size
    letter_size = pymupdf.paper_size("letter")
    LETTER_WIDTH = letter_size[0]
    LETTER_HEIGHT = letter_size[1]

    # Open the input PDF
    doc = pymupdf.open(input_pdf)
    
    # Create a new PDF for output
    output_doc = pymupdf.open()
    
    # Process each page
    for page_num in range(len(doc)):
        page = doc[page_num]
        
        # Render page to a pixmap (image)
        zoom = dpi / 72
        mat = pymupdf.Matrix(zoom, zoom)
        pix = page.get_pixmap(matrix=mat)
        
        # Create a new Letter-sized page
        new_page = output_doc.new_page(width=LETTER_WIDTH, height=LETTER_HEIGHT)
        
        # Get the image dimensions in points (convert back from pixels)
        img_width = pix.width / zoom
        img_height = pix.height / zoom
        
        # Calculate scaling factor to fit within Letter size
        scale_x = LETTER_WIDTH / img_width
        scale_y = LETTER_HEIGHT / img_height
        scale = min(scale_x, scale_y)  # Use smaller scale to fit entirely
        
        # Calculate centered position
        scaled_width = img_width * scale
        scaled_height = img_height * scale
        x_offset = (LETTER_WIDTH - scaled_width) / 2
        y_offset = (LETTER_HEIGHT - scaled_height) / 2
        
        # Create rectangle for image placement
        img_rect = pymupdf.Rect(
            x_offset,
            y_offset,
            x_offset + scaled_width,
            y_offset + scaled_height,
        )
        
        # Insert the image
        new_page.insert_image(img_rect, pixmap=pix)
    
    # Save the flattened PDF
    output_doc.save(output_pdf)
    output_doc.close()
    doc.close()
    
    print(f"Flattened PDF saved to {output_pdf}")

Usage


flatten_pdf_to_letter_size("input.pdf", "output_flattened.pdf", dpi=150)

Hope this helps!

ever4andrews · November 20, 2025, 4:42am

Thanks for your support @Jamie_Lemon
Yes, I tried this.

But my requirement is that if a page is in digital (non-scanned) format with a different page size, the output page must preserve the text and be fitted to “Letter”.

I have already handled scanned pages with insert_image().

Jamie_Lemon · November 20, 2025, 1:28pm

Well, in this case you should extract the information and then add it all to new Letter-format pages. How you do this with respect to the design of the original page might be a challenge. You need to look at Text - PyMuPDF documentation & Page - PyMuPDF documentation

Alternatively you could just try to scale content into your Letter PDF, something like:



import pymupdf

doc = pymupdf.open()  # new empty PDF
fmt = pymupdf.paper_rect("letter")
page = doc.new_page(width=fmt.width, height=fmt.height)

src = pymupdf.open("input.pdf")  # show page 0 of this

# Scale factor (2.0 = 200%, 0.5 = 50%)
scale_factor = 0.5

# Create transformation matrix
matrix = pymupdf.Matrix(scale_factor, scale_factor)

# Place source page 0 inside the scaled target rectangle
page.show_pdf_page(
    page.rect * matrix,  # New size
    src,
    0,
)

doc.save("output.pdf")
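
To keep the text as real text and still end up on Letter pages, one option is to combine the two ideas from this thread: bake the annotations first, then place every source page on a new Letter page with show_pdf_page(). Passing the full page rectangle should let PyMuPDF scale the content to fit while keeping its proportions. This is a minimal sketch, not a tested converter: `copy_pages_to_letter` is a hypothetical helper name, the file names are placeholders, and 612 x 792 points is the size pymupdf.paper_size("letter") returns. Scanned pages get no special treatment here, and rotated source pages may need extra handling.

```python
LETTER_WIDTH, LETTER_HEIGHT = 612, 792  # US Letter in points (8.5 x 11 inch)


def copy_pages_to_letter(src, out):
    """Place every page of `src` on a new Letter page in `out`.

    `src` and `out` are expected to be open pymupdf.Document objects,
    where `src` is the input PDF and `out` is a new, empty PDF.
    """
    # Make annotations part of the page content first; otherwise they
    # are not carried over when the page is embedded below.
    src.bake()

    for page in src:
        new_page = out.new_page(width=LETTER_WIDTH, height=LETTER_HEIGHT)
        # Embed the source page as vector content, so text stays text.
        # The target is the whole Letter page; proportions are kept by default.
        new_page.show_pdf_page(new_page.rect, src, page.number)
    return out


# Typical use (file names are placeholders):
#   import pymupdf
#   src = pymupdf.open("input.pdf")
#   out = copy_pages_to_letter(src, pymupdf.open())
#   out.save("output_letter.pdf")
```

Because the pages are embedded rather than rendered, text on digital pages should remain selectable and searchable in the output, unlike in the image-based script earlier in the thread.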
