How can I extract the background color of a rect with specific coordinates?

Hi!

I have a list of coordinates for specific cells in a PDF table, and I am trying to extract the background color of those cells (particularly the blue ones).

Example_Report.pdf (32.2 KB)

I tried to extract the PDF graphics and match the Rects found against the Rect coordinates I have, so that I could then read “fill”, but I have not been able to find the corresponding Rects.

I would really appreciate a hint on this, please.

For reference, the list of Rect coordinates for the first-column cells is:

[Rect(56.689998626708984, 116.22003173828125, 109.9000015258789, 130.1099853515625), Rect(56.689998626708984, 130.1099853515625, 109.9000015258789, 144.280029296875), Rect(56.689998626708984, 144.280029296875, 109.9000015258789, 158.46002197265625), Rect(56.689998626708984, 158.46002197265625, 109.9000015258789, 172.6300048828125), Rect(56.689998626708984, 172.6300048828125, 109.9000015258789, 186.79998779296875), Rect(56.689998626708984, 186.79998779296875, 109.9000015258789, 200.98004150390625), Rect(56.689998626708984, 200.98004150390625, 109.9000015258789, 215.1500244140625), Rect(56.689998626708984, 215.1500244140625, 109.9000015258789, 229.32000732421875), Rect(56.689998626708984, 229.32000732421875, 109.9000015258789, 243.5), Rect(56.689998626708984, 243.5, 109.9000015258789, 257.6700439453125), Rect(56.689998626708984, 257.6700439453125, 109.9000015258789, 271.84002685546875), Rect(56.689998626708984, 271.84002685546875, 109.9000015258789, 286.02001953125), Rect(56.689998626708984, 286.02001953125, 109.9000015258789, 300.19000244140625), Rect(56.689998626708984, 300.19000244140625, 109.9000015258789, 314.3599853515625), Rect(56.689998626708984, 314.3599853515625, 109.9000015258789, 328.5400390625), Rect(56.689998626708984, 328.5400390625, 109.9000015258789, 342.71002197265625), Rect(56.689998626708984, 342.71002197265625, 109.9000015258789, 356.8800048828125), Rect(56.689998626708984, 356.8800048828125, 109.9000015258789, 371.0600280761719), Rect(56.689998626708984, 371.0600280761719, 109.9000015258789, 385.2300109863281), Rect(56.689998626708984, 385.2300109863281, 109.9000015258789, 399.4000244140625), Rect(56.689998626708984, 399.4000244140625, 109.9000015258789, 413.58001708984375), Rect(56.689998626708984, 413.58001708984375, 109.9000015258789, 427.75), Rect(56.689998626708984, 427.75, 109.9000015258789, 441.9200134277344), Rect(56.689998626708984, 441.9200134277344, 109.9000015258789, 456.09002685546875), Rect(56.689998626708984, 456.09002685546875, 
109.9000015258789, 470.27001953125), Rect(56.689998626708984, 470.27001953125, 109.9000015258789, 484.44000244140625), Rect(56.689998626708984, 484.44000244140625, 109.9000015258789, 498.6100158691406), Rect(56.689998626708984, 498.6100158691406, 109.9000015258789, 512.7900390625), Rect(56.689998626708984, 512.7900390625, 109.9000015258789, 526.9600219726562), Rect(56.689998626708984, 526.9600219726562, 109.9000015258789, 541.1300048828125), Rect(56.689998626708984, 541.1300048828125, 109.9000015258789, 555.31005859375), Rect(56.689998626708984, 555.31005859375, 109.9000015258789, 569.47998046875), Rect(56.689998626708984, 569.47998046875, 109.9000015258789, 583.6500244140625), Rect(56.689998626708984, 583.6500244140625, 109.9000015258789, 597.8300170898438), Rect(56.689998626708984, 597.8300170898438, 109.9000015258789, 612.0), Rect(56.689998626708984, 612.0, 109.9000015258789, 626.1700439453125), Rect(56.689998626708984, 626.1700439453125, 109.9000015258789, 640.3500366210938), Rect(56.689998626708984, 640.3500366210938, 109.9000015258789, 654.239990234375), Rect(56.689998626708984, 654.239990234375, 109.9000015258789, 668.4100341796875), Rect(56.689998626708984, 668.4100341796875, 109.9000015258789, 682.5800170898438), Rect(56.689998626708984, 682.5800170898438, 109.9000015258789, 696.760009765625), Rect(56.689998626708984, 696.760009765625, 109.9000015258789, 710.9299926757812), Rect(56.689998626708984, 710.9299926757812, 109.9000015258789, 725.1000366210938), Rect(56.689998626708984, 725.1000366210938, 109.9000015258789, 739.280029296875), Rect(56.689998626708984, 739.280029296875, 109.9000015258789, 753.4500122070312), Rect(56.689998626708984, 753.4500122070312, 109.9000015258789, 767.6199951171875), Rect(56.689998626708984, 767.6199951171875, 109.9000015258789, 781.7999877929688), Rect(56.689998626708984, 781.7999877929688, 109.9000015258789, 795.9700317382812)]

I have also attached an example of the PDF I am using.

Thank you!

Hi @Ana_Guedes, did you already try something like this:

import pprint
import pymupdf

doc = pymupdf.open("Example_Report.pdf")
page = doc[0]  # first page
pprint.pp(page.get_drawings())  # list of vector graphics paths (dicts)


And then look at those rects with the light blue fill color (0.8156859874725342, 0.8588240146636963, 0.8941180109977722)?

An arbitrary rectangle need not have its own, or one unique, “background color”. After all, it could sit on top of an image with arbitrary content, or in a place where multiple vector graphic areas intersect.
But if you have extracted the vector graphics via page.get_drawings(), then each of the returned paths does have a "fill" key, which is either None or contains the fill color. Again: this color may or may not be (completely) visible, because the path may be covered by something else.
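
A minimal sketch of one way to do the matching, in case it helps. It assumes the filled paths do not coincide exactly with your cell Rects (for example because a fill spans a whole row or is slightly larger than the cell), so it compares by overlap instead of equality. The helper names fill_at and is_close, the min_cover threshold and the tolerance are made up for this illustration and are not part of PyMuPDF. It also assumes the paths come back in drawing order, so that a later filled path paints over an earlier one.

```python
# Light blue fill color quoted above.
LIGHT_BLUE = (0.8156859874725342, 0.8588240146636963, 0.8941180109977722)


def fill_at(cell, drawings, min_cover=0.5):
    """Return the fill of the last filled path whose bounding box covers
    at least `min_cover` of `cell`, or None if there is no such path.

    cell     -- (x0, y0, x1, y1); a pymupdf.Rect can be unpacked the same way
    drawings -- list of dicts with "rect" and "fill" keys, as returned by
                page.get_drawings()
    """
    cx0, cy0, cx1, cy1 = cell
    cell_area = (cx1 - cx0) * (cy1 - cy0)
    if cell_area <= 0:
        return None
    found = None
    for path in drawings:
        fill = path.get("fill")
        if fill is None:  # stroke-only path, no fill color
            continue
        x0, y0, x1, y1 = path["rect"]  # bounding box of the path
        w = min(cx1, x1) - max(cx0, x0)
        h = min(cy1, y1) - max(cy0, y0)
        if w > 0 and h > 0 and w * h >= min_cover * cell_area:
            found = tuple(fill)  # assumption: later paths lie on top
    return found


def is_close(color, target, tol=0.01):
    """Compare two color tuples component-wise within a tolerance."""
    return (
        color is not None
        and len(color) == len(target)
        and all(abs(a - b) <= tol for a, b in zip(color, target))
    )


# Possible usage with the attached file (not executed here):
#   import pymupdf
#   page = pymupdf.open("Example_Report.pdf")[0]
#   drawings = page.get_drawings()
#   blue_cells = [c for c in cells if is_close(fill_at(c, drawings), LIGHT_BLUE)]
# where `cells` is your list of Rects.
```

Keep in mind that "rect" is only the bounding box of a path, so for non-rectangular paths this is an approximation, and images or text drawn over the cell are not considered at all.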