I have the following code to convert PDF files into Markdown format, but the hyperlinks present in the PDF are not being preserved or converted correctly into Markdown.
This feature is not yet supported in PyMuPDF4LLM. Links are a little tricky to deal with because they can be internal (pointing to other areas of the document) or external (pointing to a website). Many of the links you get from a PyMuPDF Page (see the Page section of the PyMuPDF documentation) are also just invisible rects overlaid on areas of the document, so they are not tied directly to the text. For text that is obviously a website link, you could post-process the resulting Markdown and look for inline URLs: for example, if the text body contains https://, wrap that URL in the correct Markdown link syntax.
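As a rough illustration of that post-processing idea, here is a minimal sketch using only Python's standard `re` module. The function name `linkify_bare_urls` and the regular expression are my own assumptions, not part of PyMuPDF4LLM. It wraps bare `http://` or `https://` URLs in Markdown autolink brackets and leaves existing `[text](url)` links and `<url>` autolinks alone. It does not recover links that exist only as invisible rects, and it does not skip URLs inside code spans or fenced code blocks.

```python
import re

# Hypothetical helper, not part of PyMuPDF4LLM.
# Group 1 matches links that are already Markdown ([text](url) or <url>).
# Group 2 matches a bare http(s) URL up to whitespace or a bracket.
_LINK_OR_URL = re.compile(
    r"(\[[^\]]*\]\([^)]*\)|<https?://[^>\s]+>)"
    r"|(https?://[^\s<>()\[\]]+)"
)


def linkify_bare_urls(md_text: str) -> str:
    """Wrap bare URLs in <...> so Markdown renders them as links."""

    def wrap(match):
        if match.group(1):
            # Already a Markdown link or autolink: keep it unchanged.
            return match.group(1)
        url = match.group(2)
        # Treat trailing sentence punctuation as text, not part of the URL.
        core = url.rstrip(".,;:!?")
        return f"<{core}>{url[len(core):]}"

    return _LINK_OR_URL.sub(wrap, md_text)
```

You would call this on the Markdown string you get back from the conversion, for instance the return value of `pymupdf4llm.to_markdown(...)`, before writing it to a file. Treating trailing punctuation as plain text is a heuristic and may clip URLs that genuinely end in one of those characters.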