What is PdfItDown?

PdfItDown is a Python package that relies on Microsoft's markitdown, markdown_pdf and img2pdf to convert text-based files and images to PDF. It supports the following file formats:

  • Markdown
  • PowerPoint
  • Word
  • Excel
  • HTML
  • Text-based formats (CSV, XML, JSON)
  • ZIP files (iterates over contents)
  • Image files (PNG, JPG)

Setting up

To set up PdfItDown, it is good practice to create an isolated development environment first:

python3 -m venv .venv
source .venv/bin/activate
pip install pdfitdown
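
To confirm that the install landed in the active virtual environment, a small check like the one below can help. It is only a sketch: it uses the standard library's importlib.metadata and assumes the distribution name is pdfitdown, as in the pip command above. The helper name installed_version is made up for this example and is not part of PdfItDown.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "pdfitdown" is the distribution name used in the pip command above.
    found = installed_version("pdfitdown")
    if found is None:
        print("pdfitdown is not installed in this environment")
    else:
        print(f"pdfitdown {found} is installed")
```

If the script reports that the package is missing, the most likely cause is that the virtual environment was not activated before running pip.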

Choose how to use it

Once you have PdfItDown set up, you can choose how to use it: