How to convert HTML Content into PDF Document using Python

Introduction

In my previous tutorial I showed how to create a PDF document using Python 3. Here we will see how to convert HTML content into a PDF document. Sometimes it is convenient to write HTML content directly into a PDF file so that the text can mix different styles. In this example I am going to write italic text, bold text, underlined text, a hyperlink with an image, a paragraph, a paragraph with bold text, an ordered list, an unordered list, a definition list, a code block and a table.

Here I will assign a string of HTML content to a variable and write that variable to the output PDF document using Python 3. I am going to use the same library, fpdf, that I used to create the PDF file earlier.

Prerequisites

Python 3.8.0, fpdf (pip install fpdf)
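
If you are not sure whether the prerequisites are in place, a small check like the one below can help. It is only a sketch: the helper name installed_version is made up for this example, and it assumes the package was installed with pip under the distribution name fpdf, as in the command above. It relies on importlib.metadata, which is part of the standard library from Python 3.8 onwards.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the interpreter version and whether the fpdf distribution is present.
print("Python", sys.version.split()[0])
print("fpdf", installed_version("fpdf") or "not installed")
```

If the second line reports "not installed", run pip install fpdf and try again.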

Convert HTML to PDF

Create a Python script with the source code below to convert the HTML text into a PDF document.

The script first assigns the HTML content to a variable called html.

Next it imports the FPDF and HTMLMixin classes from the fpdf library and defines a class, MyFPDF, that inherits from both.

The body of MyFPDF is a single pass statement. The pass statement in Python is used when a statement is required syntactically but you do not want any code to execute. It is a null operation; nothing happens when it executes. Here the class needs no code of its own, because everything it does comes from FPDF and HTMLMixin.
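
As a standalone illustration of pass, separate from the conversion script, here is a minimal sketch of the same pattern: a class that combines two parent classes and adds nothing of its own. The class names Base, ShoutMixin and Combined are invented for this example and have nothing to do with fpdf.

```python
class Base:
    def greet(self):
        return "hello from Base"


class ShoutMixin:
    # A mixin adds behaviour that builds on methods supplied by another class.
    def shout(self):
        return self.greet().upper()


class Combined(Base, ShoutMixin):
    # Nothing to add here: both methods are inherited, so pass is enough.
    pass


obj = Combined()
print(obj.greet())
print(obj.shout())
```

The empty class body is still required by the syntax, which is exactly the gap that pass fills.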

html = """
<h1 align="center">HTML to PDF</h1>
<h2>An example to convert HTML to PDF</h2>
<p>You can now easily print text mixing different
styles : <B>bold</B>, <I>italic</I>, <U>underlined</U>, or
<B><I><U>all at once</U></I></B>!<BR>You can also insert links
on text, such as <A HREF="https://pypi.org/project/fpdf/">www.pypi.org</A>,
or on an image: click on the logo image.<br></p>
<center>
<a href="http://www.pypi.org"><img src="logo.png" width="104" height="71"></a>
</center>

<h2>Paragraph</h2>

<p>Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Vestibulum tortor quam, feugiat vitae, ultricies eget, tempor sit amet, ante. Donec eu libero sit amet quam egestas semper. Aenean ultricies mi vitae est. Mauris placerat eleifend leo. Quisque sit amet est et sapien ullamcorper pharetra. Vestibulum erat wisi, condimentum sed, commodo vitae, ornare sit amet, wisi. Aenean fermentum, elit eget tincidunt condimentum, eros ipsum rutrum orci, sagittis tempus lacus enim ac dui. Donec non enim in turpis pulvinar facilisis. Ut felis. Praesent dapibus, neque id cursus faucibus, tortor neque egestas augue, eu vulputate magna eros eu erat. Aliquam erat volutpat. Nam dui mi, tincidunt quis, accumsan porttitor, facilisis luctus, metus</p>

<h2>Paragraph with Bold Text</h2>

<p><B>Pellentesque habitant morbi tristique</B> senectus et netus et malesuada fames ac turpis egestas. Vestibulum tortor quam, feugiat vitae, ultricies eget, tempor sit amet, ante. Donec eu libero sit amet quam egestas semper. <em>Aenean ultricies mi vitae est.</em> Mauris placerat eleifend leo. Quisque sit amet est et sapien ullamcorper pharetra. Vestibulum erat wisi, condimentum sed, <code>commodo vitae</code>, ornare sit amet, wisi. Aenean fermentum, elit eget tincidunt condimentum, eros ipsum rutrum orci, sagittis tempus lacus enim ac dui. <a href="#">Donec non enim</a> in turpis pulvinar facilisis. Ut felis.</p>

<h2>Ordered List</h2>

<ol>
   <li>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</li>
   <li>Aliquam tincidunt mauris eu risus.</li>
</ol>

<blockquote><p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vivamus magna. Cras in mi at felis aliquet congue. Ut a est eget ligula molestie gravida. Curabitur massa. Donec eleifend, libero at sagittis mollis, tellus est malesuada tellus, at luctus turpis elit sit amet quam. Vivamus pretium ornare est.</p></blockquote>

<h2>Unordered List</h2>

<ul>
   <li>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</li>
   <li>Aliquam tincidunt mauris eu risus.</li>
</ul>

<h2>Code Block</h2>

<pre><code>
#header h1 a {
  display: block;
  width: 300px;
  height: 80px;
}
</code></pre>

<h2>Definition List</h2>

<dl>
   <dt>Definition list</dt>
   <dd>Consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna
aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
commodo consequat.</dd>
   <dt>Lorem ipsum dolor sit amet</dt>
   <dd>Consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna
aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
commodo consequat.</dd>
</dl>

<h2>Table</h2>

<table border="0" align="center" width="50%">
<thead><tr><th width="30%">Header 1</th><th width="70%">Header 2</th></tr></thead>
<tbody>
<tr><td>cell 1</td><td>cell 2</td></tr>
<tr><td>cell 2</td><td>cell 3</td></tr>
</tbody>
</table>
"""

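from fpdf import FPDF, HTMLMixin

# Combine the core PDF class with the mixin that provides write_html()
class MyFPDF(FPDF, HTMLMixin):
    pass

pdf = MyFPDF()
# First page
pdf.add_page()
# Render the HTML string onto the current page
pdf.write_html(html)
# 'F' saves the PDF to a local file with the given name
pdf.output('html2pdf.pdf', 'F')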
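
If the HTML lives in a file rather than in a string inside the script, one possible approach is to read the file into a string first and then pass it to write_html() exactly as above. The following is a sketch under assumptions, not a tested recipe: the function names load_html and html_file_to_pdf and the file name page.html are hypothetical, and the fpdf calls simply mirror the script above.

```python
from pathlib import Path


def load_html(path, encoding="utf-8"):
    """Return the contents of an HTML file as a string."""
    return Path(path).read_text(encoding=encoding)


def html_file_to_pdf(html_path, pdf_path):
    """Render the HTML stored in html_path into a PDF saved at pdf_path."""
    # Imported here so that load_html can be used without fpdf installed.
    from fpdf import FPDF, HTMLMixin

    class MyFPDF(FPDF, HTMLMixin):
        pass

    pdf = MyFPDF()
    pdf.add_page()
    pdf.write_html(load_html(html_path))
    pdf.output(pdf_path, 'F')


# Example call (page.html is a hypothetical input file):
# html_file_to_pdf("page.html", "html2pdf.pdf")
```

Keeping the file reading in its own function makes it easy to check that step on its own before involving the PDF library.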

Testing the Program

Executing the above Python script produces a PDF file, html2pdf.pdf, with the content shown in the image below. Run the script from a directory that contains logo.png, the image referenced in the HTML.

[Image: convert HTML content to PDF document using Python]

Clicking on the link or logo will take you to the corresponding website.
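
Besides opening the file in a viewer, you can do a quick sanity check from Python. The sketch below only verifies that the output file starts with the %PDF- marker that PDF files begin with; it does not validate the rest of the document. The function name looks_like_pdf is made up for this example.

```python
def looks_like_pdf(path):
    """Return True if the file at path starts with the PDF header marker."""
    with open(path, "rb") as f:
        return f.read(5) == b"%PDF-"


# Example call, assuming the script above has already produced the file:
# print(looks_like_pdf("html2pdf.pdf"))
```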

That’s all about converting HTML content into a PDF document.

Thanks for reading.

2 thoughts on “How to convert HTML Content into PDF Document using Python”

  1. What if the image reference in the HTML is a base64 image? So far, my attempts have resulted in “FPDF error: Unsupported image type”.

  2. Hello, hope you are doing well. Nice tutorial, thank you. What if the HTML is in a file? How can it be used here?
