HTML PDF Conversion: Formatting Issues & How To Resolve Them

by David Beníček

Convert HTML to PDF or PDF to HTML without losing formatting—fix broken layouts and missing styles. Here’s how to keep your documents perfect.

Converting between HTML and PDF is a little like translating between two languages using a translation app. Sometimes it works really well, and other times you're left cleaning up a mess.

The reason for this is that HTML and PDF handle formatting completely differently. HTML is fluid, designed to adjust to different screen sizes and devices. Change the screen size, and the layout might shift or change quite noticeably.

PDF, on the other hand, is fixed—everything stays exactly where it was placed. Change the screen size and everything might look a little smaller or larger, but nothing moves about. This difference can lead to:

  • Styles not applying properly.
  • Page breaks happening in the wrong spots.
  • Fonts being replaced or missing.
  • Tables losing their structure.
  • The alignment of images shifting.

It helps to know why this happens, because once you know the cause of the trouble, you have a much better chance of fixing it.

Converting HTML to PDF Without Losing Formatting

A well-structured HTML document is the first step to getting a PDF that looks right. Some of this isn’t easy, but if you know HTML—or have messed about with it a bit—you’ll be able to check and adjust the structure of your document before saving it as a PDF.

  • Use proper formatting tags for headings, paragraphs, and tables.
  • Inline CSS often works better than external stylesheets for PDF conversion since some converters ignore linked styles.
  • Specify exact dimensions for images and tables to prevent layout shifts.
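
As a rough illustration of the three tips above, here is a minimal sketch of a page that uses real heading, paragraph, and table tags, inline styles, and explicit dimensions. The file name chart.png, the title, and the table contents are placeholders, not taken from any real document, and how closely a given converter follows these styles is something to test rather than assume.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example Report</title>
</head>
<body>
  <!-- Real heading and paragraph tags instead of styled divs -->
  <h1 style="font-family: Arial, sans-serif; font-size: 24pt;">Example Report</h1>
  <p style="font-family: Arial, sans-serif; font-size: 11pt;">Placeholder summary text.</p>

  <!-- Explicit width and height so the image keeps its size and position -->
  <img src="chart.png" alt="Placeholder chart" width="600" height="300">

  <!-- A proper table with a header row and a fixed width -->
  <table style="width: 600px; border-collapse: collapse;">
    <thead>
      <tr>
        <th style="border: 1px solid #000; text-align: left;">Item</th>
        <th style="border: 1px solid #000; text-align: left;">Value</th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <td style="border: 1px solid #000;">Placeholder item</td>
        <td style="border: 1px solid #000;">Placeholder value</td>
      </tr>
    </tbody>
  </table>
</body>
</html>
```

Repeating the same inline style on every element is tedious, but it means the styling travels with the markup even if a converter skips linked stylesheets.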

Handling Page Breaks in Multipage PDFs

A common issue when converting from HTML is for PDFs to split over pages in weird places. Webpages are meant to be viewed on screens, so HTML doesn’t consider page or paper size the way Word does. To control where pages break when saving HTML as a PDF, use CSS:

page-break-before: always; OR page-break-after: always;

Either of these properties tells the converter exactly where to start a new page, keeping your document readable. Newer converters also accept the modern equivalents, break-before: page and break-after: page.
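
To show where those properties go, here is a small sketch. The class name new-page and the section contents are made up for the example, and the break-inside rule is an extra that only some converters honor, so treat it as something to try rather than a guarantee.

```html
<style>
  /* Any element with this class starts on a fresh page.
     page-break-before is the older property; break-before is the newer name. */
  .new-page {
    page-break-before: always;
    break-before: page;
  }

  /* Ask the converter not to split tables and figures across two pages. */
  table, figure {
    page-break-inside: avoid;
    break-inside: avoid;
  }
</style>

<section>
  <h2>Introduction</h2>
  <p>Placeholder text for the first page.</p>
</section>

<!-- This section begins on a new page in the PDF -->
<section class="new-page">
  <h2>Details</h2>
  <p>Placeholder text for the second page.</p>
</section>
```

The first section is left without the class so the PDF doesn't risk opening with a blank page.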

Ensuring Fonts and Spacing Stay Consistent

Fonts are another thing that can get messed up during conversion. If a font isn’t embedded, the converter replaces it with a default. The replacement usually has different letter widths, so text reflows and your layout can fall apart, especially if the original wasn’t a standard font.

Use web-safe fonts like Arial, Verdana, Helvetica, Times New Roman, or Garamond to avoid substitution. If you must use a custom font, make sure it’s declared with @font-face so the converter can load and embed it. Also watch out for extra spacing or line height changes, which you can control by defining line-height and margin explicitly in CSS.
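
Here is one way that could look in practice. The font name ReportSans and the file path are placeholders, and the spacing values are only examples to adjust. The converter also needs to be able to reach the font file, so this is a sketch to adapt, not a guaranteed fix.

```html
<style>
  /* "ReportSans" and the font file path are placeholders for your own font. */
  @font-face {
    font-family: "ReportSans";
    src: url("fonts/ReportSans-Regular.woff2") format("woff2");
    font-weight: normal;
    font-style: normal;
  }

  /* Fall back to common fonts if the custom font fails to load. */
  body {
    font-family: "ReportSans", Arial, Helvetica, sans-serif;
    line-height: 1.4;
  }

  /* Explicit paragraph spacing instead of relying on converter defaults. */
  p {
    margin: 0 0 12pt 0;
  }
</style>
```

Listing fallback fonts after the custom one means that if substitution does happen, you at least choose the substitute yourself.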

How to Fix HTML to PDF Formatting Issues

It’s very rare to save a webpage as a PDF and not end up with some formatting issues. And despite what some apps and browser extensions promise, there is no simple solution to prevent this.

What you can do is fix the formatting issues after conversion, and there is an easy way to do that.

Start by converting your HTML files or webpages to PDF. Then, head to Smallpdf’s PDF Converter:

  1. Upload your document to the PDF Converter.
  2. Choose to convert it to a Word (.docx) document.
  3. Select “Convert selectable text only,” and wait for the process to finish.
  4. Download your newly converted Word document!

Next, open your document using Microsoft Word—or Google Docs, LibreOffice, or Pages. There’s a good chance the document will have headers and footers showing the original URL, page numbers, and date and time the file was saved. Delete these the same way you normally remove headers and footers.

Next, delete all unnecessary elements, like the site banner, menus, ads, and other promotional content. Leave only the content you need.

Now you can work through the document, fixing margins, font type and size, headings, spacing between paragraphs and headings, and anything else that doesn’t look right. You might also need to reposition and resize some images.

Once satisfied, save the document and export it as a PDF again.

Converting PDF to HTML Without Losing Formatting

There are plenty of ways to save a PDF as an HTML file, but if you're planning to publish it as a web page, most quick methods won’t cut it. Simply exporting a PDF as HTML or using a basic online converter often leaves you with messy, unstructured code that isn’t web-friendly.

To make things trickier, different tools give wildly different results. Some might keep most of your formatting intact, while others ignore it completely. And unlike the HTML to PDF route, where you can clean up the result in Word, fixing the output here means editing HTML and CSS by hand, which can be a real headache if you’re not familiar with coding.

Before going down the conversion rabbit hole, ask yourself if the PDF really needs to be turned into HTML. Could you just link to it or embed it on your site instead? If those options won’t work, manually recreating the PDF in HTML is the best bet—but if you’ve never built a webpage from scratch, it’s worth getting a web developer to handle it.
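
If linking or embedding is enough, the markup is short. This is a minimal sketch in which the path files/report.pdf and the dimensions are placeholders. Whether the embedded viewer appears depends on the visitor's browser, which is why the example keeps a plain link as a fallback.

```html
<!-- Option 1: link to the PDF -->
<a href="files/report.pdf">Download the report (PDF)</a>

<!-- Option 2: embed the PDF in the page, with a link as a fallback -->
<object data="files/report.pdf" type="application/pdf" width="800" height="600">
  <p>
    Your browser can't display this PDF.
    <a href="files/report.pdf">Download it instead</a>.
  </p>
</object>
```

Either option keeps the PDF's formatting untouched, because nothing is converted.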

Troubleshooting HTML-PDF Conversion Problems

How do I fix HTML in a PDF after conversion?

Once you’ve converted an HTML file to a PDF, there’s no actual HTML left in the document—it’s now a fixed format. That means you can’t edit the original HTML structure anymore. The easiest way to fix formatting issues after conversion is to first convert the PDF to a Word document. Microsoft Word gives you more control over spacing, fonts, and alignment, making it much easier to clean up the layout. Once everything looks right, you can save it as a PDF again.

Why are there errors when converting a PDF to HTML?

PDFs use a fixed structure, while HTML is flexible. Some layouts don’t transfer well unless you manually recreate the document using properly structured HTML.

My PDF to HTML conversion looks completely distorted! What should I do?

Check if the original PDF is text-based or an image. If it’s an image, you may need OCR (Optical Character Recognition) to extract the text correctly. Also, try converting a single page first to see where things go wrong. If the HTML output is a mess, breaking it down into sections and styling them separately can help. Remember that converting a PDF to HTML is never going to give perfect results without a lot of manual editing of the HTML.

How can I convert a PDF to HTML while keeping the formatting?

No PDF to HTML conversion is going to give you a file with identical formatting. The only way to get close to the original formatting is to extensively edit the HTML source code, along with the CSS file if the conversion creates one.

I'm having issues converting PDF to HTML with Calibre—how can I fix this?

Calibre struggles with complex layouts. Try experimenting with different output settings, like tweaking the heuristic processing options. You’ll never be able to get the layout exactly right, but you might get it looking better.

How do I save a webpage as a PDF without losing formatting?

Use your browser’s “Save as PDF” feature in the print dialog. This won’t give perfect results, especially on complex pages, but you can always tidy up the formatting by converting the PDF to Word. In Word you can adjust the font, font size, headings, spacing, and margins before saving the document as a PDF again.

Getting perfect results with HTML and PDF conversions isn’t possible without some tweaks. But once you know what to expect, and how to fix things, the process can be less frustrating.

David Beníček, Product & Engineering Manager @Smallpdf