In this blog post we compare some of the PDF solutions we use in our applications. The trigger was a project that underwent a major update, which included replacing our PDF library. We decided to switch from wkhtmltopdf to AthenaPDF for several reasons, including inconsistent rendering of PDF files on different servers, but more on that later.

wkhtmltopdf

wkhtmltopdf is a command line tool that converts HTML to a PDF file. The first version appeared more than 10 years ago, and since then wkhtmltopdf has become a widely used solution for converting HTML to PDF.

 

Pros

  • Directly accepts html strings, html files or urls
  • Option to include header and footer
  • Some basic options like generating a "table of contents" based on h-tags.

Cons

  • Requires the library to be installed on the server
  • Outdated: not all newer HTML5 and CSS features are well supported
  • Inconsistent rendering: some files were rendered differently locally vs. on staging vs. on production
  • Many users run into issues with this library, as the more than 1,000 open issues in its Git repository show (https://github.com/wkhtmltopdf/wkhtmltopdf/issues?page=1&q=is%3Aissue+is%3Aopen)

Setup

To use wkhtmltopdf, the binary must be installed on your local system and on every server where you want to generate PDFs. (https://wkhtmltopdf.org/downloads.html)

After that, you can use a PHP wrapper to call the binary from your code:

composer require mikehaertl/phpwkhtmltopdf

How to use

In its simplest form, we can convert an html file to a PDF in the following way:

use mikehaertl\wkhtmlto\Pdf;

$pdf = new Pdf('/path/to/page.html');
$pdf->saveAs('/path/to/page.pdf');

 

However, we can also pass additional options, such as the location of the binary, landscape orientation or UTF-8 encoding, and add the HTML pages separately:

 

    use mikehaertl\wkhtmlto\Pdf;


    $pdf = new Pdf([
      'ignoreWarnings' => TRUE,
      'binary' => '/usr/local/bin/wkhtmltopdf',
    ]);
    $pdf->getCommand()->useExec = TRUE;
    $pdf->addPage(
      '<html><head lang="en">
            <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
        <title>My wkhtmltopdf test</title>
    </head><body>' . $myHtml . '</body></html>'
    );
    $pdf->getCommand()->setArgs('-O landscape --encoding utf-8');
    $filename = strtotime('now') . '.pdf';
    $path = 'pdf/' . $filename;
    $pdf->saveAs($path);
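
The same options can also be passed as an array, and it is worth checking the result of saveAs(), since the wrapper returns false rather than throwing when wkhtmltopdf fails. Below is a minimal sketch of that idea, not our production code. The helper names buildPdfOptions() and renderHtmlToPdf() are made up for this example, and it assumes the wrapper forwards the 'orientation' and 'encoding' keys to the wkhtmltopdf options of the same name.

```php
<?php
use mikehaertl\wkhtmlto\Pdf;

/**
 * Build the option array for the wrapper (hypothetical helper).
 * 'binary' and 'ignoreWarnings' are wrapper settings; the other keys are
 * assumed to be passed on as --encoding and --orientation.
 */
function buildPdfOptions(string $binary, bool $landscape): array
{
    return [
        'binary' => $binary,
        'ignoreWarnings' => true,
        'encoding' => 'UTF-8',
        'orientation' => $landscape ? 'Landscape' : 'Portrait',
    ];
}

/**
 * Render a complete HTML document (including the <html> tag) to $path
 * and throw when wkhtmltopdf reports an error (hypothetical helper).
 */
function renderHtmlToPdf(string $html, string $path, array $options): void
{
    $pdf = new Pdf($options);
    $pdf->addPage($html);

    // saveAs() returns false on failure; getError() holds the reason
    if (!$pdf->saveAs($path)) {
        throw new RuntimeException('PDF generation failed: ' . $pdf->getError());
    }
}
```

Usage could look like renderHtmlToPdf($html, 'pdf/' . time() . '.pdf', buildPdfOptions('/usr/local/bin/wkhtmltopdf', true));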

AthenaPDF

Athena is an open source, dockerized PDF solution that is fairly simple to set up in projects that already use Docker. You can convert HTML to PDF either from the command line or by calling a service with a URL.

Pros

  • Docker powered
  • Supports all necessary html elements
  • Quick to use

Cons

  • Not very much freedom (you can't define custom headers etc. like you can with TCPDF)

Setup

To get started, add the Docker container to your project. In this case, we chose the service that you can call from PHP. Choose a WEAVER_AUTH_KEY and make sure the same key is also available in your PHP container.

#docker-compose.yml
athenapdf:
  image: arachnysdocker/athenapdf-service # important: use the microservice image, not the CLI image
  environment:
      - WEAVER_AUTH_KEY=${WEAVER_AUTH_KEY}
  ports:
    - '8080:8080'
  labels:
    - "traefik.enable=false"
  networks:
    - default
php:
  image: <your-php-image>
  environment:
      - WEAVER_AUTH_KEY=${WEAVER_AUTH_KEY}

How to use

Usage is very simple: you call the PDF service using the name of its container as the host name:

 http://athenapdf:8080/convert?auth={$weaverKey}&url=http://www.google.com

You must pass both your weaver key and the URL of the page you want to convert. This can be an external or an internal URL. If you are working entirely with Docker, you can address an internal route through the www container:

http://www/your_route

If your HTML is not directly available via a URL, one option is to provide a general "serve-html" endpoint that builds the appropriate HTML based on the parameters it receives. This is useful, for example, when you want to export an overview to PDF without the rest of the page (header, footer, ...).
Another option is to POST your HTML to the Athena endpoint and append "&ext=html" after your weaver key.

Once you have built the request, you can store the response (the generated PDF file) on your server with the copy function. For example:

// Double quotes are required so that {$weaverKey} is interpolated
copy("http://athenapdf:8080/convert?auth={$weaverKey}&url=" . urlencode('http://www.google.com'), $locationWhereYouWantToSaveThePDF);

Then you can access and download this file from that location.
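
Because the target URL is itself a query parameter, it is safer to build the request with http_build_query() and to check that a PDF actually came back before saving it. The following is a minimal sketch under those assumptions. The function names buildAthenaUrl() and downloadPdf() are invented for this example and are not part of Athena.

```php
<?php
/**
 * Build the convert URL for the Athena service (hypothetical helper).
 * http_build_query() URL-encodes both the key and the target URL.
 */
function buildAthenaUrl(string $serviceBase, string $weaverKey, string $targetUrl): string
{
    $query = http_build_query(['auth' => $weaverKey, 'url' => $targetUrl], '', '&');

    return rtrim($serviceBase, '/') . '/convert?' . $query;
}

/**
 * Fetch the generated PDF and store it at $destination (hypothetical helper).
 */
function downloadPdf(string $convertUrl, string $destination): void
{
    // file_get_contents() returns false when the request fails
    $pdf = @file_get_contents($convertUrl);

    // Every PDF file starts with "%PDF-", so anything else is an error page
    if ($pdf === false || strncmp($pdf, '%PDF-', 5) !== 0) {
        // The URL is left out of the message because it contains the auth key
        throw new RuntimeException('The PDF service did not return a PDF.');
    }

    if (file_put_contents($destination, $pdf) === false) {
        throw new RuntimeException('Could not write ' . $destination);
    }
}
```

With the setup above, a call could look like downloadPdf(buildAthenaUrl('http://athenapdf:8080', $weaverKey, 'http://www/your_route'), $locationWhereYouWantToSaveThePDF);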

 

Things to keep in mind

  • AthenaPDF is smart about tables, but in some cases this leads to an unwanted layout.
    Athena recognizes table headers, and when a table runs across multiple pages it automatically repeats the headers at the top of each page. This can be convenient, but it is not always desirable. To avoid this behavior, replace the th elements with td elements.
  • There is no built-in option to print the PDF in landscape format, but this is easily handled via CSS:
    @media print{
      @page {
        size: landscape;
      }
    }

TCPDF

If you need an extensive PHP library, then TCPDF is the tool for you. You can also combine it with FPDI to merge PDF files.

Where previous tools send you back the generated PDF, here you have everything in your hands: custom header or footer, extra fields, images, ... Whatever you want, (almost) everything is possible.

For most frameworks, there are already packages built around TCPDF, think of QipsiusTCPDFBundle for Symfony or elibyy/tcpdf-laravel for Laravel. However, TCPDF is also perfectly usable without these wrappers.

It suffices to execute "composer require tecnickcom/tcpdf". After that you can extend the base class and override the parts you need, such as the header and footer.

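Pros

  • Very extensive: custom headers and footers, extra fields, images, ...
  • Can be combined with FPDI to merge PDF files
  • Wrapper packages exist for most frameworks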

Cons

  • Requires some knowledge to generate PDFs correctly
  • The current version is no longer maintained, as stated on its Git page. A new version is being worked on, though (https://github.com/tecnickcom/tc-lib-pdf)

Setup

As mentioned earlier, it is sufficient to execute a simple composer require:

composer require tecnickcom/tcpdf

How to use

How you will use TCPDF depends greatly on the purpose and complexity of your PDF generation.
A simple example of how to use this library is as follows:

// Create
$myFile = new \TCPDF();
$myFile->SetAuthor('John Doe');
// Set content
$myFile->AddPage();
$content = '<div>This is a test</div>';
$myFile->writeHTML($content);
// Save (recent TCPDF versions may reject a relative path for 'F', so use an absolute one)
$myFile->Output(__DIR__ . '/my_first_pdf_file.pdf', 'F');

But as mentioned earlier, we use this library for more complex cases, because the above example can be rendered just as easily with wkhtmltopdf or AthenaPDF.

Since the possibilities of TCPDF are very extensive, we only mention a few examples that were important for our application.

To start with, we extended the TCPDF class and adapted it to our needs. This allowed us to use, among others:

  1. using the FpdiTrait
  2. adding custom headers and footers
  3. adding custom fields

You'll read more about these below.


Extending the class:

final class ContractPdf extends \TCPDF

1) FpdiTrait to merge files

One functionality that was required in the application we built was to merge a self-composed PDF with uploaded attachments.
There are a few possible ways to handle this, but we used the method below.

First we included the FpdiTrait in our custom TCPDF class in order to make use of the necessary merge functionalities.

use FpdiTrait;

Next comes the actual merge logic.

We initialize a TCPDF object and add the base contract to it.
Then we add all the desired attachments.
Finally, the merged PDF can be saved on the server.

// Initialize TCPDF (which uses the FpdiTrait)
$this->file = new ContractPdf();
// Add our custom created contract PDF file
$this->addFile('contract.pdf');
// Add all attachments
$attachments = ['attachment1.pdf', 'attachment2.pdf'];
foreach ($attachments as $attachment) {
    $this->addFile($attachment);
}
// Save our merged PDF file
$this->file->Output('mergedPDF.pdf', 'F');

The addFile function called above is constructed as follows:

public function addFile($filename)
{
    $pageCount = $this->file->setSourceFile($filename);
    $i = 1;
    // Loop over all the pages in the given PDF
    while ($i <= $pageCount) {
        $pageId = $this->file->importPage($i, PageBoundaries::MEDIA_BOX);
        $this->file->AddPage();
        // Import the page in our final file
        $this->file->useImportedPage($pageId);
        $i++;
    }
}
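
Note that AddPage() without arguments creates a page in the default format, so attachments with a different size or orientation may end up cropped or misplaced. A possible variation, sketched below, reads the size of each imported page first. It assumes the setasign\Fpdi\Tcpdf\Fpdi class from the setasign/fpdi package instead of our own ContractPdf class, and the function name appendPdf() is invented for this example.

```php
<?php
use setasign\Fpdi\Tcpdf\Fpdi;

/**
 * Append every page of $filename to $pdf, keeping each page's own size
 * and orientation (hypothetical helper). Returns the number of pages added.
 */
function appendPdf(Fpdi $pdf, string $filename): int
{
    $pageCount = $pdf->setSourceFile($filename);

    for ($pageNo = 1; $pageNo <= $pageCount; $pageNo++) {
        $pageId = $pdf->importPage($pageNo);

        // getTemplateSize() reports the width, height and orientation of the imported page
        $size = $pdf->getTemplateSize($pageId);
        $pdf->AddPage($size['orientation'], [$size['width'], $size['height']]);

        // Place the imported page on the page we just added
        $pdf->useImportedPage($pageId);
    }

    return $pageCount;
}
```

The loop is otherwise the same as in addFile above; only the arguments passed to AddPage() differ.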

Important note:

The standard FPDI parser can only handle PDFs up to and including version 1.4. This is because PDF 1.5 introduced compressed cross-reference streams and object streams, which the free parser cannot read. You can read more about that here: https://www.setasign.com/products/fpdi-pdf-parser/details/

Setasign offers a commercial add-on to merge PDFs with higher versions.
However, it is also possible to convert the PDFs to version 1.4 before merging. This conversion can be done with Ghostscript, for example.
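
As an illustration of that conversion step, here is a minimal sketch that looks at the "%PDF-x.y" header of a file and, if needed, rewrites it with Ghostscript's pdfwrite device at compatibility level 1.4. It assumes the gs binary is on the PATH, and the function names are invented for this example. Keep in mind that the header is only a hint: a file can declare a different version elsewhere, so treat this as a heuristic rather than a guarantee.

```php
<?php
/**
 * Read the version from the "%PDF-x.y" header at the start of the file.
 * Returns null when the file cannot be read or has no such header.
 */
function pdfHeaderVersion(string $path): ?string
{
    $head = @file_get_contents($path, false, null, 0, 16);
    if ($head !== false && preg_match('/^%PDF-(\d\.\d)/', $head, $matches) === 1) {
        return $matches[1];
    }

    return null;
}

/**
 * Build the Ghostscript command that rewrites $input as a PDF 1.4 file.
 */
function ghostscriptDowngradeCommand(string $input, string $output): string
{
    return 'gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dNOPAUSE -dQUIET -dBATCH'
        . ' -sOutputFile=' . escapeshellarg($output)
        . ' ' . escapeshellarg($input);
}

/**
 * Return the path of a file that should be readable by the free FPDI parser:
 * $input itself when its header says 1.4 or lower, otherwise a converted copy.
 */
function ensurePdf14(string $input, string $output): string
{
    $version = pdfHeaderVersion($input);
    if ($version !== null && version_compare($version, '1.4', '<=')) {
        return $input;
    }

    exec(ghostscriptDowngradeCommand($input, $output), $lines, $exitCode);
    if ($exitCode !== 0) {
        throw new RuntimeException('Ghostscript failed with exit code ' . $exitCode);
    }

    return $output;
}
```

In the merge flow above, each attachment could be passed through ensurePdf14() before it is handed to addFile().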


2) Custom headers and footers

The Header and Footer functions can be overridden in your custom extended class so that you can render them the way you need to. For us it was important to be able to specify through options whether the header should be printed on every page or only on the first page. In addition, the alignment of the logo (left/right) had to be determined dynamically.

public function Header() {
    // Only print header if it may be printed on each page, or if we are on the first page
    // If on each page, only on contract pages itself, not on attachment pages
    $printHeader = ($this->headerData['header_on_each'] && $this->page <= $this->headerData['contract_num_pages']) || $this->page === 1;
    if($printHeader) {
        $logoWidth = 50;
        $logoHeight = 50;
        // Check which type of alignment is chosen and place header data accordingly
        switch ($this->headerData['align']) {
            case 'textleft':
                $this->writeHTMLCell(100,50, 10,10, $this->headerData['text']);
                $this->Image($this->headerData['image'], 150, 10, $logoWidth, $logoHeight, $this->headerData['image_extension'], ...);
                break;
            case 'textright':
                $this->writeHTMLCell(100,50, 95,10, $this->headerData['text']);
                $this->Image($this->headerData['image'], 10, 10, $logoWidth, $logoHeight, $this->headerData['image_extension'], ...);
                break;
        }
    }
}
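
The condition at the top of Header() is easier to follow in isolation. The sketch below extracts it into a standalone function with the same logic; the function name and parameter names are made up for this example and mirror the headerData keys used above.

```php
<?php
/**
 * Decide whether the header is printed on the given page.
 * Page 1 always gets a header. Later pages only get one when the option
 * is enabled and the page still belongs to the contract, not to an attachment.
 */
function shouldPrintHeader(int $page, bool $headerOnEach, int $contractNumPages): bool
{
    return ($headerOnEach && $page <= $contractNumPages) || $page === 1;
}
```

For a three-page contract followed by attachments, pages 1 to 3 get a header when the option is on, page 4 (the first attachment page) never does, and with the option off only page 1 does.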

3) Adding custom fields

In addition, there are so many more possibilities with TCPDF, including automatically adding "table of content" pages, providing background images for pages, building in protection, digital signature certification and so much more.

Conclusion

Depending on how you want to generate PDFs, some solutions lend themselves better than others. 

Even though wkhtmltopdf was one of the more popular converters, I would not recommend it because of its outdated code and inconsistent rendering. Several alternatives are available by now, AthenaPDF among them, and while researching this article I also came across tools such as Puppeteer.

However, if you want to build more complex PDFs and need a lot of customization, then a solution like TCPDF is ideal.

Author: Sarah Jehin
PHP developer