Jachim Coudenys’ talk really delivered on its title. He started by laying out some of the groundwork. An important aspect is that PHP-FPM uses a master process which spawns child processes. These child processes can all access the master’s shared memory.

Then the first part: the realpath cache. This cache is filled whenever PHP resolves a path. PHP splits the path on the directory separator (‘/’) and caches every intermediate path as well as the final result.

$presentation = file_get_contents( __DIR__ . '/../ffi.php' );
$presentation2 = file_get_contents( dirname(__DIR__) . '/ffi.php' );
$filesize = filesize( __DIR__ . '/../est.txt' );
print_r(realpath_cache_get());
[/home/jachim/demo] => Array
(
    [key] => 1.6354972010384E+19
    [is_dir] => 1
    [realpath] => /home/jachim/demo
    [expires] => 1579859105
)
[/home] => Array
(
    [key] => 4353355791257440477
    [is_dir] => 1
    [realpath] => /home
    [expires] => 1579859105
)
[/home/jachim] => Array
(
    [key] => 5522554812971572568
    [is_dir] => 1
    [realpath] => /home/jachim
    [expires] => 1579859105
)
[/home/jachim/demo/../ffi.php] => Array
(
    [key] => 1.6164035761241E+19
    [is_dir] =>
    [realpath] => /home/jachim/ffi.php
    [expires] => 1579859105
)
[/home/jachim/ffi.php] => Array
(
    [key] => 5100116734180765326
    [is_dir] =>
    [realpath] => /home/jachim/ffi.php
    [expires] => 1579859105
)
[/home/jachim/demo/realpath.php] => Array
(
    [key] => 1.8190176096283E+19
    [is_dir] =>
    [realpath] => /home/jachim/demo/realpath.php
    [expires] => 1579859105
)

Obviously, subsequent calls resolve faster, but this mechanism doesn’t use shared memory, so every process builds its own cache. A performance boost can be achieved by tweaking the cache size (realpath_cache_size) and the entries’ TTL (realpath_cache_ttl).
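To see whether those two settings need tweaking, you can compare what the current process actually uses with the configured limits. The snippet below is a minimal sketch that only uses built-in functions; the numbers it prints depend entirely on your own php.ini.

```php
<?php
// Touch a path so this process has something to resolve.
file_get_contents(__FILE__);

// Bytes of realpath cache currently used by this process.
$usedBytes = realpath_cache_size();

// Configured limits, returned as the raw php.ini strings (e.g. "4096K" and "120").
$limit = ini_get('realpath_cache_size');
$ttl   = ini_get('realpath_cache_ttl');

printf(
    "realpath cache: %d bytes used, limit %s, TTL %s s, %d entries\n",
    $usedBytes,
    $limit,
    $ttl,
    count(realpath_cache_get())
);
```

If the used size keeps bumping against the limit, raising realpath_cache_size is probably worth a try.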

Next up, OPCache. Each time PHP handles a request, it compiles the code into opcodes (short for operation codes). A cool tool Jachim showed was the Vulcan Logic Dumper, which visualizes these opcodes. Since these opcodes never change (in production), it makes sense to cache them. There were several tools available, but it was Zend Optimizer+, donated by Zend, that was included in PHP 5.5 as OPCache, and it has become better with each release. This cache lives in shared memory and is thus used by each child of the master FPM process.

A few important aspects:

  • Shared memory: as explained above, the cache is shared between the child processes. It only lives as long as the master process does, so perhaps you should think about some workarounds to keep the master process running, and thus keep the cache warm. There are other OPCache priming tricks as well, like saving the opcodes to a file and loading them into memory when spinning up the master process, FPM pools etc.
  • Wasted memory: OPCache doesn’t do “defragmentation”. This means that if some cache entries get invalidated, the memory is marked as wasted, but it is not released. New entries are always appended. This may lead to suboptimal usage of the available memory.
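The wasted memory can be read from the 'memory_usage' part of opcache_get_status(). The sketch below uses a hypothetical helper, wastedPercentage(), and made-up sample numbers so that it also runs on a machine without OPCache; the last part only kicks in when OPCache is loaded and enabled.

```php
<?php
/**
 * Hypothetical helper: share of the OPCache memory pool that is marked as
 * wasted, in percent. Expects the 'memory_usage' sub-array of
 * opcache_get_status().
 */
function wastedPercentage(array $memoryUsage): float
{
    $total = $memoryUsage['used_memory']
        + $memoryUsage['free_memory']
        + $memoryUsage['wasted_memory'];

    return $total > 0 ? 100.0 * $memoryUsage['wasted_memory'] / $total : 0.0;
}

// Illustrative numbers only: a 128 MB pool of which 16 MB is wasted.
$sample = [
    'used_memory'   => 64 * 1024 * 1024,
    'free_memory'   => 48 * 1024 * 1024,
    'wasted_memory' => 16 * 1024 * 1024,
];
printf("sample: %.1f%% wasted\n", wastedPercentage($sample)); // sample: 12.5% wasted

// Real numbers, if OPCache is loaded and enabled for this SAPI.
if (function_exists('opcache_get_status')) {
    $status = opcache_get_status(false); // false: skip the per-script details
    if ($status !== false) {
        printf("live: %.1f%% wasted\n", wastedPercentage($status['memory_usage']));
    }
}
```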

There are several options to tweak the config: 

  • OPCache is off by default for CLI commands, since caching makes little sense when the process terminates immediately. However, in some cases (think daemons) just turning it on (opcache.enable_cli) will already give you better performance.
  • Another obvious win is to increase the allowed memory usage, but it depends on what the stats (opcache_get_status()) tell you about the current cache usage. 
  • Tweaking when the cache gets invalidated is also an option. Every revalidate_freq seconds, OPCache will check for updated scripts. You can increase this value or disable the validate_timestamps setting altogether, meaning the cache is valid until eternity (and manual invalidation is required). 
  • opcache.max_wasted_percentage is the threshold that determines when a restart should occur to free up (wasted) memory.
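Before changing any of these options, it helps to know what they are currently set to. A minimal sketch, with opcacheSettings() as a hypothetical helper name: it reads the directives mentioned above with ini_get(), which returns false when the OPCache extension is not loaded.

```php
<?php
/**
 * Hypothetical helper: current values of the OPCache directives discussed
 * above, keyed by directive name.
 */
function opcacheSettings(): array
{
    $names = [
        'opcache.enable_cli',
        'opcache.memory_consumption',
        'opcache.validate_timestamps',
        'opcache.revalidate_freq',
        'opcache.max_wasted_percentage',
    ];

    $settings = [];
    foreach ($names as $name) {
        // ini_get() returns false for directives that are not registered,
        // which is the case when OPCache is not loaded.
        $settings[$name] = ini_get($name);
    }

    return $settings;
}

foreach (opcacheSettings() as $name => $value) {
    printf("%-32s %s\n", $name, $value === false ? '(OPCache not loaded)' : $value);
}
```

Changing the values themselves is typically done in php.ini or in the FPM pool configuration, not at runtime.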

Jachim concluded that your OPCache’s hit rate should always be at least 90%, and in most cases you should actually see a hit rate of 99%. The wasted memory should ideally be zero, and you should never have a full cache. A cool visualisation tool is OPcache Status.
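These rules of thumb are easy to check in code. The sketch below assumes the array layout returned by opcache_get_status(); hitRate() and cacheWarnings() are hypothetical helper names, and the sample data is made up so the example runs without OPCache.

```php
<?php
/**
 * Hypothetical helper: hit rate in percent, computed from the
 * 'opcache_statistics' sub-array of opcache_get_status().
 */
function hitRate(array $stats): float
{
    $requests = $stats['hits'] + $stats['misses'];

    return $requests > 0 ? 100.0 * $stats['hits'] / $requests : 0.0;
}

/** Hypothetical helper: the warning signs mentioned in the talk. */
function cacheWarnings(array $status): array
{
    $warnings = [];
    if (hitRate($status['opcache_statistics']) < 90.0) {
        $warnings[] = 'hit rate below 90%';
    }
    if ($status['memory_usage']['wasted_memory'] > 0) {
        $warnings[] = 'wasted memory';
    }
    if ($status['cache_full']) {
        $warnings[] = 'cache full';
    }

    return $warnings;
}

// Illustrative data in the shape of opcache_get_status().
$sample = [
    'cache_full'         => false,
    'memory_usage'       => ['wasted_memory' => 0],
    'opcache_statistics' => ['hits' => 9900, 'misses' => 100],
];

printf("hit rate: %.1f%%\n", hitRate($sample['opcache_statistics'])); // hit rate: 99.0%
print_r(cacheWarnings($sample)); // empty array: nothing to worry about
```

On a live server you would pass the result of opcache_get_status(false) instead of $sample, after checking that it is not false.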

The final part, Preloading: OPCache on steroids. It has been part of OPCache since PHP 7.4, and basically it means that some of your functions and classes can be loaded when PHP starts, so before it accepts any requests. In effect, your functions become part of the PHP engine, just like strlen() for example. There’s one pitfall: your classes should be loaded “in order”, i.e. if your class needs another class, that class should already be known.
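One way to deal with the “in order” pitfall is to sort the files by their dependencies before loading them in the preload script (the file that the opcache.preload setting points to). The sketch below is just an illustration: preloadOrder() is a hypothetical helper, and the file names and dependency map are invented.

```php
<?php
/**
 * Hypothetical helper: order files so that each file comes after the files
 * it depends on. $dependencies maps a file to the files it needs.
 */
function preloadOrder(array $dependencies): array
{
    $ordered  = [];
    $visiting = [];

    $visit = function (string $file) use (&$visit, &$ordered, &$visiting, $dependencies): void {
        if (in_array($file, $ordered, true)) {
            return; // already scheduled
        }
        if (isset($visiting[$file])) {
            throw new LogicException("Circular dependency at $file");
        }
        $visiting[$file] = true;
        foreach ($dependencies[$file] ?? [] as $dependency) {
            $visit($dependency); // dependencies go first
        }
        unset($visiting[$file]);
        $ordered[] = $file;
    };

    foreach (array_keys($dependencies) as $file) {
        $visit($file);
    }

    return $ordered;
}

// Invented project layout, for illustration only.
$files = preloadOrder([
    'src/Controller.php'     => ['src/BaseController.php', 'src/Logger.php'],
    'src/BaseController.php' => ['src/Logger.php'],
    'src/Logger.php'         => [],
]);

print_r($files); // Logger.php, BaseController.php, Controller.php

// In the actual preload script, the files would then be loaded in this order:
// foreach ($files as $file) { require_once __DIR__ . '/' . $file; }
```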

Conclusion

This might seem like a devops matter, and it is indeed very low level, but it sure is interesting to know how this all works. It definitely made our team think again about some of our hosting configurations.

Slides: https://speakerdeck.com/coudenysj/php-opcache-realpath-cache-and-preloading-phpbenelux-conference-2020

Related

This article is part of the PHPBNL20 blogs
