Making Drupal 8 fly

In my travels to talk about Drupal, everyone asks me about Drupal 8's performance and scalability. Modern websites are much more dynamic and interactive than 10 years ago, making it more difficult to build modern sites while also being fast. It made me realize that maybe I should write up a summary of some of the most exciting performance and scalability improvements in Drupal 8. After all, Drupal 8 will leapfrog many of its competitors in terms of how to architect and scale modern web applications. Many of these improvements benefit both small and large websites, but also allow us to build even bigger websites with Drupal.

More precise cache invalidation

One of the strategies we employ in making Drupal fast is "caching". This means we try to generate pages or page elements one time and then store them so future requests for those pages or page elements can be served faster. If an item is already cached, we can simply grab it without going through the building process again (known as a "cache hit"). Drupal stores each cache item in a "cache bin" (a database table, Memcache object, or whatever else is appropriate for the cache backend in use).

In Drupal 7 and before, when one of these cache items changes and it needs to be re-generated and re-stored (the cache gets "invalidated"), you can only delete a specific cache item, clear an entire cache bin, or use prefix-based invalidation. None of these three methods allow you to invalidate all cache items that contain data of, say, user 200. The only method that is going to suffice is clearing the entire cache bin, and this means that usually we invalidate way too much, resulting in poor cache hit ratios and wasted effort rebuilding cache items that haven't actually changed.

This problem is solved in Drupal 8 thanks to the concept of "cache tags": each cache item can have any number of cache tags. A cache tag is a compact string that describes the object being cached. Thanks to this extra metadata, we can now delete all cache items that use the user:200 cache tag, for example. This means we've deleted all the cache items we must delete, but not a single one more: optimal cache invalidation!

Drupal cache tags

Example cache tags for different cache IDs.

And don't worry, we also made sure to expose the cache tags to reverse proxies, so that efficient and accurate invalidation can happen throughout a site's entire delivery architecture.

More precise cache variation

While accurate cache invalidation makes caching more efficient, there is more we did to improve Drupal's caching. We also make sure that cached items are optimally varied. If you vary too much, duplicate cache entries will exist with the exact same content, resulting in inefficient usage of caches (low cache hit ratios). For example, we don't want a piece of content to be cached per user if it is the same for many users. If you vary too little, users might see incorrect content as two different cache entries might collide. In other words, you don't want to vary too much nor too little.

In Drupal 7 and before, it's easy to program any cached item to vary by user, by user role, and/or by page, and could even be configured through the UI for blocks. However, more targeted variations (such as by language, by country, or by content access permissions) were more difficult to program and not typically exposed in a configuration UI.

In Drupal 8, we introduced a Cache Context API to allow developers and site builders to express these variations and to make them automatically available in the configuration UI.

Drupal cache contexts

Server-side dynamic content substitution

Usually a page can be cached almost entirely except for a few dynamic elements. Often a page served to two different authenticated users looks identical except for a small "Welcome $name!" and perhaps their profile picture. In Drupal 7, this small personalization breaks the cacheability of the entire page (or rather, requires a cache context that's way too granular). Most parts of the page, like the header, the footer and certain blocks in the sidebars don't change often nor vary for each user, so why should you regenerate all those parts at every request?

In Drupal 8, thanks to the addition of #post_render_cache, that is no longer the case. Drupal 8 can render the entire page with some placeholder HTML for the name and profile picture. That page can then be cached. When Drupal has to serve that page to an authenticated user, it will retrieve it from the cache, and just before sending the HTML response to the client, it will substitute the placeholders with the dynamically rendered bits. This means we can avoid having to render the page over and over again, which is the expensive part, and only render those bits that need to be generated dynamically!

Client-side dynamic content substitution

Some things that Drupal has been rendering for the better part of a decade, such as the "new" and "updated" markers on comments, have always been rendered on the server. That is not ideal because these markers are different for every visitor and as a result, it makes caching pages with comments difficult.

The just-in-time substitution of placeholders with dynamic elements that #post_render_cache provides us can help address this. In some cases, as is the case with the comment markers, we can even do better and offload more work from the server to the client. In the case for comment markers, a certain comment is posted at a certain time — that doesn't vary per user. By embedding the comment timestamps as metadata in the DOM with a data-comment-timestamp="1424286665" attribute, we enable client-side JavaScript to render the comment markers, by fetching (and caching on the client side) the “last read" timestamp for the current user and simply comparing these numbers. Drupal 8 provides some framework code and API to make this easy.

A "Facebook BigPipe" render pipeline

With Drupal 8, we're very close to taking the client-side dynamic content substitution a step further, just like some of the world's largest dynamic websites do. Facebook has 1.35 billion monthly active users all requesting dynamic content, so why not learn from them?

The traditional page serving model has not kept up with the increase of highly personalized websites where different content is served to different users. In the traditional model, such as Drupal 7, the entire page is generated before it is sent to the browser: while Drupal is generating a page, the browser is idle and wasting its cycles doing nothing. When Drupal finishes generating the page and sends it to the browser, the browser kicks into action, and the web server is idle. In the case of Facebook, they use BigPipe. BigPipe delivers pages asynchronously instead; it parallelizes browser rendering and server processing. Instead of waiting for the entire page to be generated, BigPipe immediately sends a page skeleton to the the client so it can start rendering that. Then the remaining content elements are requested and injected into their correct place. From the user's perspective the page is rendered progressively. The initial page content becomes visible much earlier, which improves the perceived speed of the site.

We've made significant improvements to the way Drupal 8 renders pages (presentation). By default, Drupal 8 core still implements the traditional model of assembling these pieces into a complete page in a single server-side request, but the independence of each piece and the architecture of the new rendering pipeline enable different “render strategies" to be experimented with — different methods for dynamic content assembly, such as BigPipe, Edge Side Includes, or other ideas for making the most optimal use of client, server, content delivery networks and reverse proxies. In all those examples, the idea is that we can send the primary content first so the client can start rendering that. Then we send the remaining Drupal blocks, such as the navigation menu or a 'Related articles' block, and have the browser, content delivery network or reverse proxy assemble or combine these blocks into a page.

Drupal render pipeline

A snapshot of the Drupal 8 render pipeline diagram that highlights where alternative render strategies can be implemented.

Some early experiments by Wim Leers in Acquia's OCTO show that we can improve performance by a factor of about 2 compared to a recent Drupal 8 development snapshot. These breakthroughs are enabled by leveraging the various improvements we made to Drupal 8.

And much more

But that is not all. The Drupal community has actually done much more, including: complete asset dependency information (which allowed us to ensure zero JavaScript is loaded by default for anonymous users and send less data on AJAX requests), pluggable CSS/JS aggregation and minification (to support more optimal optimization algorithms), and more. We've also made sure Drupal 8 is fast by default, by having better defaults: CSS/JS aggregation enabled, JS assets being loaded from the bottom, block caching enabled, and so on.

All in all, there is a lot to look forward to in Drupal 8!

Special thanks to Acquia's Wim Leers, Alex Bronstein and Angie Byron for their contributions to this blog post.

Comments

Roger Nyman (not verified):

Thanks Dries. Very interesting stuff about caching in D8. The links provided are also really good resources.

February 19, 2015
Cameron Tod (not verified):

Wow - fantastic work going on in D8 as always. I'm excited to see what happens with SmartCache!

February 19, 2015
Niels van Mourik (not verified):

Great write-up, so nice to see all of the huge improvements in one overview! Also wanted to point out that we're well underway in supporting the base platform for external cache invalidation with the Purge module for Drupal 8, this allows us to also invalidate by tags on Varnish and other reverse caching systems. See https://github.com/nielsvm/purge for more info!

February 20, 2015
Fabianx (not verified):

Great post!

If you want to try BigPipe on Drupal 7 today, you can test the prototype implementation in the http://drupal.org/project/render_cache module called render_cache_big_pipe.

Render_cache itself shares a lot of the code with Drupal 8 and there was a lot of ideas flowing back and forth between Drupal 8's caching systems and the render_cache module.

We (Wim, me, Marco) presented this work and how to leverage e.g. cache contexts in Drupal 7 here for Varnish for example:

https://amsterdam2014.drupal.org/session/render-caching-drupal-7-and-8

This also has a short demo of the Big Pipe implementation and shows how placeholders could be dynamically generated based on those cache contexts.

While it is possible to do a lot of things in Drupal 7 already now, many things are so much easier in Drupal 8 thanks to its clean architecture, its well designed APIs and its overall service based infrastructure.

Come join us for the end stretch here:

https://www.drupal.org/node/2429287

and share your comments on the Google Doc here:

https://docs.google.com/a/tag1consulting.com/document/d/1Gw7ohBOUKu38t4k...

February 20, 2015
Ashwini Kumar (not verified):

Informative and beautiful writeup, Dries!

February 20, 2015
Matt Harrold (not verified):

I've fed my family and kept my clients happy for the last 5 years thanks to your leadership (and the Drupal communities) work on Drupal. I look forward to reinventing each website (currently in Drupal 7), using the possibilities presented in Drupal 8. Well done.

February 21, 2015
EdgarPE (not verified):

This is great news, I am super happy :)

I think there should be a "Minimum age" setting besides "Maximum age" setting. If a site is very popular it can cache a content for a short time (30 sec, or 1 min) even if that content is alredy outdated.

It helps on traffic spikes too, and it is also very useful, if for some reason some content is very hard to generate.

February 21, 2015
decibel.places (not verified):

One way to speed up loading: minimize the amount of content, blocks, menus etc. Google rewards pages in search rankings that just serve the content that's expected - sort of like a minimal mobile responsive look. Do you really need that sign up block for the email list, or your Twitter history on every page? Probably not.

February 25, 2015
Wim Leers (not verified):

Some early experiments by Wim Leers in Acquia's OCTO show that we can improve performance by a factor of about 2 compared to a recent Drupal 8 development snapshot.

That is no longer an experiment: it ships with Drupal 8, see http://wimleers.com/article/drupal-8-dynamic-page-cache.

October 13, 2015
Daniel Kanchev (not verified):

Hi Dries and thanks for the post. I have a question which is related to the cache storage in Drupal 8. By default the cached data is stored in the database which makes sense is most situations. Is there an easy way to change this in order to store the cached data separately. I am interested in the following cache storage options:

- Regular cache files on the server disks
- tmpfs
- Redis/Memcached

I am asking because I'm working on a Drupal site that uses many servers that are part of a single infrastructure and I would like to explore the different options. For example, if the MySQL database is hosted on 3 separate servers in different locations I don't want to generate traffic between the web servers and the database servers just to obtain cached data.

November 23, 2015
Wim Leers (not verified):

When Drupal finishes generating the page and sends it to the browser, the browser kicks into action, and the web server is idle. In the case of Facebook, they use BigPipe. BigPipe delivers pages asynchronously instead; it parallelizes browser rendering and server processing. Instead of waiting for the entire page to be generated, BigPipe immediately sends a page skeleton to the the client so it can start rendering that. Then the remaining content elements are requested and injected into their correct place. From the user's perspective the page is rendered progressively. The initial page content becomes visible much earlier, which improves the perceived speed of the site.

That is no longer a theoretical possibility: you can actually use BigPipe today with Drupal 8, see http://wimleers.com/blog/drupal-8-bigpipe-module-ready.

November 24, 2015

Pages

Add new comment

Updates from Dries straight to your mailbox