Drupal 5: performance

With the release of Drupal 5, you might be wondering which version of Drupal is faster -- the latest release in the Drupal 4 series, or the new Drupal 5?

Experimental setup

I set up a Drupal 4.7 site with 2,000 users, 5,000 nodes, 5,000 path aliases, 10,000 comments and 250 vocabulary terms spread over 15 vocabularies.

Next, I configured the main page to show 10 nodes, enabled some blocks in both the left and the right sidebar, set up some primary links, and added a search function at the top of the page. I also set up a contact page using Drupal's contact module. The image below depicts how my final main page was configured.

A Drupal 4.7 main page

I then made an exact copy of the Drupal 4.7 site and upgraded it to the latest Drupal 5 release. The result is two identical websites: one running Drupal 4.7 and one running Drupal 5.

Benchmarks were conducted on a three-year-old 3 GHz Pentium 4 with 2 GB of RAM running Gentoo Linux. I used a single-tier web architecture with the following software: Apache 2.0.58, PHP 5.1.6 with APC, and MySQL 5.0.26. No special configuration or tweaking was done other than what was strictly necessary to get things up and running. My setup was CPU-bound, not I/O-bound or memory-bound.

Apache's ab2 with 20 concurrent clients was used to compute how many requests per second the above setup was capable of serving.
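To make "requests per second with 20 concurrent clients" concrete, here is a minimal Python sketch of that kind of measurement. It is not ab and not the harness used for these benchmarks. The names requests_per_second and fake_page_request are hypothetical, and the fake request simply sleeps instead of fetching a real page.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def requests_per_second(fetch, total_requests, concurrency):
    """Call fetch() total_requests times using `concurrency` worker threads
    and return the observed throughput in requests per second."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # Queue every request up front; at most `concurrency` run at once.
        futures = [pool.submit(fetch) for _ in range(total_requests)]
        for future in futures:
            future.result()  # re-raises any exception from a worker
    elapsed = time.perf_counter() - start
    return total_requests / elapsed


def fake_page_request():
    # Hypothetical stand-in for an HTTP GET against the site under test.
    time.sleep(0.01)


if __name__ == "__main__":
    rate = requests_per_second(fake_page_request, total_requests=200, concurrency=20)
    print(f"{rate:.0f} requests/second")
```

Swapping fake_page_request for a function that actually fetches a page would turn this into a crude load generator, but a dedicated tool like ab remains the better choice for real measurements.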

Drupal page caching

Drupal has a page cache mechanism that stores dynamically generated web pages in the database. By caching a web page, Drupal does not have to create the page each time it is requested. Only pages requested by anonymous visitors (users that have not logged on) are cached. Once users have logged on, caching is disabled for them since the pages are personalized in various ways. On some websites, like this weblog, everyone but me is an anonymous visitor, while on other websites there might be a good mix of both anonymous and authenticated visitors.
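The rule described above (cache for anonymous visitors, regenerate for everyone who is logged on) can be sketched in a few lines of Python. This is a toy model, not Drupal's code: the PageCache class, its serve method and the in-memory dictionary standing in for the database cache table are all assumptions made for illustration.

```python
class PageCache:
    """Toy page cache: only anonymous requests are served from the cache."""

    def __init__(self, render):
        self.render = render  # function(path, user) -> HTML string
        self.store = {}       # stands in for the database cache table
        self.hits = 0
        self.misses = 0

    def serve(self, path, user=None):
        if user is not None:
            # Logged-on user: the page is personalized, so always regenerate.
            return self.render(path, user)
        if path in self.store:
            self.hits += 1
            return self.store[path]
        # Anonymous cache miss: generate the page once and remember it.
        self.misses += 1
        page = self.render(path, None)
        self.store[path] = page
        return page


if __name__ == "__main__":
    cache = PageCache(lambda path, user: f"<html>{path} for {user or 'anonymous'}</html>")
    cache.serve("/node/1")           # miss: page is generated and stored
    cache.serve("/node/1")           # hit: served from the cache
    cache.serve("/node/1", "dries")  # authenticated: bypasses the cache
    print(cache.hits, cache.misses)
```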

When presenting the benchmark results, I'll make a distinction between cached pages and non-cached pages. This will allow you to interpret the results with the dynamics of your own Drupal websites in mind.

Furthermore, a Drupal 5 installation has two caching modes: normal database caching and aggressive database caching. The normal database cache is suitable for all websites and does not cause any side effects. The aggressive database cache causes Drupal to skip the loading (init-hook) and unloading (exit-hook) of enabled modules when serving a cached page. This results in an additional performance boost but can cause unwanted side effects if you skip loading modules that shouldn't be skipped.
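The difference between the two modes can be modeled as follows. This Python sketch only illustrates the behavior described above and does not mirror Drupal's bootstrap code. CacheMode, Module, serve_cached_page and the page_counter module are hypothetical names.

```python
from enum import Enum


class CacheMode(Enum):
    NORMAL = "normal"
    AGGRESSIVE = "aggressive"


class Module:
    """Hypothetical module whose init and exit hooks record that they ran."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def init(self):
        self.log.append(f"{self.name}:init")

    def exit(self):
        self.log.append(f"{self.name}:exit")


def serve_cached_page(cached_page, modules, mode):
    """Return a cached page; module hooks run only in normal mode."""
    if mode is CacheMode.NORMAL:
        for module in modules:
            module.init()
    response = cached_page
    if mode is CacheMode.NORMAL:
        for module in modules:
            module.exit()
    return response


if __name__ == "__main__":
    log = []
    modules = [Module("page_counter", log)]
    serve_cached_page("<html>cached</html>", modules, CacheMode.NORMAL)
    serve_cached_page("<html>cached</html>", modules, CacheMode.AGGRESSIVE)
    print(log)  # only the normal-mode request ran the hooks
```

In this model, a module that counts page views in its exit hook would silently miss every cached page served in aggressive mode. That is the kind of side effect to watch out for.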

Through contributed modules, Drupal also supports file caching which should outperform aggressive database caching. I have not looked at file caching or any other caching strategies that are made available through contributed modules.

Results

Drupal performance

The number of pages Drupal can serve per second. Higher bars are better.

The figure above shows that generating a page in Drupal 5 is 3% slower than in Drupal 4.7. However, when serving a cached page using the normal database cache, Drupal 5 is 73% faster than Drupal 4.7, and 268% faster when the aggressive database cache is used.

What does this mean when looking at the overall performance of a Drupal 5 website? Well, the effectiveness of Drupal's page cache depends on a number of parameters like your cache expiration time, the number of authenticated users, access patterns, etc. To emulate different Drupal configurations, we modified Drupal 4.7 and Drupal 5 so we could look at performance for a range of page cache miss rates.

Drupal performance

The relative performance improvement of Drupal 5's normal database caching compared to Drupal 4.7's database caching. A miss rate of 0% means that all page requests result in a cache hit and that all pages can be served from the database cache. A miss rate of 100% means that all page requests result in a cache miss, and that we had to dynamically generate all pages.

The figure above shows the relative performance improvement of Drupal 5 compared to Drupal 4.7. We observe that Drupal sites with relatively few cache misses (typically static Drupal websites accessed by anonymous users) will be significantly faster with Drupal 5. However, Drupal sites where more than 1 out of 2 page requests results in a cache miss (typically dynamic Drupal websites with a lot of authenticated users) will be slightly slower compared to an identical Drupal 4.7 website.
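One way to reason about the shape of such a curve is a simple blended-throughput model: the average time per request is the miss rate times the time to generate a page, plus the hit rate times the time to serve a cached page. The Python sketch below is an assumption-laden illustration, not how the measurements were made. It ignores the cost of writing a page to the cache on a miss. The absolute rates (10 and 100 requests per second) are invented. Only the ratios come from the results above, reading "3% slower" as 0.97 times the throughput and "73% faster" as 1.73 times the throughput.

```python
def blended_throughput(miss_rate, uncached_rps, cached_rps):
    """Requests/second when a fraction `miss_rate` of requests is a cache miss."""
    seconds_per_request = miss_rate / uncached_rps + (1.0 - miss_rate) / cached_rps
    return 1.0 / seconds_per_request


def relative_improvement(miss_rate, old, new):
    """Fractional throughput gain of `new` over `old`.

    `old` and `new` are (uncached_rps, cached_rps) pairs.
    """
    old_rps = blended_throughput(miss_rate, *old)
    new_rps = blended_throughput(miss_rate, *new)
    return new_rps / old_rps - 1.0


# Invented absolute rates; only the 0.97 and 1.73 ratios come from the post.
DRUPAL_47 = (10.0, 100.0)
DRUPAL_5 = (10.0 * 0.97, 100.0 * 1.73)

if __name__ == "__main__":
    for percent in range(0, 101, 10):
        gain = relative_improvement(percent / 100.0, DRUPAL_47, DRUPAL_5)
        print(f"miss rate {percent:3d}%: {gain * 100:+.1f}%")
```

In this model the break-even miss rate depends on how much faster a cached page is than a generated one, so plugging in rates measured on your own hardware will shift the crossover point.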

To me these graphs suggest that for most Drupal websites, upgrading to Drupal 5 will yield at least a small performance improvement -- especially if you properly configure your page cache's expiration time. Furthermore, they suggest that for Drupal 6, we need to look at improving the page generation time of non-cached pages. Let's make that an action item.

Comments

Robert Douglass (not verified):

I love benchmarks! Dries, as much extra information as you can afford about running your tests would be appreciated. For example, did you use ab as an authenticated user (-C PHPSESS=12345...), or just with the cache turned off?

I've taken the liberty of reporting this article on Digg.

Great work. =)

February 7, 2007 - 17:15
2bits -- Khalid (not verified):

Here are three file based caching contrib modules:

The first two work out of the box. The last one is more of an API, with one module using it (for panels module).

February 7, 2007 - 20:25
ted (not verified):

Dries, excellent work!

However, these results are slightly misleading. While they show that Drupal *serves* pages slightly slower, they don't tell anything about how fast a page will render in a user's browser.

The number of files, browser and proxy caching, and related factors all affect the end user and the speed at which they browse a Drupal site.

So while it might be 3% slower to render a page in 5, I would argue that the pages are actually served faster (e.g., downloaded faster) because of such things as the CSS preprocessor and mod_expires caching, just to name a few.

February 7, 2007 - 20:43
Dries:

You're right, Ted. These benchmarks do not provide the full picture and discard some important improvements we added to Drupal 5. Over the next couple of weeks (time permitting), I'll try to come up with a more extensive/accurate benchmark suite -- probably based on Apache's JMeter. Should be fun! :)

February 7, 2007 - 22:27
barone (not verified):

Great Job! If you are going to write this full picture, provide us some data on the MySQL runtime information, if possible.

March 11, 2007 - 16:39
Tom (not verified):

I have been putting off upgrading, but that allays one of my worries (that D5 would be slower).

The thing that I've been meaning to benchmark is
- CCK vs native content type
- Views on/off
- statistics on/off

Stuff like that. If you have time to do any benchmarking of that type, I'd love to see the results! I love the Views/CCK/Contemplate combo - very powerful. But to paraphrase Stan Lee from the Spider-Man comic, "With great power there must also come great server load". I'm always curious just how great the added load is.

February 8, 2007 - 02:58
Luiso (not verified):

I think that these differences are brutal. I don't know if the statistics are good, but otherwise I have to say that the people of the project Drupal are very good. I'm very happy to use their system.

February 8, 2007 - 08:19
Antonio Ortiz (not verified):

I was not sure about Drupal 5, but now I am.

February 8, 2007 - 09:15
bhaskar (not verified):

Did you also try and enable the "aggregate and compress CSS" option in Drupal 5? That should give you twofold benefits, (a) it conserves bandwidth and (b) it will allow Drupal to serve more requests per second.

I would be greatly interested in seeing the bandwidth statistics for Drupal 4.7 and 5.0 as well.

Thanks for the good work.

February 8, 2007 - 15:47
Chris Johnson (not verified):

I'd prefer to see benchmarks which are closer to the environment most people see on hosted servers, since that's where most Drupal site installations will probably end up.

It might take some research to get an accurate idea of what the "average" environment might look like. But my gut instinct from experience says most hosts are not yet running MySQL 5 and PHP 5, and are also not running APC.

Thus, I'm curious how this same benchmark would look on MySQL 4.1 and PHP 4.3.10. On the surface, it seems like the results would be similar. But maybe not.

I'm really glad to see the data for cache hit performance. Some sites I run have the majority of their traffic generated by anonymous users. Other sites I'm involved with have 100% authenticated users; anonymous users only get to see the login page. Looks like we may have to optimize our service stacks when we try to move those people to D5.

February 9, 2007 - 17:45
Caleb (not verified):

How can this be true:

"...we had a similar problem a while back before we moved servers. I guess it is a cache clearing problem tying up the database, and causing a lot of locks making every wait. We run with the cache off for this reason, as well as a server specific reason. We did not need the cache so far anyways.

Try turning off the cache and see what happens. Yes, it is counterintuitive, but it is true: the cache can be a scalability issue for Drupal. In your case it may hurt something else, YMMV.

Another thing is to install the lock elimination patch, and switch to InnoDB on some tables (cache, session, ...etc.)"

Reading that the cache itself can be a scalability issue really makes me scratch my head. Is there anything to this person's comments?

February 10, 2007 - 05:30
Histrionic (not verified):

It might be interesting to run these tests on an automated/regular basis as part of development, including some of the modifications/additions suggested in the comments. While it's great that so much code is contributed by volunteers, it seems like this kind of data can help in profiling development efforts and produce a continuous focus on performance.

This reminds me of the story that was told about Safari development on Mac OS X. Developers are/were supposedly not allowed to check in changes that negatively impact performance unless they can check in another change that offsets that drain.

February 10, 2007 - 18:47
Chris Johnson (not verified):

Running the tests regularly is a good idea. That's something MySQL did from early on -- provide scripts to run benchmarks and regression tests, and actually run them regularly.

February 12, 2007 - 13:00
bertboerland (not verified):

FYI, this article has been referenced in CMS Wire. Not very good (there is no Drupal v4) and without adding new content, but still a reference.

February 16, 2007 - 19:43
John Adams (not verified):

What kind of hardware did you use for the clients?

Thanks
John

March 2, 2007 - 16:45
Dries:

The clients ran on the same machine as the server. A description of the server is available in the original post.

March 2, 2007 - 19:49
John Adams (not verified):

Interesting... so even though this benchmark is probably good for comparing performance of 4.7 vs. 5 (your original intent), it is not a good indicator of the performance capability of the server or the software themselves. This is because the load on the server increases exponentially as the client requests go up.

It would be good to know raw benchmark numbers for the performance of Drupal on common hardware. I will try to get something set up and run some benchmarks where a server is getting hit from multiple different client machines.

John

March 2, 2007 - 20:04
blamcast (not verified):

Very useful information. I recently did some of my Drupal benchmarking after changing hosts, but nothing as in depth as this.

March 21, 2007 - 13:48
Reed H (not verified):

Interesting.

Can you put actual values on the graph though? It's hard to see what they should be. Or post your raw data somewhere?

Thanks

Reed

April 2, 2007 - 16:51
Julien (not verified):

Empty Drupal 5 without APC cache on a Xeon 2 GHz server: 250 ms. With APC cache on the same basis: 50 ms.

I will try adding random content and post back my benchmarks.

April 17, 2007 - 23:11


© 1999-2014 Dries Buytaert Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.
Drupal is a Registered Trademark of Dries Buytaert.