Drupal in the cloud

It is not always easy to scale Drupal -- not because Drupal sucks, but simply because scaling the LAMP stack (including Drupal) takes no small amount of skill. You need to buy the right hardware, install load balancers, set up MySQL servers in master-slave mode, set up static file servers, set up web servers, get PHP working with an opcode cache, tie in a distributed memory object caching system like memcached, integrate with a content delivery network, watch security advisories for every component in your system, and configure and tune the hell out of everything.
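
To give a flavor of just one item on that list, here is a minimal sketch of the read/write splitting that a master-slave MySQL setup pushes onto the application. It is an illustration only: the class name `QueryRouter`, the server names, and the rule used to tell reads from writes are assumptions made for this example, not Drupal's actual database layer.

```python
import itertools


class QueryRouter:
    """Route writes to the master and spread reads across the slaves."""

    # Statements that only read data; everything else is treated as a write.
    READ_PREFIXES = ("SELECT", "SHOW", "DESCRIBE", "EXPLAIN")

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master when no slaves are configured.
        self._slaves = itertools.cycle(slaves or [master])

    def pick_server(self, sql):
        """Return the name of the server that should run this statement."""
        if sql.lstrip().upper().startswith(self.READ_PREFIXES):
            return next(self._slaves)  # round-robin over the slaves
        return self.master


# Hypothetical server names, for illustration only.
router = QueryRouter("db-master", ["db-slave-1", "db-slave-2"])
```

A real setup also has to cope with replication lag (a read that immediately follows a write may need to go to the master), which is part of why this takes skill.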

You can either do all of the above yourself or outsource it to a company that knows how to do it for you. Neither is trivial, and I can count the number of truly qualified companies on one hand. Tag1 Consulting is one of the few Drupal companies that excel at this, in case you're wondering.

My experience is that MySQL takes the most skill and effort to scale. While proxy-based solutions like MySQL Proxy look promising, I don't see strong signals about it becoming fundamentally easier for mere mortals to scale MySQL.

It is not unlikely that in the future, scaling a Drupal site will be done using a radically different model. Amazon EC2, Google App Engine and even Sun Caroline are examples of the hosting revolution that is ahead of us. What is interesting is how these systems already seem to be evolving: Amazon EC2 allows you to launch any number of servers, but you are pretty much on your own to take advantage of them. You still have to pick the operating system, and install and configure MySQL, Apache, PHP and Drupal. Not to mention the fact that you don't have access to a good persistent storage mechanism. No, Amazon S3 doesn't qualify, and yes, they are working to fix this by adding Elastic IP addresses and Availability Zones. Either way, Amazon doesn't make it easier to scale Drupal. Frankly, all it does is make capacity planning a bit easier ...

Then along come Amazon SimpleDB, Google App Engine and Sun Caroline. Just like Amazon EC2/S3, they provide instant scalability, only they have moved things up the stack a level. They provide a managed application environment on top of a managed hosting environment. Google App Engine provides APIs that allow you to do user management, e-mail communication, persistent storage, etc. You no longer have to worry about server management or all of the scale-out configuration. Sun Caroline seems to be positioned somewhere in the middle -- they provide APIs to provision lower-level concepts such as processes, disk, network, etc.

Unfortunately for Drupal, Google App Engine is Python-only, but more importantly, a lot of the concepts and APIs don't map onto Drupal. Also, the more I dabble with tools like Hadoop (MapReduce) and CouchDB, the more excited I get, but the more it feels like everything that we do to scale the LAMP stack is suddenly wrong. I'm trying hard to think beyond the relational database model, but I can't figure out how to map Drupal onto this completely different paradigm.

So while the center of gravity may be shifting, I've decided to keep an eye on Amazon's EC2/S3 and Sun's Caroline as they are "relational database friendly". Tools like Elastra are showing a lot of promise. Elastra claims to be the world's first infinitely scalable solution for running standard relational databases in an on-demand computing cloud. If they deliver what they promise, we can instantly scale Drupal without having to embrace a different computing model and without having to do all of the heavy lifting. Especially exciting is the fact that Elastra teamed up with EnterpriseDB to make their version of PostgreSQL virtually expand across multiple Amazon EC2 nodes. I've already reached out to Elastra, EnterpriseDB and Sun to keep tabs on what is happening.

Hopefully, companies like Elastra, EnterpriseDB, Amazon and Sun will move fast because I can't wait to see relational databases live in the cloud ...

Comments

Anonymous (not verified):

Don't forget Mosso! This beasty seems like it supports the applications of the LAMP stack out of the box, and scales pretty much infinitely, all starting at $100 a month.

April 16, 2008
Dries:

Do you know how many Drupal sites are hosted at Mosso, and how big they are?

April 16, 2008
Brent Hardinge (not verified):

I've looked several times at Mosso but the lack of version control support turned me off (SVN primarily).

April 16, 2008
David Strauss (not verified):

Mosso's architecture has pitiful latency. A stock Drupal installation with nothing on it takes 1.5-2 seconds to load the front page. Mosso considers this acceptable. I considered it a reason to cancel my account.

April 16, 2008
Ryan (not verified):

Hmmm,

The latency is slightly higher than a dedicated server, like .5-1 second, BUT it's ALWAYS the same under any load... and you can bypass this with boost caching for anon users.
Compare this to a dedicated host where a spike in activity could cause those dreaded 4-10 second page loads...

On Mosso you can get Slashdotted and Dugg at the same time and not even notice. This is pretty cool for 99 a month... kind of a no brainer. And you CAN deploy to SFTP via bazaar for source control until SVN is added to mosso.

For the record I have had 99% uptime with Mosso for 6 months now. To me they are out of the growing pains from last year.

The next big plus is that because it is a cloud, if something breaks it breaks for everyone, and it has to get fixed quickly... in other words, I never have to send angry emails to tech support to get things fixed... It just works all the time...

Lately they have implemented a new billing structure based on requests... this has generated a lot of heated discussion but to me it is very generous and is meant to future proof for larger enterprise sites to use the cloud fairly. Basically you still get more than your 99 bucks worth and the big sites may have to pay extra for insane traffic... also, there are ways around this via CDN and css/js aggregation to reduce your requests dramatically.
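
As a rough back-of-the-envelope sketch of why aggregation and a CDN matter under request-based billing: the file counts below are made-up numbers, the function name `requests_per_page` is hypothetical, and it assumes that only requests reaching the host count toward the bill (requests served by a CDN do not), which may not match the actual pricing rules.

```python
def requests_per_page(css_files, js_files, images, aggregated=False, on_cdn=False):
    """Count the HTTP requests one page view sends to the host."""
    page = 1  # the HTML document itself
    if aggregated:
        # Aggregation merges all CSS into one file and all JS into another.
        css_files = min(css_files, 1)
        js_files = min(js_files, 1)
    static = css_files + js_files + images
    if on_cdn:
        static = 0  # assumption: static files are served by the CDN instead
    return page + static


# Made-up example page: 12 CSS files, 8 JS files, 10 images.
before = requests_per_page(css_files=12, js_files=8, images=10)
after = requests_per_page(css_files=12, js_files=8, images=10, aggregated=True)
with_cdn = requests_per_page(
    css_files=12, js_files=8, images=10, aggregated=True, on_cdn=True
)
```

With these made-up numbers a page view drops from 31 requests to 13 with aggregation alone, and to 1 once the static files also move to a CDN.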

Overall I love Mosso and put all my clients and friends on it because it just works, I get paid money over my 99 per month automatically, the spam control on email is great, webmail is great and performance is amazing for social bookmarking...

The best part is really never having to deal with tech issues or support ever... because even if your dedicated host is managed you usually have to bitch and moan to get anything fixed... and that's if you even notice it's broken to start with...

April 18, 2008
Myke (not verified):

My only complaint with Mosso (now Rackspace Cloud) is their lack of a true email solution: if you have a large mailing list to send out, your email gets delayed if you send out too many emails. I realize this is to eliminate spam, but I'm not sending out spam; everything is 100% opt-in.

-Myke

January 24, 2011
John Willis (not verified):

Great article. I love to see more discussions around Drupal and cloud technology. Those are two of my favorite subjects these days. Here is a link to some other ideas I have about Drupal and cloud technology: Drupal, Dries, and Clouds.

April 16, 2008
ceejayoz (not verified):

Unfortunately for Drupal, Google App Engine is Python-only, but more importantly, a lot of the concepts and APIs don't map onto Drupal.

I suspect these are things Google will be trying to address. Python's just the first language supported, I've little doubt that PHP will come fairly soon as well.

April 16, 2008
John Willis (not verified):

I run a few low-profile Drupal sites on Mosso. Also, Mosso has told me they have a lot of customers running Drupal. However, the primary issue is running highly customized Drupal sites (e.g., video). I personally love Mosso for simple CMS delivery models. For example, my blog site (sorry, WordPress) runs on Mosso. However, when you get into the type of subjects that you have discussed in this article, Mosso might have some issues.

John

April 16, 2008
Anonymous (not verified):

What sort of issues does it run into? I'd sort of sold myself on Mosso as the next platform I was going to choose for any of my larger scale Drupal projects, but if there are issues with things, I guess I'll have to investigate further.

April 16, 2008
moshe weitzman (not verified):

Yes, I have been getting excited about cloud computing as well. Amazon made a huge announcement this week about persistent storage in EC2. It is like having your own SAN. Also see the RightScale blog for a non-technical explanation.

April 16, 2008
John Willis (not verified):

Actually, I was going to comment on that and how it might affect companies like Elastra. Their big stick is they do DB on a non-persistent EC2. What will they add when EC2 has persistent storage? My guess is a lot, but time will tell.

John

April 16, 2008
John Willis (not verified):

I am planning on writing a post on it soon. However, I recorded a session I did at the DrupalNYC Barcamp last month where there was a lot of discussion around Drupal and Mosso. Here it is...

Drupal Barcamp NYC - Podcast #1 Cloud Talk Presentation

April 16, 2008
Chris Cheetham (not verified):

Project Caroline's website uses Drupal as its web-facing framework. Notable because the site is itself hosted on a Project Caroline cloud. Details of the site's technology stack are described here.

April 17, 2008
Andy Smith (not verified):

drupy will yet live!

April 17, 2008
Anonymous (not verified):

I've been burned by Mosso more times than I can mention on our Drupal sites. We started with them late in '06, and they were a sharp outfit running virtual hosts on RackSpace's backbone.

However, since then we have had no shortage of latency problems with their MySQL and PHP services, which appear to be poorly scaled. Pages on our Drupal sites on Mosso have load times of 20 seconds or more, and we've had issues with MySQL latencies leading to screwy database errors in Drupal (a real way to win favor with clients!).

At one point, we had 70 or more domains and subdomains hosted with Mosso. In the process of deleting one site, their control panel managed to overwrite most of the other domain records with incorrect IP addresses. It took us several days to sort through that mess, not including the time it took us to regain our credibility with our clients.

Lately, their latency on the PHP side seems to have gotten worse, and the incidents on http://status.mosso.com seem to become more frequent.

They almost make up for it with great customer service and technical support, but I would never put Mosso in the same sentence as "Drupal scalability".

April 18, 2008
Anthony (not verified):

While I haven't quite been burned by Mosso, it's come pretty close. MySQL latency does seem to be a problem and I agree that their control panel is buggy.

However, I'd have to agree with Ryan. When Mosso goes down, it's very obvious and they go all-out to fix the problem. Their tech support is excellent and has been what kept me persevering with them during the bad times.

I think that it's all about a trade-off. You might get better performance elsewhere with a dedicated server, but you'll need your own sysadmins to watch over it. If you're a small company or don't have lots of spare in-house technical resources, Mosso is probably a good choice.

May 8, 2008
xtfer (not verified):

I have tested both Mosso and Media Temple's GS for Drupal hosting, and found neither to be particularly effective. Mosso was patchy, and MT's grid system would have inexplicable timeouts on a regular basis. This may be acceptable for your personal blog, but doesn't cut it for an enterprise CMS - even a small one.

My experience has been that the MySQL server is the most important component and the key bottleneck. Fix that one - the hardest bit for cloud hosting - and I suspect that the rest won't be that noticeable.

August 12, 2008
Anonymous (not verified):

I got severely burned by MediaTemple (outages, slow) and won't try Mosso for that reason. Google Apps' unfortunate silence about PHP means it is a non-starter.

I have been using S3 with amazing success (see this article on how to elegantly automate your server backups to S3!) and I plan to move to EC2 at some stage.

Currently I use a dedicated server with several huge Drupals on it and I see that MySQL is taxing the system most (despite adding memory, caching, optimization, etc).

It is therefore attractive to consider using Amazon SimpleDB for storage. I am wondering, and this is my question for this thread, whether there is a way of adding a layer of abstraction between Amazon SimpleDB and Drupal's own database abstraction layer.

I know, at first sight it looks like a slow solution, but if it's not difficult to do, it should be tested. Considering the load MySQL places on our servers, slowing them down, I'd expect that using Amazon SimpleDB might (with appropriate caching of common queries) actually perform the same or better. What do you think? Has anybody tried (something like) this?
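
The "caching of common queries" part could look roughly like the cache-aside sketch below. This is only an illustration of the idea: `CachedStore`, `backend_get`, and the in-memory dict standing in for the remote store are all hypothetical, and no real SimpleDB or Drupal API is used.

```python
import time


class CachedStore:
    """Cache-aside wrapper around a slow key-value backend."""

    def __init__(self, backend_get, ttl_seconds=60.0, clock=time.monotonic):
        self._backend_get = backend_get  # hypothetical slow remote lookup
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # key -> (expires_at, value)
        self.backend_calls = 0

    def get(self, key):
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh cache hit: no remote round trip
        # Miss or expired entry: ask the backend and remember the answer.
        self.backend_calls += 1
        value = self._backend_get(key)
        self._cache[key] = (now + self._ttl, value)
        return value


# Stand-in for the remote store: a dict lookup instead of a network call.
fake_remote = {"node:1": "Hello world"}
store = CachedStore(fake_remote.get, ttl_seconds=60.0)
first = store.get("node:1")
second = store.get("node:1")  # served from the cache
```

Whether this would beat a local MySQL depends on the hit rate: every miss still pays the full round trip to the remote store, and writes need some way to invalidate stale entries.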

August 21, 2008
Sandro Franchi (not verified):

We've moved all our customers to Mosso a couple of months ago and we're really satisfied with the service, both in terms of tech support and performance as well.

We have Drupal 5.x, 6.x, WP 2.5, 2.6.x, etc. there and everything just works all the time, no complaints at all.

There were a couple of minor issues but the on-line tech support was there in seconds (literally) and with a lot of knowledge and a great attitude they solved these issues almost immediately.

IMO Mosso rates -at least- 4.9 out of 5.

August 29, 2008
Mike (not verified):

Thanks, Dries, for taking up this issue. I am catching up to you in my exploration of Drupal--I am now where you were with this in April 2008!

I am running a smallish Drupal (500 visits a day) on a 512Mb Gogrid server. Nobody here's mentioned Gogrid, but it has advantages over Mosso (key one: ssh access to the server).

I am having difficulty running a full Drupal with modules on a 512Mb server, but I figure this dedicated server will cost $65 a month. This is cheaper than Mosso, though it's not instantly scalable to handle increased traffic.

I would like to check out the Amazon.com cloud system, but it's too daunting for a n00b like me. I have a real job!

Looks like everybody's dreaming the same dream now: making a Drupal site that handles rare large spikes in traffic while most of the time costing less than $100 per month. For now, I'll keep dreaming, and maybe give Mosso another try.

September 25, 2008
Vacilando (not verified):

Don't overlook Scalr - an open source web tool that can easily eclipse RightScale very soon. It allows you to create automatically scaling server farms of any size on EC2 either for free ( http://code.google.com/p/scalr/ ) or for a mere 50 USD/month ( http://www.scalr.net/ ). For comparison, RightScale asks 500 USD/month (and 2500 USD setup!) for a similar service. This is to say, the future might be closer than we thought!

October 13, 2008
Shawn (not verified):

I am reviewing several CMS' for a large site revision and one factor in settling on a decision is where the CMS will be in the future. This kind of clear thinking about Drupal's future from the creator and guide of the project pushes Drupal to the foreground.

So, you thinking Drupal 8 will be in a fully scalable cloud architecture? Awesome! Keep up the good work.

October 15, 2008
Mark Hadfield (not verified):

We have helped scale Drupal in the Cloud to the point where it can serve over a billion pages per month. Here is a link to the system we used:

http://nscaled.com/solutions.php

March 7, 2009
Anonymous (not verified):

I also got burned by Mosso when running my drupal web site. It wasn't economical either for my site. The other providers didn't provide an easy enough to deploy system. I am not that technically skilled.

I found a new cloud provider called Uptimehost (www.uptimehost.com) I am going to sign up for their free trial and see if it solves my issue!

Anyone else use them?

March 12, 2009
Anonymous (not verified):

Yes, Uptimehost is run by a cheater. Hence, I don't think it is a good idea to sign up. Just look up on the internet for pages about Kaumil Patel, and you will find nasty pages written by his victims and consumer organizations. You also might want to look for all the nasty reviews of the companies he is /was involved in. Examples include: hostMDS, uptimehost, Vistapages, Hostingplex, amongst others.

I would advise staying away if you don't like a painful experience.

March 20, 2009
Andrea (not verified):

Kaumil is no longer involved, but as of November 2009 they don't have their act together. Support is patchy at best.

Alas, I had high hopes for them.

November 25, 2009
Bola Owoade (not verified):

By the way has anybody tried the Aptana cloud offering with drupal? They do support PHP hosting. What do you guys think?

May 15, 2009
raquel.turrubiates (not verified):

I'm running several big Drupal deploys (over 250,000 unique sessions per day) and a host of little Drupals. Our dedicated infrastructure @ RackSpace consists of 5 web servers, load balanced with varnish, 3 MySQL servers (1 db for the little Drupals, 1 master/1 slave for the big ones) and 2 file servers. Lately we've started to move the little Drupals to Mosso and the system sort of works; however, we have in fact had some downtime due to MySQL cluster failures, and on one occasion the whole PHP cluster went down. Our little Drupals have around 15,000 sessions/day but I wouldn't dare to move our 250,000 sessions/day there.

July 9, 2009
InfoReto (not verified):

Mosso is now part of Rackspace in http://www.rackspacecloud.com/

Has anyone reviewed their current offers? Besides the hosting package there's a server instance and a CDN storage.

August 1, 2009
Anonymous (not verified):

mosso = no ssh access... deal breaker for me when using drupal. the shell speeds up my development time so much that without it, i am lost.

September 2, 2009
Brade (not verified):

I'm sure you caught wind of this. Maybe the problem can now be solved: http://aws.amazon.com/rds/

October 27, 2009
Annie Joseph (not verified):

It's fairly short notice, but this month's Silicon Valley Drupal Group meetup is on 'Drupal IN Cloud'.

We generally have very interactive sessions so anybody willing to share experiences with this DRUPAL community is most welcome.

Details of the event: http://www.meetup.com/DrupalGroup

February 20, 2010
Ivan (not verified):

Hi there,

A reliable load-balancing solution for MySQL databases has yet to be released. There's one in theory, but nobody says it runs without having a lot of issues. So no load-balancing for the database.

However, there is a solution for load-balancing of IP-based virtual hosts which brings High Availability to Drupal (and other open source frameworks, of course) which is stable and works for Drupal. Check http://www.cloud.bg/en

This one allows Drupal account holders to run their projects on top of a HA cluster and to share resources of the whole infrastructure. It is quite interesting and it is stable at the same time. I have tested Drupal on top of it and got very good results. So you can consider Cloud.bg as a Drupal friendly cloud.

June 18, 2010

© 1999-2014 Dries Buytaert Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.
Drupal is a Registered Trademark of Dries Buytaert.