Drupal
Examiner.com acquires NowPublic
As reported in the New York Times, NowPublic, a citizen journalism website built on Drupal, was acquired by Examiner.com. Soon, one of the top-100 websites in the world will be running Drupal! Congratulations to the NowPublic team!
(Disclosure: I am an advisor to NowPublic.)
Drupal Gardens
Acquia had two big product announcements at DrupalCon Paris. The first was the general availability of Acquia Hosting, which I'll blog about tomorrow. The second is a status update on "Acquia Gardens" which we first announced in the beginning of 2009.
For those who have not heard about Acquia Gardens, this product will provide an easy on-ramp for people to experience the awesome power of Drupal without having to worry about installation, hosting and upgrading. Think of it as WordPress.com or Ning for Drupal. Think of it as 'Drupal as a service'.
We announced that the final name for the product is Drupal Gardens. This service is Drupal, so including Drupal in the name emphasizes that point. Plus, this is all about promoting Drupal so we don't want to hide that. Our goal is to make the base service free of charge, and to introduce Drupal to hundreds of thousands of users. Many individuals and organizations want a killer web site, but have no idea that Drupal is a great way to build one or to connect with other websites. Even if they did hear about Drupal, few non-technical people succeed in installing and hosting a Drupal site. I believe Drupal Gardens could play a key role in promoting the viral adoption of Drupal, and the name Drupal Gardens is key to that.
For the same reason, I'd really like Drupal Gardens to stay close to what Drupal does, to work with module maintainers, and to give back where we can. For example, it would be awesome if Gardens users could contribute to Gardens simply by contributing to Drupal -- either by contributing to existing modules that we use to build Gardens, or to new modules that Acquia might contribute. Along the same lines, we want people to be able to export their Gardens site -- the code, the theme and the data -- and move off the platform to any Drupal hosting environment. By doing so, we provide people an easy on-ramp but we allow them to grow beyond the capabilities of Gardens without locking them in. These are the kind of win-win situations that I hope we can create.
We also showed a demo of the current state of Drupal Gardens. The product is in pre-alpha, but we wanted to give you an update and show what we've been working on. The main feature that I demonstrated in my Acquia presentation is a tool we developed called the "theme builder". The theme builder makes it really easy to build a beautiful design for your Drupal site from within your browser without having to write any HTML, CSS or JavaScript. The theme builder is an enabling technology, and certainly part of my vision of what content management systems should enable users to do: empowering them to quickly and easily assemble powerful websites without having to do any programming.
The theme builder comes with pre-defined themes to start from, color palettes and a custom color selector.
The current plan is to be in the market at the beginning of 2010. Gardens is built on, and depends on, the release of Drupal 7. While we don't yet have the exact timing for this (Drupal 7 is ready when it is ready), we do plan to start inviting people to alpha test in the next couple of months. If you are interested in taking part in the alpha program, or if you'd like to get notified about the progress of the product, sign up at drupalgardens.com.
Drupal 7 code freeze almost upon us
After some 82 weeks of development beginning in February 2008, no one should be caught by surprise that we are near a code freeze for a release of Drupal 7, the next and best release of Drupal yet. In fact, the Drupal 7 code freeze was originally announced to be on September 1st.
However, as we all know, some of the best patches always happen at the last minute, and there are always last-minute patches that must happen. As part of my State of Drupal presentation at DrupalCon Paris today, I talked about the Drupal 7 code freeze. Since not all of us can be in Paris (and for those that aren't there, we miss you!), I wanted to share the relevant slides (PDF, 85 KB) in this blog post.
Important to know is that I extended the code freeze to Monday, September 7th. At this point all patches are still "acceptable". After the end of DrupalCon Paris, Angela "webchick" Byron and I will review everything that is marked "ready to be committed" and commit all those patches that are really ready.
After this one week extension, we enter a phase we're calling "code slush", to be time-boxed at a strict five weeks. In this period, most patches are allowed, except those that implement new features or functionality -- with some very important exceptions. Up to ten carefully selected exceptions for new functionality (see slides for details), patches that implement important and necessary API changes for existing functionality, and patches that improve usability, accessibility, documentation, and performance will still be accepted. As such, it is important that you start upgrading your contributed modules as soon as possible -- you never know what API issues you might run into and we only have a limited window to address those. After October 15, we will become a lot more strict and focus on the final polish.
As always, Drupal 7 will be ready when it's ready, but not before. We expect -- and hope -- that our huge investment in SimpleTest tests and our automated testing platform, though, will make this the easiest transition from development to release yet. We'll see!
Drupal trademark policy officially available
Just a short time ago, I announced the refresh of Drupal.com. As I announced in my post, Drupal.com has a couple of purposes: one of its key purposes is serving as the current home of the official Drupal trademark policy. As of today, version 1.0 is available and published at http://drupal.com/trademark.
I invite you to read the Drupal Trademark Policy in detail. It's full of illustrative examples, and I hope that we've made it as community-friendly as possible. We can't cover every possible scenario, but I believe it addresses most situations that are likely to occur within our community. It may -- and certainly will -- change over time as we keep in sync with the changing needs of our community and, if necessary, to account for unforeseen situations. That is important to keep in mind.
I've owned the Drupal trademark for a long time. The lack of a Drupal trademark policy doesn't mean the trademark was unprotected -- it was protected by trademark law. The lack of a Drupal trademark policy meant that it was unclear what was allowed and what wasn't allowed, and frankly, that you were bound by trademark law. By creating a trademark policy and a licensing procedure, we've given ourselves options we did not have before.
The goal of our new policy is to provide guidance and clarity on how the Drupal trademark is allowed to be used. The only community model that really works is one where there is a fair-level playing ground for all people and organizations. Ultimately, that is what this policy seeks to accomplish.
The entire process of developing the policy was a community effort, with help from a variety of legal experts. We worked on the policy over the course of almost two years. A draft version of the policy was posted at http://groups.drupal.org/node/19068, and through the community feedback that developed there, we ironed out many of the wrinkles of my original draft. Larry Garfield, the Drupal Association's current legal representative, has provided feedback, and both my own attorney (DLA Piper) and additional attorneys from the Software Freedom Law Center and the Drupal Association were part of the policy's development. To help validate our work, we reviewed similar policies from sister projects to make sure that we were in line with the current legal trends in open-source development.
As the owner of the trademark, protection of the trademark falls to me, and is managed by me with the assistance of my attorney, the Drupal Association, and potentially even local Drupal Associations. I personally bear substantial costs as part of sustaining the trademark in all its various geographic jurisdictions. To help offset the costs of managing the trademark and the trademark licenses, and of actively pursuing those who infringe or inappropriately seek to use our brand, I will sell some advertising space on drupal.com and may also charge a small licensing fee to those that do not qualify for an automatic trademark license (section 1A) and that need to follow the license grant procedure (section 1B). Now that the policy is published, I plan to work out the financial details in the coming months, so stay tuned for an update on that.
Most of you who use Drupal, commercially or otherwise, need not worry about how the new policy may impact you, though I certainly encourage you to study it and to apply for a license if required. For instance, in many cases, you are allowed to use the name 'Drupal' in domain names. Conversely, there are some Drupal domain names in particular that the policy seeks to protect for the good of the community and to create a fair-level playing ground. The introduction of the official policy is only intended to help ensure that the effort of hard-working Drupal contributors is not misappropriated. I think it will make us even stronger, as a community!
Mollom status update and planned improvements
We recently wrote about the fact that the number of messages we've filtered has doubled in three months. All things considered, we're handling well over 200 million HTTP requests each month, making Mollom the largest web service I've ever helped build. Further, since each of these requests is dynamic, they're fairly expensive because we can't apply even simple caching techniques. Each request to Mollom retrieves data, invokes a parser, uses statistical classifiers, and updates reputation models, among other things.
While the response time of the service has always remained good, we've had some recent scalability issues that have affected our ability to react to the constantly changing behavior of spammers. To react well, we must constantly analyze our data and continually retrain our classifiers. We do this asynchronously, using background processes that are not part of the HTTP requests. When we started Mollom, it took ten minutes to analyze our dataset and to train a new classifier. With our current volume of data and frequency of requests, that same operation now takes at least 14 hours. Needless to say, that has affected our ability to effectively deal with spammers, and as a result, the quality of our classifiers has regressed. While that regression is only a fraction of a percent, it is more than we would like, and if you get hammered badly like many of our users, it is noticeable. Not good.
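To make the "asynchronous retraining" idea concrete, here is a minimal sketch, not Mollom's actual code. It assumes a toy word-count classifier and uses a background thread for brevity where the text describes background processes; `train`, `ClassifierHolder` and the sample data are all hypothetical names invented for illustration. The point it shows is that request handling keeps using the current model while the slow training step runs elsewhere, and the new model is swapped in only once it is ready.

```python
import threading
from collections import Counter


def train(samples):
    """Toy trainer: count how often each word appears in spam vs. ham."""
    spam, ham = Counter(), Counter()
    for text, is_spam in samples:
        (spam if is_spam else ham).update(text.lower().split())
    return spam, ham


class ClassifierHolder:
    """Holds the model used by request handlers and swaps it atomically."""

    def __init__(self, samples):
        self._lock = threading.Lock()
        self._model = train(samples)

    def is_spam(self, text):
        # Request path: grab a reference to the current model, then score.
        with self._lock:
            spam, ham = self._model
        words = text.lower().split()
        return sum(spam[w] for w in words) > sum(ham[w] for w in words)

    def retrain_in_background(self, samples):
        def job():
            new_model = train(samples)  # slow part runs outside the lock
            with self._lock:
                self._model = new_model  # quick swap once training is done

        worker = threading.Thread(target=job)
        worker.start()
        return worker


# Usage: requests are still answered by the old model until the swap happens.
holder = ClassifierHolder([("hello friend", False)])
before = holder.is_spam("cheap pills")
worker = holder.retrain_in_background(
    [("hello friend", False), ("cheap pills", True)]
)
worker.join()
after = holder.is_spam("cheap pills")
```

The longer the training step takes, the longer requests are served by a stale model, which is why the jump from ten minutes to 14 hours matters.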
To deal with the pains of, frankly, our unexpected success and growth, we did (or are in the process of doing) the following three things.
First, with the help of hosting company OpenMinds (these guys rock!), we upgraded one of our existing servers in Europe (for vertical scaling), and launched our first server in the United States (for horizontal scaling). Because of our large volume of data, and since our analysis is very data intensive, much of the work we do is I/O-bound. So, we've added more RAM to our servers, configured the disks in RAID-1 to mirror their contents for redundancy and better read performance, and purchased 64GB solid state disk drives (SSD) that are providing random access times at least 150 times faster than our regular hard disks. With the extra RAM, the RAID-1 configuration, and the solid state disks, it is now much faster to train a new classifier; a significant improvement making us much more agile in fighting spammers. The hardware upgrades are almost complete. Solid state disks, by the way, are seriously hot stuff.
Second, when you're processing more than 200 million HTTP requests a month, it becomes really hard to figure out what is going on, and doubly hard to determine where and why classification mistakes are being made. Simply put, Ben and I started to feel like the characters in the story of the blind men and the elephant as we tried to figure out why some spam was slipping through. To cope, we've made important architectural changes to our backend software, allowing it to learn faster and increasing our ability to debug it on the fly. We've worked on these changes for more than two months, and last weekend, we made an important breakthrough that allowed us to visualize all our data in a completely new way. We're now able to generate heat maps of our algorithms to identify the weaker areas or the areas that are currently under attack. Already, we've identified a number of areas where we will improve our algorithms to be more effective. In other words, expect Mollom's accuracy to improve over the next couple of weeks as we translate our new insights into algorithmic improvements.
Third, with the help of Damien Tournoud, we fixed an important bug in the Drupal Mollom module, while also improving its logging abilities. The bugfix should prevent incorrect CAPTCHA results from being accepted when (or if) a Mollom server is unavailable, and the improved logging makes it easier to understand specific attempts to circumvent Mollom CAPTCHAs on your site. With the new output, for example, we've already seen that some spammers have adjusted their scripts to specifically target Mollom-protected sites, and we've also learned that some caching modules cause conflicts with Mollom in some configurations. In addition, Dave Reid, our new co-maintainer for the Drupal Mollom module, has committed many smaller but no less important improvements, bugfixes and clean-ups to the Mollom module. Last night, we packaged all these changes into a new release of the Mollom module for Drupal 6. Upgrading is certainly recommended.
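The CAPTCHA bug described above is an instance of a general "fail closed" rule: if the verifying server cannot be reached, a submitted answer should not be treated as correct. The sketch below is an illustration of that rule only. It is not the Mollom module's code or API; `ServerUnavailable`, `captcha_passed` and the `verify_remote` callable are hypothetical names chosen for the example.

```python
class ServerUnavailable(Exception):
    """Hypothetical error raised when the verification server cannot be reached."""


def captcha_passed(verify_remote, session_id, answer):
    """Return True only if the server explicitly confirmed the answer.

    `verify_remote` is any callable taking (session_id, answer) that returns
    True or False, or raises ServerUnavailable.
    """
    try:
        return verify_remote(session_id, answer) is True
    except ServerUnavailable:
        # Fail closed: no confirmation from the server means no pass.
        return False


# Usage with stand-in verifiers (assumptions for the example).
def server_down(session_id, answer):
    raise ServerUnavailable("no server reachable")


def server_up(session_id, answer):
    return answer == "x7k2p"


result_down = captcha_passed(server_down, "session-1", "wrong")
result_ok = captcha_passed(server_up, "session-1", "x7k2p")
result_bad = captcha_passed(server_up, "session-1", "wrong")
```

What a site does next when the server is down (block the post, queue it for moderation, and so on) is a separate policy decision; the sketch only ensures that an outage is never mistaken for a correct answer.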
We believe that the combination of all these elements will significantly improve our ability to combat spam, and that they will form the platform that will carry Mollom to the next level. Stay tuned as we complete the roll-out of all our changes.
FrOSCon
This weekend, I will be giving the keynote presentation at FrOSCon in Sankt Augustin, Germany, near Bonn. There will be a dedicated Drupal developer room there on both Saturday and Sunday (August 22 and 23). FrOSCon is a relatively new, but rapidly growing conference covering open source technologies of all kinds. This is the fourth year that Drupal has been involved and it has become one of the most important Drupal events to attend every year in Germany.
On Saturday we'll be having a Drupal 7 code sprint, coordinated via IRC (#drupal at FreeNode) and Twitter so that local and remote developers can cooperate in real time. Angie Byron, my Drupal 7 co-maintainer, will be with us the whole time live on IRC from Canada. She is pulling an all-nighter for us, so let's make sure to get some hot patch action! Anyone whose time zone is remotely compatible is invited to join us – the sprint will run roughly from 10:00h to 18:00h Central European Summer Time (6 hours ahead of US Eastern).
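For remote participants, the sprint window can be converted to another zone with a few lines of Python. This is a small sketch under stated assumptions: it uses fixed summer-time offsets (UTC+2 for the sprint's local time, UTC-4 for US Eastern daylight time), and the year 2009 is inferred from the surrounding posts rather than stated in the announcement. The function name `sprint_window_in` is made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets valid in late August (summer time in both regions).
CEST = timezone(timedelta(hours=2), "CEST")  # sprint's local time
EDT = timezone(timedelta(hours=-4), "EDT")   # US Eastern, daylight time


def sprint_window_in(tz):
    """Return the approximate sprint start and end converted to `tz`."""
    start = datetime(2009, 8, 22, 10, 0, tzinfo=CEST)  # year is an assumption
    end = datetime(2009, 8, 22, 18, 0, tzinfo=CEST)
    return start.astimezone(tz), end.astimezone(tz)


start_edt, end_edt = sprint_window_in(EDT)
print(start_edt.strftime("%H:%M %Z"), "to", end_edt.strftime("%H:%M %Z"))
# prints "04:00 EDT to 12:00 EDT"
```

Swapping `EDT` for another `timezone(timedelta(hours=...))` gives the window for any other fixed offset.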
On Sunday, there will be several Drupal presentations in the developer room as well as lots of chances to meet other Drupalistas, get support, install Drupal on your machine and so on. There is a planning wiki for the Drupal side of the event at: http://groups.drupal.org/node/24601. My keynote presentation, "The Secrets of Building and Participating in Open Source Communities," will be at 12:45h on Sunday.
Keynote abstract:
Everyone knows that the most successful open source projects and vendors benefit from a thriving community. But how exactly is this done? In this session, Drupal project lead and Acquia co-founder and CTO Dries Buytaert will share his secrets for building and participating in a thriving open source community, including its commercial ecosystem. He'll describe the mindset, mechanisms, and practices that are essential for a thriving project and community.
Hope to see you there!

