Drupal, the semantic web and search

All major search engines, including Google and Yahoo!, are moving aggressively to capture structured data. This isn't exactly a surprise, because structured data provides a tremendous opportunity. Let's take the example of product search. Imagine the web as a huge database of millions of products, and search engines like Google and Yahoo! giving you a rich set of controls to filter by price, availability, color, shipping cost, user ratings, and more. Wouldn't it be great to be able to search all the world's products from a single page with a single interface? I think so too.

It is waiting to happen; we just have to connect the dots. That is, we have to make Drupal emit structured information.

Hundreds of thousands of Drupal sites contain vast amounts of structured data, covering an enormous range of topics, including product information. Unfortunately, that structure is hidden deep in Drupal's database and doesn't surface in the HTML code generated by Drupal. As a result, search engines can't recognize a product page as a product, and they fail to include it in their world-wide product database.

I first talked about the semantic web and Drupal in my DrupalCon keynote last year in Boston. In my presentation, I laid down the challenge that we need to put fields in core and make them first class citizens. Once fields are thus empowered, they can be associated with rich, semantic meta-data that Drupal could output in its XHTML as RDFa. For example, say we have an HTML textfield that captures a number, and that we assign it an RDF property of 'price'. Semantic search engines then recognize it as a 'price' field. Add fields for 'shipping cost', 'weight', 'color' (and/or any number of others) and the possibilities become very exciting. I envision a Drupal core CCK with the power to do just that.
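To make the 'price' example concrete, here is a minimal sketch of what "associating a field with an RDF property" could look like when markup is generated. It is written in Python purely for illustration and is not Drupal code. The mapping table, the `product:` prefix, and the function names are all hypothetical. Only the `property` and `about` attributes come from RDFa itself.

```python
from html import escape

# Hypothetical field-to-property mapping. The "product:" prefix and the
# property names are illustrative assumptions, not a real vocabulary.
FIELD_RDF_PROPERTIES = {
    "price": "product:price",
    "shipping_cost": "product:shippingCost",
    "color": "product:color",
}


def render_field(name, value):
    """Render one field as a span, adding an RDFa `property` attribute if mapped."""
    prop = FIELD_RDF_PROPERTIES.get(name)
    attr = f' property="{escape(prop, quote=True)}"' if prop else ""
    return f"<span{attr}>{escape(value)}</span>"


def render_node(about, fields):
    """Wrap the rendered fields in a div whose `about` names the described resource."""
    body = "\n".join(f"  {render_field(n, v)}" for n, v in fields.items())
    return f'<div about="{escape(about, quote=True)}">\n{body}\n</div>'
```

Fields without a mapping are rendered as plain spans, so the semantic layer stays optional: a site builder would only need to fill in the mapping for the fields that matter.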

Here is another example. Imagine a standard Drupal node-type called 'job'. The fields in the job node-type would have RDF properties associated with them mapping to salary, duration, industry, location, and so on. Creating a new job posting on a Drupal site would generate RDFa that semantic search engines like Yahoo!'s SearchMonkey would pick up and the job would be included in their world-wide job database.
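Seen from the consumer's side, picking up such a job posting could work roughly as sketched below. This is a deliberately naive illustration that uses only Python's standard library. The class and function names are hypothetical, and a real RDFa processor does much more: it resolves prefixes, honours the `content` attribute, and tracks nested subjects.

```python
from html.parser import HTMLParser


class PropertyExtractor(HTMLParser):
    """Collect the text of elements that carry a `property` attribute."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # property name of the element being read

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if prop:
            self._current = prop
            self.properties[prop] = ""

    def handle_data(self, data):
        if self._current:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._current = None


def extract_properties(html):
    """Return a dict mapping property names to their text values."""
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.properties
```

Fed a job page whose salary and location fields carry `property` attributes, `extract_properties` returns those values keyed by property name, which is the structured record a job search index would need.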

Technologies like this disintermediate so many existing websites and organizations that it makes my head spin. It is too great an opportunity for us to pass up. By adding semantic technology to Drupal core, I think we can make a notable contribution to the future of the web.

This kind of technology is not limited to global search. On a social networking site built with Drupal, it opens up the possibility to do all sorts of deep social searches - searching by types and levels of relationships while simultaneously filtering by other criteria. I was talking with David Peterson the other day about this, and if Drupal core supported FOAF and SIOC out of the box, you could search within your network of friends or colleagues. This would be a fundamentally new way to take advantage of your network or significantly increase the relevance of certain searches.
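The idea of searching by level of relationship while filtering on other criteria can be sketched as a breadth-first walk over "knows" relationships followed by an ordinary filter. The data shapes here are assumptions made for the example: a plain dict of who knows whom, and posts as dicts with `author` and `text` keys. They do not represent FOAF or SIOC data as Drupal would store it.

```python
from collections import deque


def network_within(knows, start, max_depth):
    """Return everyone reachable from `start` in at most `max_depth` hops."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested relationship level
        for friend in knows.get(person, ()):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, depth + 1))
    seen.discard(start)
    return seen


def search_in_network(posts, knows, user, keyword, max_depth=2):
    """Keep posts that match `keyword` and were written inside the user's network."""
    network = network_within(knows, user, max_depth)
    needle = keyword.lower()
    return [p for p in posts if p["author"] in network and needle in p["text"].lower()]
```

With `max_depth=1` the search covers direct friends only. Raising it widens the search to friends of friends, which is the "levels of relationships" knob described above.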

I can has semweb in Drupal core?

Mollom for SilverStripe

SilverStripe users, rejoice: Dieter Orens has created a Mollom module for SilverStripe, an Open Source CMS developed in PHP. The module is available from http://mollom.silverstripe.be. At this point, the module only implements Mollom's CAPTCHA service. Thanks Dieter!

ICANN using Drupal

ICANN (Internet Corporation for Assigned Names and Numbers), the non-profit organization that oversees the use of Internet domains, is using Drupal at http://public.icann.org. They are using Mollom too!


Addison Berry new Drupal documentation team lead

For the past few years the Drupal Documentation Team has been led by Steven Peck (sepeck). Steven was the first person to take on this role, and he has done a great job. Not only has he grown the documentation team to include a lot of talented and hard-working volunteer writers, he has overseen the restructuring and reorganization of Drupal.org's documentation handbooks, greatly improving their structure and accessibility. Thank you Steven for the great work!

Like so many Drupal contributors, Steven works on Drupal completely as a volunteer. His day job has been demanding a lot of time lately, and he has decided to step down from being the Documentation Team Leader. That means it is time to pass the torch to the next person who can then sprint with it for a while.

One great thing about the Drupal community is seeing people grow into new roles and take more responsibility upon themselves. This is certainly the case for Addison Berry (add1sun), who in her two years working with Drupal has become involved with virtually every aspect of the Drupal project. Lately Addi has been more and more active with the documentation team, making her a clear choice in my mind to carry on where Steven left off. I'm therefore very happy to announce that Addison Berry, aka add1sun, is the new documentation team leader, effective immediately. Keep up the great work, Addi!

Addison Berry at DrupalCon Yahoo! in Sunnyvale, 2007.

Ten million spam attempts blocked

Last weekend, Mollom blocked the ten millionth spam attempt. That is ten million tiny contributions to making the web a nicer place. We've been growing pretty fast. Drupal is still the main platform for users with Mollom subscriptions, with Joomla! coming second, and WordPress third. Milestone weekend!
