FAQ

How do you detect the software?

The system uses a set of proprietary rules and heuristics to detect what software powers a particular website. These rules look for fingerprints that are specific to each of the systems we detect.
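The actual rules are proprietary and not published here, so the following is only a rough sketch of what fingerprint-based detection can look like. The rule table, the patterns in it, and the function name `detect_software` are illustrative assumptions, not the rules the system really uses.

```python
import re

# Hypothetical fingerprint rules, for illustration only.
# Each system maps to a list of patterns; one match is enough to report it.
FINGERPRINTS = {
    "Drupal": [
        re.compile(r"/misc/drupal\.js"),
        re.compile(r'<meta name="generator" content="Drupal', re.I),
    ],
    "WordPress": [
        re.compile(r"/wp-content/"),
        re.compile(r'<meta name="generator" content="WordPress', re.I),
    ],
}


def detect_software(html):
    """Return the sorted names of all systems whose fingerprints appear in html."""
    return sorted(
        name
        for name, patterns in FINGERPRINTS.items()
        if any(pattern.search(html) for pattern in patterns)
    )


if __name__ == "__main__":
    page = '<html><head><script src="/misc/drupal.js"></script></head></html>'
    print(detect_software(page))
```

A page whose markup has been customized so that none of the patterns match would come back as an empty list, which is one way a false negative can arise.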

There are hundreds of publishing tools that we don't have detection rules for. Worse, the publishing tools we can detect aren't always detected correctly.

There could be several reasons for that. Fingerprints might have been lost due to modifications or customizations by the users of these tools. They can also be obscured by redirection or frame hell. Finally, the fingerprints can change from one version to another, and our system's rules might not have been updated yet. As a consequence, there can be false positives and false negatives.

That said, we try to be accurate and think we do an OK job. We're happy to improve our rules based on feedback and/or bug reports.

What sites do you crawl?

The crawler does not consider subsites (e.g. example.com/flowers). Furthermore, the crawler attempts to filter out URLs that might belong to spam networks. Uncommon or fairly unique subdomains (e.g. flowers.example.com) are discarded. In the vast majority of cases, these belong to spam networks, but we realize that this excludes many bloggers.
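To make the filtering above concrete, here is a minimal sketch of such a URL filter. It is not the crawler's real code: the `COMMON_SUBDOMAINS` allowlist and the function name `should_crawl` are assumptions, and the sketch naively treats the last two labels of a hostname as the registered domain, which is wrong for domains like example.co.uk.

```python
from urllib.parse import urlparse

# Hypothetical allowlist of subdomains considered common enough to keep.
COMMON_SUBDOMAINS = {"www"}


def should_crawl(url):
    """Return True if the URL looks like a top-level site worth crawling."""
    parts = urlparse(url)
    host = parts.hostname or ""
    if not host:
        return False  # no usable hostname

    # Subsites such as example.com/flowers are not considered.
    if parts.path not in ("", "/"):
        return False

    labels = host.split(".")
    if len(labels) <= 2:
        return True  # bare domain such as example.com

    # Naive assumption: everything before the last two labels is the subdomain.
    subdomain = ".".join(labels[:-2])
    return subdomain in COMMON_SUBDOMAINS


if __name__ == "__main__":
    for candidate in (
        "http://example.com/",
        "http://example.com/flowers",
        "http://flowers.example.com/",
    ):
        print(candidate, should_crawl(candidate))
```

A filter this blunt shows why legitimate blogs hosted on their own subdomain get excluded along with the spam networks.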

How often does it crawl my site?

Not often. Maybe a handful of times a year.

© 1999-2007 Dries Buytaert. Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.
Drupal is a Registered Trademark of Dries Buytaert.