In order to comment on a Terry post, you have to fill out a form which looks like this:
If you’ve spent any time at all on the interwebs, you’re probably fairly familiar with this kind of thing. They’re called CAPTCHAs, which stands for “Completely Automated Public Turing-test to tell Computers and Humans Apart”, a rather awkward acronym which nonetheless admirably describes their function: screening out the automated scripts which might otherwise be hawking their manhood-enhancing wares on our poor, unsuspecting readers. The basic idea behind CAPTCHAs is to use a test which is easy for humans but very hard for current AI systems, such as reading highly distorted text.
But the specific CAPTCHA used by Terry, and by many other sites around the web, is of a very special variety. It is called ReCAPTCHA, and is the work of a group at CMU led by the brilliant Luis von Ahn, whose goal is to put the work people do when filling out CAPTCHAs to a useful purpose. Believe it or not, every day an estimated 150,000 man-hours are spent world-wide filling out these infernal boxes! Wouldn’t it be great if that time could be spent doing something useful? That’s the idea behind ReCAPTCHA…
What ReCAPTCHA does is combine bot-filtering with another useful project: the digitization of hard-copy texts such as old books. Modern OCR is highly accurate (>99%), but there are still cases where OCR software is unable to read a given word accurately, usually as the result of some damage or distortion to the text itself. In these cases, the given word is converted into an image for ReCAPTCHA and fed to a human, who can succeed where the computer failed. So every time you fill out a ReCAPTCHA, you are helping to digitize and preserve old books!
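To make the idea a little more concrete, here is a minimal Python sketch of how several people's answers for one OCR-failed word might be combined into a single transcription. This is only an illustration: the function name `resolve_word`, the vote count, and the agreement threshold are all assumptions of mine, not a description of how ReCAPTCHA actually does it.

```python
from collections import Counter


def resolve_word(transcriptions, min_votes=3, min_agreement=0.6):
    """Return the transcription most humans agree on, or None if undecided.

    min_votes and min_agreement are hypothetical tuning knobs, not real
    ReCAPTCHA parameters.
    """
    # Normalize answers so "Morning " and "morning" count as the same vote.
    # (Lowercasing discards capitalization; that is a simplification.)
    cleaned = [t.strip().lower() for t in transcriptions if t.strip()]

    # Not enough human answers yet: keep showing the word to more people.
    if len(cleaned) < min_votes:
        return None

    word, count = Counter(cleaned).most_common(1)[0]

    # Accept the top answer only if a clear majority typed it.
    if count / len(cleaned) >= min_agreement:
        return word
    return None


if __name__ == "__main__":
    answers = ["morning", "Morning ", "morning", "moming"]
    print(resolve_word(answers))
```

The point of the threshold is that no single person (or bot) gets to decide what the book says; a word only counts as "read" once enough independent humans agree on it.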
This is an incredibly clever example of a new field of development called Human Computation (also called crowdsourcing). The idea is to out-source certain elements of computation to humans, who can perform these tasks better than the computer. The challenge comes in creating the incentive for a human to participate. One technique, as used in ReCAPTCHA, is to harness work which humans are doing anyway, such as filling out CAPTCHAs. Another, also used by von Ahn’s group, is to turn the work into a game, making fun the incentive! To this end they created gwap (Games With A Purpose), a site devoted to games which accomplish useful work, from image-tagging (useful for improving image web-search and making the web more accessible to the visually impaired) to text summarization. By their estimates, if their image-tagging game ESP were played as much as popular Flash games such as Bejeweled, all images on the internet would be completely tagged in a matter of months. The power of procrastination, properly managed, is truly a wonder to behold!
But the power of the Human Computation paradigm extends beyond those applications for which it is explicitly designed. Many social networking sites can be seen as a form of Human Computation. For example, sites such as Digg or StumbleUpon act as a powerful filter for the vast sea of content available online: only the best content bubbles to the top (in theory anyway…). Furthermore, the large data collection of Audioscrobbler and Last.fm acts as a form of music-similarity algorithm, simply by clustering artists based on the people who listen to them. Dave previously wrote about the power of Google Trends to predict flu epidemics. There is exciting potential here.
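The music-similarity example above can be sketched in a few lines of Python. This is a toy illustration under my own assumptions: it scores two artists by the overlap between their sets of listeners (Jaccard similarity), and the function names and listener data are made up. It is not how Last.fm or Audioscrobbler actually computes similarity.

```python
def jaccard(a, b):
    """Overlap between two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(artist, listeners_by_artist):
    """Rank every other artist by how much their audience overlaps with `artist`."""
    target = listeners_by_artist[artist]
    scores = [
        (other, jaccard(target, listeners))
        for other, listeners in listeners_by_artist.items()
        if other != artist
    ]
    # Highest overlap first.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical listening data: artist -> set of user ids.
    listeners_by_artist = {
        "Artist A": {"u1", "u2", "u3"},
        "Artist B": {"u2", "u3", "u4"},
        "Artist C": {"u5"},
    }
    print(most_similar("Artist A", listeners_by_artist))
```

Nobody in this picture sat down and labelled two artists as similar; the similarity falls out of what people were listening to anyway, which is exactly the Human Computation trick.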
Human Computation is an incredibly powerful idea, and it will yield more interesting and useful applications as the techniques mature. If anyone can think of ways we could harness the power of procrastination to solve the problems we discuss on Terry, we could really be in business!