Ever heard of CAPTCHA? You probably have, even if you aren’t familiar with the acronym. CAPTCHA (or Completely Automated Public Turing test to tell Computers and Humans Apart) is a sort of challenge-response test that lets you prove that you are, in fact, not a computer. The test is simple: an image is displayed, usually containing warped or otherwise fuzzy letters, and you’re supposed to tell the system what it says. It turns out that computers aren’t that good at it (less than 80% accurate), but we humans are aces at it (over 99% accurate), so it’s a remarkably simple and effective way to separate the sentient from the not-so-sentient.
CAPTCHA is commonly used by millions of web sites that want to block automated programs from exploiting their services. For instance, Ticketmaster uses CAPTCHA to ensure that a scalper-bot doesn’t swoop in and buy up all the good seats. Many blog sites use CAPTCHA to prevent spam in feedback. The list goes on and on.
With millions of web sites relying on CAPTCHA, there are estimated to be over 100 million transactions per day. Now, a project called ReCAPTCHA has found an ingenious way to harness all this hard work for a great cause. ReCAPTCHAs could be used to transcribe the contents of scanned text at the rate of about 160 books per day.
Traditional CAPTCHA checks the user’s input against a known value that corresponds to the image, but since OCR (optical character recognition) software couldn’t read the scanned text in the first place, ReCAPTCHA doesn’t know what it’s presenting to the user. So how is it supposed to know whether the response is accurate or not? It pairs the unknown word with a known "control" word, and assumes that if you can read one word accurately, then you can read the other one, too. By presenting the same unknown image to several users and comparing their answers, ReCAPTCHA eventually "learns" the correct answer.
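To make the pairing idea concrete, here is a minimal sketch in Python. It is not ReCAPTCHA's actual code: the class name, the `votes_needed` threshold of 3, and the simple "most common answer wins" rule are all assumptions made for illustration. It only shows the two steps described above: check the control word first, then count the answer for the unknown word as one vote.

```python
from collections import Counter, defaultdict


def normalize(text):
    """Compare answers case-insensitively and ignore stray whitespace."""
    return text.strip().lower()


class WordPairChallenge:
    """Hypothetical sketch of control-word checking plus vote counting."""

    def __init__(self, votes_needed=3):
        # Assumed threshold: matching answers required before we trust a word.
        self.votes_needed = votes_needed
        self.votes = defaultdict(Counter)  # unknown image id -> answer counts
        self.resolved = {}                 # unknown image id -> accepted answer

    def submit(self, control_answer, control_truth, unknown_id, unknown_answer):
        """Return True if the user passed the control word.

        Only users who read the control word correctly get to vote on the
        unknown word.
        """
        if normalize(control_answer) != normalize(control_truth):
            return False

        self.votes[unknown_id][normalize(unknown_answer)] += 1

        # Accept the leading answer once enough users agree on it.
        answer, count = self.votes[unknown_id].most_common(1)[0]
        if count >= self.votes_needed:
            self.resolved[unknown_id] = answer
        return True
```

A user who misreads the control word fails the test and casts no vote. A user who gets it right passes, and the guess for the unknown image is tallied. Once three users (in this sketch) give the same answer for an image, that answer is treated as the transcription.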
ReCAPTCHA is available to anyone who wants to use it on their site, and the project team has even developed plug-ins for popular blogging engines like WordPress. They even have a tool that obfuscates your e-mail address, so you can confidently provide trackbacks on comments that you leave.
Give it a try. You’ll help make countless tomes of wisdom digitally accessible for future generations.