How to create an unbreakable CAPTCHA
By now, we're all too familiar with CAPTCHAs, those random bits of characters that look something like this:
We're also all too familiar with how frustrating it is to read mutated or barely recognizable characters and type them into a text box. Even better, these images, which are meant to prevent scripts from mass-mailing or mass-spamming, aren't even doing what they're supposed to anymore:
http://www.technologyreview.com/web/21519/page1/
There is (sadly) a lot of money to be made from spamming, so a lot of research goes into creating scripts that can "read" the characters in the image.
But there is another way - a way that's easier than having to read badly mangled characters, and impossible for computers to crack.
The Google Image Labeler, based on an idea developed by Luis von Ahn, is the key to creating an easy yet unbreakable CAPTCHA.
It’s clear that computers are not at the point where they can decipher things within an image, and they won’t be for a long time. In other words, the hurdle that led Luis to create his word recognition games (computers being unable to work out what’s in an image) could also be the hurdle that stops a new kind of CAPTCHA from being broken.
This is how this new “CAPTCHA” would work. An image from the Image Labeler database is displayed. The user is given a text box and, much like in the game, has to enter a word that describes the image or anything in the image. In other words, we ask the user to play one quick round of the game.
The user enters a word, and we check whether it’s one of the valid words that have already been used to describe the image. If it is, the user passes. If not, we show a new image and ask them to try again.
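The check described above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample labels are made up here, and a real system would look the labels up in the Image Labeler database rather than in a hard-coded set.

```python
def normalize(word: str) -> str:
    """Trim whitespace and lower-case so that ' Puppy' and 'puppy' compare equal."""
    return word.strip().lower()


def check_answer(word: str, known_labels: set[str]) -> bool:
    """Return True if the submitted word is one of the image's known labels."""
    return normalize(word) in {normalize(label) for label in known_labels}


# Hypothetical labels for one image (not taken from any real database).
labels = {"dog", "grass", "puppy"}
print(check_answer("  Puppy", labels))  # a known label, so the user passes
print(check_answer("car", labels))      # not a known label, so show a new image
```

Normalizing both sides keeps the test forgiving for humans (stray spaces, capital letters) without making it any easier to guess.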
This is obviously far easier than trying to read cryptic characters that are blocked by lines or written in strange colors, while at the same time making it nearly impossible for any computer to crack. And unlike most "image CAPTCHAs" out there, the database would contain millions of photos, so caching the words for each image would not be a feasible attack.
This new CAPTCHA system could be offered by Google through an API, so that any website would have access to it. The API would simply hand out an image URL and then validate the input word; users would never be given access to the list of words that describe the image. Obviously there would be a language barrier at first, but in time (as the database is built up), I think this would be overcome.
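To make the shape of such an API concrete, here is a rough in-memory sketch. No such Google API exists as far as this post is concerned, so every name below (`ImageCaptchaService`, `new_challenge`, `verify`, the example URL) is hypothetical. The point it illustrates is that the caller only ever sees an image URL and an opaque challenge id, while the word list stays on the server.

```python
import secrets


class ImageCaptchaService:
    """Toy stand-in for the hosted service; all names here are hypothetical."""

    def __init__(self, labels_by_url: dict[str, set[str]]):
        # Maps image URL -> lower-case labels. Never sent to the client.
        self._labels_by_url = labels_by_url
        # Maps challenge id -> image URL for challenges not yet answered.
        self._pending: dict[str, str] = {}

    def new_challenge(self) -> dict[str, str]:
        """Pick a random image and return only its URL and an opaque id."""
        url = secrets.choice(list(self._labels_by_url))
        challenge_id = secrets.token_urlsafe(16)
        self._pending[challenge_id] = url
        return {"challenge_id": challenge_id, "image_url": url}

    def verify(self, challenge_id: str, word: str) -> bool:
        """Check a word against the image's labels; each challenge is single-use."""
        url = self._pending.pop(challenge_id, None)
        if url is None:
            return False  # unknown or already-used challenge
        return word.strip().lower() in self._labels_by_url[url]


service = ImageCaptchaService({"https://example.com/photo1.jpg": {"dog", "grass"}})
challenge = service.new_challenge()
print(challenge["image_url"])
print(service.verify(challenge["challenge_id"], "Dog"))
```

Making each challenge single-use is a design choice of this sketch, not something the idea requires; it simply stops a script from hammering one image with a dictionary of guesses.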
Additionally, the unmatched inputs into the CAPTCHA system could be used to further build the image keyword database.
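One way this feedback loop might look is sketched below. The promotion rule is an assumption added for illustration: a missed word only becomes an accepted label once several different users have entered it for the same image, so that a single spammer cannot teach the database a bogus word. The class name, method name, and threshold of 3 are all made up.

```python
from collections import defaultdict

# Assumed rule: this many distinct users must enter the same unmatched word
# before it is trusted as a label. The number is arbitrary.
PROMOTION_THRESHOLD = 3


class LabelStore:
    """Hypothetical store of accepted labels plus candidate words per image."""

    def __init__(self):
        self.labels = defaultdict(set)  # image id -> accepted words
        # image id -> candidate word -> ids of users who entered it
        self.candidates = defaultdict(lambda: defaultdict(set))

    def record_miss(self, image_id: str, word: str, user_id: str) -> None:
        """Remember an unmatched word; promote it once enough users agree."""
        word = word.strip().lower()
        if word in self.labels[image_id]:
            return  # already an accepted label, nothing to learn
        voters = self.candidates[image_id][word]
        voters.add(user_id)
        if len(voters) >= PROMOTION_THRESHOLD:
            self.labels[image_id].add(word)
            del self.candidates[image_id][word]


store = LabelStore()
for user in ("u1", "u2", "u3"):
    store.record_miss("img1", "Kitten", user)
print(store.labels["img1"])
```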
Some samples follow. These images were grabbed straight out of the Google Image Labeler game.
For this image, the Image Labeler database already has the following entries:
So, if a user enters any of these words (there's not much else they could enter), they would pass the CAPTCHA.
Another sample:
Google Image Labeler Database:
Again, the user would likely pick one of these six words, as there's not much else to choose from.
For a human, picking words that describe the picture is quick (far easier than reading 6 characters). Show these images to even a smart AI, though, and it will have great difficulty coming up with these words.
The perfect CAPTCHA! Easy for humans, nearly impossible for computers. And if hackers try to get "smart" and start simply guessing common words like colors, well then, make it two images instead of just one. Still simple for humans while leaving almost no room for chance.
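A quick back-of-the-envelope calculation shows why a second image helps. The 5% figure below is invented purely for illustration, and the calculation assumes that a guess succeeding on one image says nothing about whether it succeeds on the next.

```python
def guess_success_rate(per_image_rate: float, images: int) -> float:
    """Chance that a blind guesser passes every image, assuming independent images."""
    return per_image_rate ** images


# Hypothetical: a script guessing a common word matches 5% of single images.
for n in (1, 2, 3):
    print(f"{n} image(s): {guess_success_rate(0.05, n):.4%}")
```

Under those assumptions, each extra image multiplies the guesser's odds by the per-image rate, while a human only has to type one more word.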

What do you think?



