Tuesday 26 February 2008

CAPTCHA and the Next Generation.

If Ray Kurzweil is to be believed (and I for one subscribe to his core hypothesis), we should at some point in the not-too-distant future (say within 10 years) bear witness to the birth of the next generation of life on this planet.

I don't really mean anything as outrageous as robots storming Parliament or the White House. The coming Singularity will arrive in a series of jumps, or shifts. In the past, these jumps, or bullet points in history, have been milestones such as the adoption of language. Once language had established itself, we evolved to the stage where it could be visually recorded. Then came the ability to record the sound of language itself. Not long after this, we invented computers. Very shortly after that, the Internet blossomed into existence, allowing the completely free, global exchange of pretty much any kind of information: written and spoken language, as well as the more recent incarnations of communication, photographs and moving images.

The communication explosion is upon us. There is a global community out there ravenously feeding itself with the writings, images, sounds and movies of countless millions of people. Cultural mashups are already happening, enabled by the thrust of this new technology. But where are the changes that are necessary to allow this explosion to accelerate toward the next step?

The changes are already happening. One example is the recent cracking of the CAPTCHA test we all know and hate, which exists to prevent the automatic creation of email and other online accounts by computers.

Previously, any website with a 'Register' page was vulnerable to attack by another computer, which would visit the page and 'make up' a whole profile, complete with name, email address and other details. The attacking computer(s) would then use the freshly created fake profile to advertise goods or services on the victim site's message boards and forums, or through email and personal messaging.

To combat this practice, the CAPTCHA test was created. The test, which typically presents users with a difficult-to-read sequence of distorted letters and asks for those letters to be typed in alongside the registration information, allows the site to differentiate between a computer trying to create an account and a real person doing so. It works because computers are not very good at converting such an image back into text, whereas humans recognise these patterns and pass the test with relative ease. The test therefore prevents computers from registering many thousands of email accounts every day.
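
To make the idea concrete, here is a minimal sketch of the image-generation side of such a test, assuming the Pillow imaging library is available; the letter placement, noise lines and file name are illustrative choices of mine, not a description of any real CAPTCHA implementation.

    import random
    import string
    from PIL import Image, ImageDraw, ImageFont  # Pillow, assumed installed

    def make_captcha(length=5, size=(160, 60)):
        """Render random letters with jitter and noise so that simple OCR
        struggles while a human can still read them."""
        answer = "".join(random.choices(string.ascii_uppercase, k=length))
        img = Image.new("RGB", size, "white")
        draw = ImageDraw.Draw(img)
        font = ImageFont.load_default()

        # Draw each letter at a slightly randomised position.
        for i, ch in enumerate(answer):
            x = 15 + i * 28 + random.randint(-3, 3)
            y = 20 + random.randint(-8, 8)
            draw.text((x, y), ch, fill="black", font=font)

        # Add noise lines to frustrate naive character segmentation.
        for _ in range(6):
            start = (random.randint(0, size[0]), random.randint(0, size[1]))
            end = (random.randint(0, size[0]), random.randint(0, size[1]))
            draw.line([start, end], fill="gray", width=1)

        return answer, img

    # The site keeps `answer` in the session, shows the image to the visitor,
    # and compares what they type against the stored answer on submission.
    answer, image = make_captcha()
    image.save("captcha.png")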

However, the news that a group of hackers has beaten the CAPTCHA system on Google's Gmail registration page is, as well as being a potential security and privacy threat, one example of how the Internet is getting more intelligent, even if for the moment that intelligence still comes from people.

Consider, though, that with the breaking of CAPTCHA the technology now clearly exists to let computers read text in images, even when it is deliberately scrambled as in these tests. Marry that software to a database of images (say Google Images, which is itself a text-based image search engine, or Microsoft's unbelievable Photosynth web-based image processing application) and you have a level of cross-referencing and machine-based semantic processing well above anything we have now.
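
By way of illustration (and not a description of how the Gmail crack actually worked), this is roughly what 'reading text in an image' looks like with off-the-shelf tools, assuming the open-source Tesseract OCR engine is installed; the file names are placeholders of mine.

    import subprocess

    # The Tesseract OCR engine, driven from Python: given a scanned image,
    # it writes its best guess at the text to out.txt.
    subprocess.call(["tesseract", "scanned_page.tif", "out"])
    print(open("out.txt").read())

    # Run the same command against a deliberately scrambled CAPTCHA image and
    # the output is gibberish; closing exactly that gap is what makes the
    # Gmail story significant.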

Of course, we've had text recognition software for a while now. Number plate recognition and OCR systems that read cheques and other machine-readable stationery have been around for many years. The difference is that now any computer connected to the World Wide Web has access not only to images, but increasingly to the means to understand those images and their text-based content. It is easy to imagine a computer algorithm refining itself by learning from images scraped from the Web, bypassing the need for a human programmer to refine it.
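
A minimal sketch of that sort of loop, with a stand-in 'scraper' that returns synthetic labelled data (a real system would be pulling captioned images off the Web) and scikit-learn's SGDClassifier as the learner; none of this reflects any particular real system.

    import random
    from sklearn.linear_model import SGDClassifier  # scikit-learn, assumed installed

    LABELS = ["cat", "dog", "car"]

    def scrape_batch(n=50):
        """Stand-in for a Web scraper: returns (feature-vector, caption) pairs.
        Here the 'image features' are synthetic numbers clustered by label."""
        batch = []
        for _ in range(n):
            label = random.choice(LABELS)
            centre = LABELS.index(label)
            features = [centre + random.gauss(0, 0.3) for _ in range(4)]
            batch.append((features, label))
        return batch

    # The learner refines itself batch by batch with no human re-tuning:
    # every freshly scraped batch nudges the model's weights in place.
    model = SGDClassifier()
    for step in range(20):
        batch = scrape_batch()
        X = [features for features, _ in batch]
        y = [label for _, label in batch]
        model.partial_fit(X, y, classes=LABELS)

    print(model.predict([[0.1, -0.1, 0.0, 0.2]]))  # should lean towards "cat"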

This is just a small step, but how long will it be before people lose touch with exactly how these processes work? I know they are already some way beyond my own ability to understand them. The complex network of friends, applications and connections hosted by Facebook is utterly beyond me, and maybe even partially beyond Facebook themselves: it seems to be quite difficult to completely remove any one user from its database, just as it is difficult to remove a single memory from a human mind.

Grouping people and applications, or words and images, in a way that contextualises them is a difficult problem, and I believe that, at some point, the very people who laid the foundations of the systems now solving such problems will come to rely on those systems rather than create them.
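
As a toy illustration of what 'grouping in a way that contextualises' can mean at its very simplest, here is a sketch that clusters short text snippets so related ones end up in the same bucket, assuming scikit-learn is available; the snippets and the cluster count are invented for the example.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    # Invented snippets standing in for captions, profiles or page fragments.
    snippets = [
        "cheap watches buy now limited offer",
        "discount pills buy online special offer",
        "photos from our walking holiday in the Lake District",
        "holiday photos of the kids on the beach",
    ]

    # Turn each snippet into a weighted word-frequency vector, then group the
    # vectors so that similar snippets land in the same cluster.
    vectors = TfidfVectorizer().fit_transform(snippets)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

    for text, label in zip(snippets, labels):
        print(label, text)
    # The spammy snippets should land in one cluster and the holiday photos
    # in the other.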

At that point, we relinquish a level of control of the Internet to machines. Grandiose as that may sound, the changes are happening, even if at a subtle and seemingly unimportant level. A lot of information is already on the web, and I for one know that I rely on the web as well as my own skill set and memory to live my life and do my job. The line between the Internet and us is blurring on many levels at an increasing rate. In 1990, the Web had only just been invented, and viewing a static page of information was about the extent of what it could do. In 2008, just 18 years later, we can organise large events with it. The September 11th attacks are widely believed to have been organised, at least in part, using the Internet and its various services. Just over six years on, the Web is already a level ahead of what was available then: YouTube, MySpace, Facebook and the wider explosion of online social media did not yet exist when the planes hit the Twin Towers of New York's World Trade Center.

We are in the middle of a fantastic age of information, technology and communication.

Hopefully, something wonderful will happen to us through its global adoption.
