Looking Behind the Captcha-Cracking Scenes

So how does one crack a Captcha? Plain old OCR software, like what comes with your scanner, doesn’t do the job.
I couldn’t find any answers from the Russian underground community that’s doing a lot of the cracking, but there are some white hats who expose aspects of their methods.
PWNtcha is a Captcha-cracking project that reports high success against some of the major ones. Its methods are secret lest they get out into the wild, but it seems to involve manually analyzing a Captcha’s style (its font choice, character position, deformation pattern, rotation, background, etc.) and then using image-correction tools to turn the image into normal-looking text, which can be successfully machine-read.
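PWNtcha’s actual methods are not public, so the following is only a guess at what one small image-correction step might look like: thresholding a grayscale image so that a light, noisy background drops away and only the dark glyph strokes remain. The `binarize` function, the threshold of 128, and the sample pixel values are all assumptions made for illustration.

```python
# Illustrative sketch only; not PWNtcha's method, which is unpublished.

def binarize(gray, threshold=128):
    """Turn rows of 0-255 gray values into rows of '#' (ink) and '.' (background).

    Pixels darker than `threshold` are treated as ink. The threshold value
    is an assumption; a real Captcha would need it tuned per style.
    """
    return ["".join("#" if p < threshold else "." for p in row) for row in gray]


# Hypothetical 3x3 image: a dark vertical stroke on a light, uneven background.
SAMPLE = [
    [220, 30, 200],
    [180, 25, 210],
    [230, 40, 190],
]

if __name__ == "__main__":
    for row in binarize(SAMPLE):
        print(row)
```

Real Captchas also rotate and warp their characters, which a plain threshold does nothing about; undoing that is where the manual, per-Captcha analysis would come in.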
Jeff Yan and Ahmad Salah El Ahmad have published a PDF describing a segmentation-then-recognition approach. In some cases, evidently, a simple pixel count suffices to guess what a letter’s supposed to be: M has more pixels than I.
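To make the segmentation-then-recognition idea concrete, here is a minimal sketch, not the code from the Yan and El Ahmad paper. It assumes an already-cleaned binary image, splits it into glyphs wherever a column contains no ink, and then guesses each letter purely from its ink-pixel count. The two-letter toy font, the function names, and the lookup table are all hypothetical.

```python
# Illustrative sketch only; a toy stand-in for segmentation-then-recognition.

def segment_columns(image):
    """Split a binary image (equal-length strings, '#' = ink) at blank columns."""
    width = len(image[0])
    has_ink = [any(row[x] == "#" for row in image) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a glyph begins here
        elif not ink and start is not None:
            spans.append((start, x))       # a blank column ends the glyph
            start = None
    if start is not None:
        spans.append((start, width))       # glyph runs to the right edge
    return [[row[a:b] for row in image] for a, b in spans]


def pixel_count(glyph):
    """Count the ink pixels in one segmented glyph."""
    return sum(row.count("#") for row in glyph)


# Hypothetical toy font: a 1-pixel-wide I and a 5-pixel-wide M.
FONT = {
    "I": ["#", "#", "#", "#", "#"],
    "M": ["#...#", "##.##", "#.#.#", "#...#", "#...#"],
}

# Lookup table from ink-pixel count to letter, built from the toy font.
TABLE = {pixel_count(glyph): letter for letter, glyph in FONT.items()}


def recognize(image, table=TABLE):
    """Segment the image, then map each glyph's pixel count to a letter."""
    return "".join(table.get(pixel_count(g), "?") for g in segment_columns(image))


# A toy "Captcha" spelling MI, with one blank column between the letters.
CAPTCHA = [
    "#...#.#",
    "##.##.#",
    "#.#.#.#",
    "#...#.#",
    "#...#.#",
]

if __name__ == "__main__":
    print(recognize(CAPTCHA))
```

A pixel-count table only works when every letter in the font has a distinct count and the characters do not touch; once glyphs overlap, splitting on blank columns fails, which is presumably why segmentation is the hard part.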
aiCaptcha dates from 2005.
Here is a demonstration of the Shape Contexts approach to breaking EZ-Gimpy, done by Mori and Malik back in 2002.
I’d be very curious to see details of the techniques used against the much harder Google and Microsoft tests.
See Also:
- Is Captcha’s Moment Passing?
- For Certain Tasks, the Cortex Still Beats the CPU
- ReCaptcha: Fight Spam And Digitize Books
Melted From: Wired: Compiler
Tags: ahmad, character position, cortex, deformation pattern, font choice, gimpy, google, image correction, job, microsoft, microsoft tests, ocr software, pixel, salah, scanner, segmentation, shape contexts, underground community, white hats, yan
Tue, 2nd December 2008
