Volume 12 / Issue 12

DOI: 10.3217/jucs-012-12-1783


Persian/Arabic Baffletext CAPTCHA

Mohammad Hassan Shirali-Shahreza (Yazd University, Iran)

Mohammad Shirali-Shahreza (Sharif University of Technology, Iran)

Abstract: Nowadays, many daily human activities, such as education, trade, and communication, are carried out over the Internet. In tasks such as registration on web sites, hackers write programs that perform automated false registrations, which waste the sites' resources and may even stop them from functioning. Human users must therefore be distinguished from computer programs. To this end, this paper presents a method for distinguishing Persian- and Arabic-language users from computer programs, based on Persian and Arabic text. The proposed algorithm adds a background to the image of a randomly generated, meaningless Persian/Arabic word. The method relies on the difficulty of automatically separating the background from Persian/Arabic writing, which contains many diacritical dots and signs. The image of the word is shown to the user, who is asked to type it. Because currently available Persian and Arabic OCR programs cannot identify these words, only a user who reads Persian or Arabic can identify them. The method can therefore be used to prevent automated attacks, resource waste, and performance degradation. The proposed method has been implemented in Java. The generated words were tested with the ReadIris and OmniPage OCR systems, neither of which was able to recognize them.

Keywords: BaffleText, Persian and Arabic text, completely automated public Turing test to tell computers and humans apart (CAPTCHA), internet security, optical character recognition (OCR)

Categories: I.4.0, I.4.5
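
The steps outlined in the abstract (generate a random meaningless word, place it on a background, ask the user to type it) might be sketched in Java as follows. This is only an illustrative sketch and not the authors' implementation: the class and method names, the chosen subset of dotted letters, the speck-style background, and the font settings are all assumptions made for the example. The abstract does not specify what kind of background the method uses.

```java
import java.awt.Color;
import java.awt.Font;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.util.Random;

// Hypothetical sketch of a BaffleText-style CAPTCHA; all names are illustrative.
class BaffleTextSketch {
    // Assumed alphabet: a subset of Persian/Arabic letters that carry dots
    // (beh, peh, teh, theh, jeem, tcheh, khah, thal, zain, jeh, sheen,
    // dad, zah, ghain, feh, qaf, noon), written as Unicode escapes.
    static final String LETTERS =
        "\u0628\u067E\u062A\u062B\u062C\u0686\u062E\u0630\u0632"
      + "\u0698\u0634\u0636\u0638\u063A\u0641\u0642\u0646";

    // Builds a meaningless word by drawing letters uniformly at random.
    static String randomWord(Random rng, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(LETTERS.charAt(rng.nextInt(LETTERS.length())));
        }
        return sb.toString();
    }

    // Fills the image with white, then scatters small black squares.
    // The specks are an assumed background, chosen because they resemble
    // the dots of the script and so are hard to separate from the word.
    static void addBackground(BufferedImage img, Random rng) {
        int w = img.getWidth();
        int h = img.getHeight();
        Graphics2D g = img.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, w, h);
        g.setColor(Color.BLACK);
        int specks = (w * h) / 40;
        for (int i = 0; i < specks; i++) {
            int size = 1 + rng.nextInt(3); // speck side of 1 to 3 pixels
            g.fillRect(rng.nextInt(w), rng.nextInt(h), size, size);
        }
        g.dispose();
    }

    // Draws the word in black over the existing image content. Correct
    // rendering depends on an installed font that has Arabic glyphs.
    static void drawWord(BufferedImage img, String word) {
        Graphics2D g = img.createGraphics();
        g.setColor(Color.BLACK);
        g.setFont(new Font(Font.SERIF, Font.BOLD, img.getHeight() / 2));
        g.drawString(word, img.getWidth() / 10, (img.getHeight() * 2) / 3);
        g.dispose();
    }

    // Compares the user's answer with the generated word, ignoring
    // leading and trailing whitespace.
    static boolean matches(String expected, String typed) {
        return typed != null && expected.equals(typed.trim());
    }
}
```

A caller would create a `BufferedImage` of type `TYPE_INT_RGB`, pass it to `addBackground` and then to `drawWord` with the result of `randomWord`, show the image to the user, and check the typed answer with `matches`. Keeping the generated word on the server and sending only the image is the usual design for such a test.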