
CHAPTER 2: LITERATURE REVIEW

2.1 Introduction

Document image analysis covers the algorithms that transform documents into an electronic format suitable for storage, retrieval, searching, and updating. Optical Character Recognition (OCR) systems convert graphical images into editable text. OCR technology is now widely used, and research and development on its applications is ongoing (Parvez & Mahmoud, 2013).

2.1.1 Printed / Handwritten Text

OCR systems are categorized into two main groups based on the type of input data entered into the system: "printed" or "handwritten". In the printed mode, text produced by machines such as typewriters, computer keyboards, and printers is considered as input data. In this mode, the inputs usually have good quality because machines generate them. Hence, the recognition process is simpler than in the second group, and the efficiency of such systems is usually noticeable. These systems are generally used for recognizing printed documents such as books, newspapers, and other similar documents.

In contrast, handwritten documents are produced by different people in different situations. Hence, handwritten character recognition is considered one of the most challenging and exciting areas of research in the Pattern Recognition (PR) domain. This is partly due to the diversity of sizes, orientations, thicknesses, and dimensions of characters in handwritten texts, resulting from the different writing habits, styles, educational levels, moods, health status, and other conditions of the writers. In addition, other factors such as the writing instruments, writing surfaces, and scanning methods, along with problems such as unwanted overlapping of characters in sentences, in any language, make handwritten script recognition very difficult.

Based on whether certain limitations and rules are observed, handwritten texts are divided into two sub-categories: constrained and unconstrained. There are some regularities in constrained writing styles. For this reason, recognition of this type of text is faster, easier, and more accurate in comparison with unconstrained texts. However, the general shapes of characters in this case are not similar to real-world writing; for example, character dimensions, character slants, and so on have been predefined.

Recognition of handwritten texts is much more difficult than recognition of printed texts (Ghods & Kabir, 2013a). Different writing styles lead to distortion of the input patterns away from the standard patterns (Mandal & Manna, 2011). Therefore, unlike printed OCR systems, which have matured, handwritten OCR systems are still an open research area, and there is a long way to their final goals. Consequently, OCR systems for handwritten texts do not perform as well as OCR systems for printed texts (Mandal & Manna, 2011; Fouladi, Arrabi & Kabir, 2013). Undoubtedly, OCR systems for printed texts are more established, while OCR systems for handwritten texts continue to attract more research effort (Alginahi, 2012).

2.1.2 Online / Offline OCR Systems

Generally, OCR systems are divided into two main groups, "offline" and "online", based on when the recognition operation is carried out. In the offline method, recognition is performed after the writing or printing process is completed, but in online systems, recognition is carried out at the same time the data are entered into the system (Shah & Jethava, 2013).

In online OCR systems, information is imported into the system using a digitizing tablet and a stylus pen. At every moment, the x and y coordinates of the pen tip on the page, the pen pressure on the page, the angle and direction of writing, and so on provide useful information for this group of systems. In this case, important information related to writing the characters is available in chronological order, such as the order of writing strokes, the number of strokes, the speed and direction of the pen, and the location of complementary parts relative to the main parts of a character. Hence, online OCR systems are usually more accurate than offline systems (Baghshah, Shouraki & Kasaei, 2005, 2006; Faradji, Faez & Nosrati, 2007; Faradji, Faez & Mousavi, 2007; Halavati & Shouraki, 2007; Harouni, Mohamad & Rasouli, 2010; Samimi, Khademi, Nikookar & Farahani, 2010; Nourouzian, Mezghani, Mitichi & Jonston, 2006; Ghods & Kabir, 2010, 2013b, 2013c).
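To make the kind of information available to an online recognizer more concrete, the following sketch defines a minimal pen-trajectory representation in Python. The class names, fields, and derived quantities are illustrative assumptions for exposition, not the representation used in any of the cited systems.

```python
from dataclasses import dataclass
from typing import List
import math


@dataclass
class PenSample:
    """One sampled point of the pen tip during online writing."""
    x: float          # horizontal position reported by the digitizing tablet
    y: float          # vertical position reported by the digitizing tablet
    pressure: float   # pen pressure on the writing surface
    t: float          # timestamp of the sample, in seconds


@dataclass
class Stroke:
    """A pen-down to pen-up sequence of samples (one writing stroke)."""
    samples: List[PenSample]

    def length(self) -> float:
        """Total path length of the stroke."""
        return sum(
            math.hypot(b.x - a.x, b.y - a.y)
            for a, b in zip(self.samples, self.samples[1:])
        )

    def average_speed(self) -> float:
        """Average writing speed, one of the temporal cues an online system can exploit."""
        duration = self.samples[-1].t - self.samples[0].t
        return self.length() / duration if duration > 0 else 0.0
```

A character written online would then be an ordered list of such strokes, which preserves exactly the temporal information (stroke order, number of strokes, speed) that is lost once the text is only available as a scanned image.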

In offline systems, both printed and handwritten texts are converted to graphical files by special devices such as scanners, digital cameras, or even cell phones, and then imported into an OCR system. In this type of OCR system, recognition is performed after the writing process. Hence, no auxiliary information associated with the images is available to the system. Offline recognition of handwritten cursive text (such as Farsi text) is much more difficult than online recognition, because the former must deal with 2D images of the text after it has already been written (Lorigo & Govindaraju, 2006). Offline recognition of unconstrained handwritten cursive text must overcome many difficulties such as similarities between distinct letter shapes, unlimited variation in writing style, character overlapping, and interconnection of neighboring letters. However, by their nature, offline OCR systems are easier to apply than online systems. Hence, most of the research has been carried out on offline systems, and this is also true for Farsi OCR (FOCR) systems (Abedi, Faez & Mozaffari, 2009; Alaei, Nagabhushan & Pal, 2010a; Bahmani, Alamdar, Azmi & Haratizadeh, 2010; Enayatifar & Alirezanejad, 2011; Jenabzade, Azmi, Pishgoo & Shirazi, 2011; Pourasad, Hassibi & Banaeyan, 2011; Rajabi, Nematbakhsh & Monadjemi, 2012; Salehpor & Behrad, 2010; Ziaratban & Faez, 2012).
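Since the only input to an offline system is the scanned image itself, a typical first step is simply to load and binarize the page before any recognition takes place. The sketch below illustrates such a minimal preprocessing step using OpenCV; the function name and the choice of Otsu thresholding are assumptions made for illustration, not the method of any cited FOCR system.

```python
import cv2


def load_binary_page(path: str):
    """Load a scanned page and binarize it, a common first step of an offline OCR pipeline.

    Unlike the online case, the only input is the 2D image; no stroke order,
    speed, or pressure information is available to the recognizer.
    """
    image = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if image is None:
        raise FileNotFoundError(f"Could not read image: {path}")
    # Otsu's method selects a global threshold automatically; the text becomes
    # white foreground on a black background for subsequent processing.
    _, binary = cv2.threshold(image, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    return binary
```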

The useful information available in online recognition systems has led researchers to try to extract some of this information for offline systems as well. They attempt to develop approaches that recover the distribution of the image pixels (analogous to online methods) from the information available in offline handwritten texts. For example, Elbaati, Kherallah, Ennaji and Alimi (2009) tried to recover the temporal order of strokes from a scanned handwritten Arabic text for use in an offline Arabic OCR system. They extracted features such as end stroke points, branching points, and crossing points from the image skeleton. After that, they tried to find the order of the segments within each stroke, and they also used a genetic algorithm to find the best combination of stroke order.
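As a rough illustration of the skeleton-based feature points mentioned above, the following sketch classifies pixels of a binary skeleton image by the number of their 8-connected skeleton neighbours. It is a simplified heuristic given for exposition only and does not reproduce the actual feature extraction or the genetic-algorithm stroke ordering of Elbaati et al. (2009).

```python
import numpy as np


def classify_skeleton_points(skeleton: np.ndarray):
    """Classify skeleton pixels into end, branching, and crossing points.

    Heuristic: a skeleton pixel with 1 neighbour is an end point, with 3
    neighbours a branching point, and with 4 or more a crossing point.
    The input is a 2D binary array where True/1 marks skeleton pixels.
    """
    skel = skeleton.astype(bool)
    end_points, branch_points, crossing_points = [], [], []
    rows, cols = skel.shape
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            if not skel[r, c]:
                continue
            # Count neighbours in the 3x3 window, excluding the pixel itself.
            neighbours = skel[r - 1:r + 2, c - 1:c + 2].sum() - 1
            if neighbours == 1:
                end_points.append((r, c))
            elif neighbours == 3:
                branch_points.append((r, c))
            elif neighbours >= 4:
                crossing_points.append((r, c))
    return end_points, branch_points, crossing_points
```

Feature points of this kind give an offline system anchor locations from which candidate stroke segments can be enumerated, and the search over their temporal ordering is what Elbaati et al. (2009) addressed with a genetic algorithm.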