REAL-TIME MALAYSIAN SIGN LANGUAGE
RECOGNITION SYSTEM USING MICROSOFT KINECT 360 BASED ON
LOCALLY LINEAR EMBEDDING AND ARTIFICIAL NEURAL NETWORK MODEL
BY
MOSTAFA KARBASI
A thesis submitted in fulfillment of the requirement for the degree of Doctor of Philosophy
(Computer Science)
Kulliyyah of Information and Communication Technology International Islamic University
Malaysia
MAY 2017
ABSTRACT
Deaf people, or people with hearing loss, face a major problem in everyday communication. Sign Language (SL) is the common communication method for deaf people. Many attempts have been made to build SL translators that bridge the communication gap between hearing and deaf people and ease communication for the deaf. The system developed here matches and compares an input sign trajectory with each of the prototype sign trajectories contained in the database with a low error rate. This is achieved by extracting a number of static and dynamic features from the right and left hands. This contribution introduces an SL translator, especially for static and dynamic MSL, using Kinect 360 technology and native signers, together with an MSL database created in this research. An iterative method was used to denoise the depth information, reducing the data from 307,200 to 160,000 values. HOG and GA are used for feature extraction in static sign recognition, and an SVM classifier is used for training and testing the developed system on static signs. The accuracy for static signs is 99.37% using HOG, 62.92% using GA, and 93.14% using GA+HOG. LLE and PCA feature extraction were used for dynamic sign recognition, which improved the accuracy considerably (LLE features are used here for the first time for dynamic sign recognition). Three classifiers, MLP, CFNN and SVM, were used to implement and test dynamic sign recognition, with accuracies of 92.30%, 88.50% and 82.70% respectively. The developed MSL recognition system was tested using 10 dynamic words and 24 static alphabet signs, and it attained a significant performance in recognition accuracy and speed, allowing real-time translation of signs into text.
iii
خلاصة البحث
(Abstract in Arabic: an Arabic translation of the English abstract above.)
APPROVAL PAGE
The thesis of Mostafa Karbasi has been approved by the following:
_____________________________
Asadullah Shah Supervisor
_____________________________
Sara Bilal Co-Supervisor
_____________________________
Azzeddine Messikh Internal Examiner
_____________________________
Mustafa Mat Deris External Examiner
_____________________________
Saadeldin Mansour Gasmelsid Chairman
DECLARATION
I hereby declare that this thesis is the result of my own investigations, except where otherwise stated. I also declare that it has not been previously or concurrently submitted as a whole for any other degrees at IIUM or other institutions.
Mostafa Karbasi
Signature... Date...
INTERNATIONAL ISLAMIC UNIVERSITY MALAYSIA
DECLARATION OF COPYRIGHT AND AFFIRMATION OF FAIR USE OF UNPUBLISHED RESEARCH
REAL-TIME MALAYSIAN SIGN LANGUAGE RECOGNITION SYSTEM USING MICROSOFT KINECT 360 BASED ON LOCALLY LINEAR EMBEDDING AND ARTIFICIAL NEURAL
NETWORK MODEL
I declare that the copyright of this dissertation is jointly owned by the student and IIUM.
Copyright © 2017 (MOSTAFA KARBASI) and International Islamic University Malaysia. All rights reserved
No part of this unpublished research may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise without prior written permission of the copyright holder except as provided below
1. Any material contained in or derived from this unpublished research may be used by others in their writing with due acknowledgement.
2. IIUM or its library will have the right to make and transmit copies (print or electronic) for institutional and academic purposes.
3. The IIUM library will have the right to make, store in a retrieval system and supply copies of this unpublished research if requested by other universities and research libraries.
By signing this form, I acknowledge that I have read and understood the IIUM Intellectual Property Right and Commercialization policy.
By signing this form Affirmed by Mostafa Karbasi
………. ………..
Signature Date
TO MY BEST FRIENDS DR. ZAHRA ESLAMPANAH AND HESAM SYED ZADEH GHOMI FOR ALL THEIR SUPPORTS
ACKNOWLEDGEMENTS
First of all, words are inadequate to express my deepest gratitude to my supervisor, Prof. Dr. Asadullah Shah, and my co-supervisor, Dr. Sara Bilal, Department of Mechatronics Engineering, International Islamic University Malaysia. I thank them both for their invigorating suggestions and guidance. A special thanks to Dr. Zeshan Bhatti for their continuous support, encouragement and leadership, for which I will be forever grateful. I express my sincere thanks to Ms. Rose and the Center of Deaf School, who gave me opportunities and helped me throughout the recording of signs. I also give my heartfelt thanks to Dr. Ihsan Yassin and Dr. Azlee Zabidi, who contributed greatly to the successful completion of this research work. In addition, I thank all who contributed to my thesis, directly and indirectly, including all my friends. Particular reference must be made to my brother, Dr. Ahmad Waqas, for his technical assistance. I also thank the IIUM Research Management Center (RMC) for financial support. I benefited greatly from the IIUM library and the lectures in the Department of Computer Science, and I thank them for their support and help. I would like to express my sincere gratitude to Cindy Leong, administrator of the Society of Interpreters for the Deaf, for her continuous support in recording data from students, and for her patience, motivation, and immense knowledge.
I thank Shivani Sharma and Manoj, who taught me a true way to lead life and showed me that helping others needs no nationality. I am truly thankful to all the participants for their patience and contribution to the study, without whose support this study would not have been possible, and special thanks to Ali Shayesteh Nam, who devoted his time and effort to correcting and setting the contents.
Last but not least, I would like to express my sincerest appreciation to Mahnaz, Mr. Omid Jafarzadeh, Dr. Zahra and Hesam, who have directly and indirectly contributed to the successful completion of this thesis.
TABLE OF CONTENTS
Abstract...ii
Abstract in Arabic...iii
Approval...iv
Declaration...v
Copyright...vi
Dedication...vii
Acknowledgment...viii
List of Tables...xii
List of Figures...xiv
List of Abbreviations...xvi
CHAPTER ONE...1
1.1 Overview...1
1.2 Problem Statement...3
1.3 Research Objective...5
1.4 Research scope...5
1.5 Research Methodology...6
1.6 Thesis Organization...9
CHAPTER TWO...10
2.1 Introduction...10
2.2 Brief History of Malaysian Sign Language...11
2.3 Kinect Device...13
2.3.1 The Depth Sensor...14
2.3.2 The Kinect Microphones...14
2.3.3 Recognizing People with Kinect...15
2.4 Sign Language Database Collection...16
2.5 Denoising Depth Data...17
2.5.1 Noise Model for the Kinect...20
2.5.1.1 Geometric Model. ...21
2.5.1.2 Empirical Model...24
2.5.1.3 Statistical Model...24
2.6 Hand Detection...28
2.7 Feature Extraction...37
2.8 Human Posture and Human Gesture...43
2.8.1 Real Time Recognition System Using Kinect...47
2.8.2 Real Time Application using Kinect...47
2.9 Summary...49
CHAPTER THREE...50
3.1 Introduction ...50
3.2 Shadow Modelling...51
3.2.1 Causes of the Shadow...51
3.3 Static Sign...52
3.3.1 Threshold Method...53
3.3.2 HOG Feature...54
3.3.3 Geometric Features...56
3.3.4 SVM...58
3.4 Dynamic Sign Recognition...62
3.4.1 Feature Extraction...62
3.4.2 Principal Components Analysis (PCA) ...62
3.4.3 Locally Linear Embedding (LLE) ...65
3.4.4 Multilayer Perceptron...68
3.4.4.1 The TANSIG Activation Function...71
3.4.4.2 The NW Algorithm...72
3.4.4.3 Scale Conjugate Gradient Algorithm...74
3.4.4.4 The Early Stopping Algorithm...77
3.5 Cascade Forward Networks...77
3.6 Testing Methods...78
CHAPTER FOUR...80
4.1 Introduction...80
4.2 MSL Database Development Using Kinect...82
4.2.1 Data Collection Setup...83
4.2.2 Structure of Database...84
4.2.3 Organization of Database...85
4.2.4 GUI for Data Collection...86
4.2.5 Data Processing...87
4.3 Static Sign Detection...87
4.3.1 Static Hand Segmentation...88
4.4 Static Feature Extraction and Classification...89
4.5 Dynamic Feature Representation and Selection...90
4.6 Dynamic Sign Classification...91
4.7 Real-Time Implementation...93
4.8 Grammar for Static and Dynamic Signs...93
4.9 Summary...94
CHAPTER FIVE...95
5.1 Introduction...95
5.2 Malaysian Sign Language Database Interface Result...96
5.3 Denoising...104
5.3.1 Background Elimination...104
5.3.2 Shadow Removal...106
5.4 Comparison with Other Denoising Approaches...108
5.5 Blob Detection...110
5.6 Feature Extraction...112
5.6.1 Static Sign Features...112
5.6.2 HOG Feature...112
5.6.3 Geometric Feature...116
5.7 Recognition Accuracy of Static Signs...119
5.8 Comparison with Other Static Signs Approaches...126
5.9 Real-time Application of Static Sign...128
5.10 Summary...130
CHAPTER SIX...131
6.1 Introduction...131
6.2 Dynamic Signs Features...132
6.2.1 PCA Features Selection for Dynamic Signs...132
6.2.2 LLE Feature for Dynamic Signs...134
6.3 Dynamic Sign Recognition Using MLP...136
6.3.1 Experimental Result Using ANN Classifier...137
6.3.2 Experimental Result Using PCA Feature with MLP...141
6.3.3 Experimental Result Using LLE with MLP...144
6.3.4 The Comparison of Different Features with MLP...147
6.4 Dynamic Sign Recognition Using CFNN...148
6.4.1 Experimental Result for CFNN...149
6.4.2 Experimental Result Using PCA with CFNN...153
6.4.3 Experimental Result Using LLE with CFNN...156
6.4.4 The Comparison of Different Feature Selection Method with CFNN...157
6.5 Dynamic Sign Recognition Using SVM...157
6.5.1 Experimental Result for SVM...158
6.5.2 The Comparison of Different Features with SVM...159
6.6 Accuracy Comparison of Different Classifier with Different Features...160
6.6.1 Recognition Accuracy for Dynamic signs Using Different Classifier....160
6.6.2 The Effect of PCA Features on Different Classifiers...160
6.6.3 The Effect of LLE Features on Different Classifier...161
6.7 Recognition Accuracy of Dynamic Sign...162
6.7.1 The Effect of Features on the Accuracy of Two Dynamic Signs....162
6.7.2 The Effect of Features on the Accuracy of Four Dynamic Signs...163
6.7.3 The Effect of Features on the Accuracy of Two Hands Signs...164
6.8 Comparison with Other Dynamic Signs Approaches...167
6.9 Real Time System Implementation...169
6.10 Dynamic Sign Real-time System Implementation...169
6.11 Summary...173
CHAPTER SEVEN...174
7.1 Conclusion...174
7.2 Contribution to Knowledge...175
7.3 Recommendation for Future Studies...176
REFERENCES...177
Appendix A...189
Appendix B...190
Appendix C...192
Appendix D...196
LIST OF TABLES
Table No.
Table 2.1 Brief History of Malaysian Sign Language 11
Table 2.2 Summary of Different Methods Used for Denoising 26
Table 2.3 Summary of All the Relevant Work Done on Hand Detection 36
Table 2.4 Real-Time SL Recognition System Using DTW 46
Table 2.5 Some Commercial HCI Application Using Kinect 48
Table 4.1 Isolated Word Structure 85
Table 4.2 Sentence Structure 85
Table 4.3 MLP and CFNN Specification for Dynamic Sign Recognition 92
Table 5.1 Comparison of Denoising Method with Other Existing System 109
Table 5.2 Extracted Geometric Features From Letters ‘A’, ‘B’, ‘C’, ‘F’, ‘G’, ‘U’, ‘V’, ‘M’ 118
Table 5.3 Overall Result for Training of 24 Static Signs Using HOG 120
Table 5.4 Overall Result for Testing of 24 Static Signs Using HOG 121
Table 5.5 Overall Result for Training of 24 Static Signs Using GA 121
Table 5.6 Overall Result for Testing of 24 Static Signs Using GA 122
Table 5.7 Overall Result for Training of 24 Static Signs Using HOG+GA 122
Table 5.8 Overall Result for Testing of 24 Static Signs Using HOG+GA 123
Table 5.9 The Recognition Accuracy of 8 Characters Using HOG Feature 123
Table 5.10 The Recognition Accuracy of 8 Characters Using GA 124
Table 5.11 The Recognition Accuracy of 8 Characters Using HOG+GA 125
Table 5.12 Overall System Accuracy for All Experiment 125
Table 5.13 Recognition Accuracy Comparison of the Developed System and Existing System 127
Table 6.1 PCA Result on Data Kinect 134
Table 6.2 Training Result-MLP 137
Table 6.3 Validation Result-MLP 138
Table 6.4 Testing Result-MLP 138
Table 6.5 Maximum System Accuracy Using Different PCA with MLP 143
Table 6.6 System Accuracy Using LLE Feature Using MLP 144
Table 6.7 The Comparison of Different Feature with MLP 148
Table 6.8 Training Result-CFNN 149
Table 6.9 Validation Result-CFNN 150
Table 6.10 Testing Result-CFNN 150
Table 6.11 Maximum System Accuracy Using Different PCA with CFNN Classifier with Hidden Layers 12 and 15 153
Table 6.12 Maximum System Accuracy Using Different PCA with CFNN with Hidden Layers 17 and 18 155
Table 6.13 System Accuracy Using LLE Feature Using CFNN 156
Table 6.14 The Comparison of Different Feature with CFNN Classifier 157
Table 6.15 Training Result-SVM 158
Table 6.16 Testing Result-SVM 159
Table 6.17 The Comparison of Different Feature with SVM 159
Table 6.18 Comparison Between Best Different Classifiers 160
Table 6.19 Comparison Between Different Classifiers Using PCA 161
Table 6.20 Comparison Between Different Classifiers Using LLE 161
Table 6.21 Overall Result for ‘I’ and ‘Father’ Using Only PCA Feature Using MLP 162
Table 6.22 Overall Result for ‘I’ and ‘Father’ Using Only LLE Feature Using MLP 162
Table 6.23 Overall Result for ‘I’, ‘God’, ‘You’ and ‘Sister’ Using PCA Feature with MLP 163
Table 6.24 Overall Result for ‘I’, ‘God’, ‘You’ and ‘Sister’ Using LLE with MLP 164
Table 6.25 Five Signs Using Two Hands (Sara Bilal-2012) 165
Table 6.26 Overall Only PCA Feature with MLP 166
Table 6.27 Overall Result LLE Feature Using MLP 166
Table 6.28 Recognition Accuracy Comparison of the Developed System with Existing System 168
LIST OF FIGURES
Figure No.
Figure 1.1 Prototype of string glove 2
Figure 1.2 Wired glove Being Used as a Mouse 3
Figure 1.3 Overall Stage of MSL Translator 6
Figure 2.1 A Kinect sensor 13
Figure 2.2 The Dot Pattern On the Sofa Arm 14
Figure 2.3 Skeleton Information Retrieved Using the Kinect Software 15
Figure 2.4 Different Types of Model Noise 20
Figure 2.5 Disparity-Depth Model 21
Figure 2.6 Hand Detection by Filtering and Cluster Merging 29
Figure 2.7 Display Adaptive Hand Detection 30
Figure 2.8 Hand Blob Detected Using Division by Shape 40
Figure 2.9 Template Matching Based Tracking Logic 41
Figure 2.10 Anthropometric Ratios of Typical Human Body 42
Figure 2.11 The Stick Model Used for Human Upper Body Skeleton Fitting 43
Figure 3.1 Cause for the Shadow 52
Figure 3.2 Optimal Separating Hyperplane 59
Figure 3.3 Soft Margin Classification 60
Figure 3.4 Graph of Linear and Nonlinear Mapping 61
Figure 3.5 Scatter Plot in the Original Axes 63
Figure 3.6 Scatterplot in the New Axes 63
Figure 3.7 Mapping High Dimensional Input to Low Dimensional Via LLE 65
Figure 3.8 Locally Linear Reconstruction 67
Figure 3.9 A Three-Layer MLP Architecture 69
Figure 3.10 An Artificial Unit with Additional Bias Term 69
Figure 3.11 The TANSIG Activation Function 71
Figure 3.12 The Cascade Learning Architecture 78
Figure 3.13 Testing Mode 78
Figure 4.1 Methodology Used for Static and Dynamic Signs 82
Figure 4.2 Process of Developing MSL Database 83
Figure 4.3 Position of Signer in Front of Camera 84
Figure 4.4 Camera Adjustment for Acquisition 86
Figure 4.5 Flow Chart of Static Sign Detection 88
Figure 4.6 Overview of Implementation PCA Feature 91
Figure 4.7 Overview of Implementation LLE Feature 91
Figure 5.1 Camera Adjustment 97
Figure 5.2 Skeleton Recording for Dynamic Signs 98
Figure 5.3 Samples of Static signs Collected 100
Figure 5.4 Sample Frames for Dynamic Sign (Divorce) 101
Figure 5.5 Sample Frames for Dynamic Sign (Father) 101
Figure 5.6 Sample Frames for Dynamic Sign (Sister) 102
Figure 5.7 Sample Frames for Dynamic Sign (Triangle) 102
Figure 5.8 Source (Internet) 103
Figure 5.9 RGB Image and Histogram of Image 105
Figure 5.10 Depth Map Image and Histogram of Image 105
Figure 5.11 Background Subtraction and Histogram of Image 105
Figure 5.12 Steps of Background Subtraction 106
Figure 5.13 Shadow Removal 106
Figure 5.14 Comparison of Denoising and Noising of Depth Data 107
Figure 5.15 Different Hand Posture Using Threshold Method 111
Figure 5.16 Feature Extraction Using HOG for Eight Characters 114
Figure 5.17 Feature Extraction Using HOG for SVM Classification 115
Figure 5.18 Feature Extraction Using Geometric Feature 117
Figure 5.19 Real-Time Presentation of Five Characters 129
Figure 6.1 Skeleton Data Recorded by Kinect (No of Hidden Units=15) 132
Figure 6.2 PCA=96, No of Hidden Units=15, Features=12 133
Figure 6.3 Actual Data Recorded by Kinect (Neighbour=10, No of Hidden=20) 135
Figure 6.4 Dimension=80 , Neighbour=50 136
Figure 6.5 Mean Square Error 139
Figure 6.6 Error Histogram 140
Figure 6.7 Gradient Plot 140
Figure 6.8 Validation Plot 141
Figure 6.9 Average System Accuracy with Respect to Neighbours 146
Figure 6.10 Number of Features Versus Dimension 147
Figure 6.11 Mean Square Error 151
Figure 6.12 Gradient, Validation and MSE 152
Figure 6.13 System Implementation 169
Figure 6.14 Real-Time Dynamic Sign Recognition for ‘Divorce’ Sign 170
Figure 6.15 Real-Time Dynamic Sign Recognition for ‘House’ Sign 171
Figure 6.16 Real-Time Dynamic Sign Recognition for ‘Sister’ Sign 172
LIST OF ABBREVIATIONS
AI Artificial Intelligence
ANFIS Adaptive Neuro-Fuzzy Inference System
ANN Artificial Neural Network
ASL Arabic Sign Language
CCA Connected Component Analysis
CD Compact Disk
CM Committee Machines
DOF Degree of Freedom
EBM Elliptical Boundary Model
EFT Elliptic Fourier Descriptor
FMM Fast Marching Method
GEM Global Expert Network
GMM Gaussian Mixture Model
HCI Human Computer Interaction
HGR Human Gesture Recognition
HMM Hidden Markov Model
HOG Histogram of Oriented Gradients
HP Human Posture
ICA Independent Component Analysis
IR Infrared
IT Information Technology
LEN Local Network Export
LLE Locally Linear Embedding
LM Levenberg-Marquardt
MEE Minimum Enclosing Ellipsoid
MLP Multilayer Perceptron
MSL Malaysian Sign Language
MSLT Malaysian Sign Language Translator
NMD Non-Measured Depth
PCF Part Classification Forest
RBFANN Radial Basis Function Neural Network
RDF Random Decision Forest
RFD Randomized Decision Forest
SCF Shape Classification Forest
SL Sign Language
SLI Sign Language Interpreter
SLR Sign Language Recognition
SOM Self-Organization Map
SV Support Vector
SVM Support Vector Machine
TOF Time of Flight
UN United Nations
UNCRPD UN Convention on the Rights of Persons with Disabilities
USB Universal Serial Bus
VCD Video Compact Disk
VLSI Very Large Scale Integration
CHAPTER ONE INTRODUCTION
1.1 OVERVIEW
More often than not, deafness refers to the inability to understand speech through hearing even when sound is amplified. Once recognized, it usually takes a parent a long time to meet the needs of the deaf child. Communication becomes the most difficult of all. The next step inevitably taken is learning how to sign. How do you help a child understand that he or she is deaf and that the best way to communicate is through signing?
Normal-hearing people can barely communicate well with deaf people, who express their feelings and speak in their own way. Sign Language (SL), also called gesture language, is the language deaf people use to talk to one another. Nearly 40,000 deaf people were registered in Malaysia by December 2011. In 2008 the country ratified the United Nations (UN) Convention on the Rights of Persons with Disabilities and committed to giving these people a normal life like any other members of society (Act 685). Under Act 685, the government must provide proper and easy access for them; they also need more support from Sign Language Interpreters (SLI) in the country.
To better enhance SLI support, there have been some studies in local universities (Bilal, Akmeliawati, El Salami, & Shafie, 2011; Maarif, Akmeliawati, & Bilal, 2012), although only a minority of people have a good understanding of SL. Media and other communication tools can help deaf people gain a better life, or even translate SL for them (Hilzensauer, 2006; S. C. Ong & Ranganath, 2005). Studies on SL have addressed many different cases; for instance, MSL recognizer tools have been built (Akmeliawati, Ooi, & Kuang, 2007; Wang, Chen, & Gao, 2006; Werapan & Chotikakamthorn, 2004). In another study, the sign database is highlighted and prioritized significantly (Al Qodri Maarif, Akmeliawati, & Bilal, 2012).
Several techniques have been used by researchers for recognition of Malaysian sign language in the last few years. Existing Malaysian Sign Language Translator (MSLT) systems generally use the following:
1. Data Gloves
(Kadous, 1995), (J.-S. Kim, Jang, & Bien, 1996), (R.-H. Liang & Ouhyoung, 1998), and (Kuroda, Tabata, Goto, Ikuta, & Murakami, 2004) use data gloves/wired gloves for different sign language recognition tasks. These gloves are fitted with wires that send signals to the computer, and different types of sensors capture finger movement, global position and angle data of the gloves.
Figure 1.1 shows the model for string gloves. In this instrument, all movements are translated into numbers on the machine; in this way body movement can be converted into information from which the SL is recognized.
Figure 1.1 Prototype of String Gloves (Kuroda et al., 2004)
Figure 1.2 Wired Glove Being Used as a Mouse (Wikipedia)
2. Visual Based Approaches
Different types of cameras, such as Kinect and red, green, blue (RGB) cameras, are used to capture video of a signer standing in front of a camera (Lang, Block, & Rojas, 2012; Starner, Weaver, & Pentland, 1998; Bauer & Hienz, 2000; Zafrulla, Brashear, Starner, Hamilton, & Presti, 2011). This type of camera has some advantages over data gloves, as users can move their hands freely and sign with bare hands. The Kinect camera is more suitable for sign recognition because of its robustness to illumination changes and its high recognition accuracy. With this approach, lip recognition and face recognition can also be implemented. In addition, it provides additional features such as the position of the signer's hand relative to other parts of the body, as well as upper-body detection.
1.2 PROBLEM STATEMENT
Sign language is an important language used daily by hard-of-hearing people as a means of communication. They use signs to communicate with their family members, friends and the general public. Unfortunately, there is a lack of applications, especially real-time tools, to help the hard-of-hearing communities and others who are interested in learning sign language in Malaysia (Holmes, 2007). Without such applications, learners may face difficulty when learning sign language.
Books are not able to illustrate the signing of words clearly and accurately, because the sequences of signing are illustrated using drawings and arrows. Hence, sign language learners may not be able to understand these drawings and follow the arrows to sign the words correctly. Also, each individual may perceive and sign a word in different ways (Jaklic et al., 1995). On the other hand, videos stored on VCD have a high compression rate, which causes the video quality to be poor, while the compact disc (CD) is vulnerable to degradation from heat, humidity, dust, and human mishandling such as scratching, cracking, and bending (Shelly et al., 2007).
Various researchers have tried to implement an automatic sign language translator (ASLT) with high accuracy using multiple artificial intelligence methods and a variety of devices. Unfortunately, these methods could not reach reasonable results for many reasons, such as limitations of the devices, weaknesses in stages such as hand detection, feature extraction and gesture recognition, and the lack of a standard database. Computer vision is an active field of study and has generated many exciting results that have increased our understanding of the complex and remarkable task of interpreting images. This research attempts to apply computer vision methods to develop an ASLT system that hard-of-hearing people can use to communicate with normal people.
However, advancements in Information Technology (IT) and the production of graphical design tools allow us to develop an attractive and useful real-time textual representation of Malaysian sign language for communication between the hard-of-hearing and the general public.
The lack of ASLT systems using Kinect motivates this work: such a system can serve as an alternative communication method and can help establish communication between hearing/speech-impaired people and normal people. It could assist both communities to interact quickly during emergency situations and avoid misunderstanding.
1.3 RESEARCH OBJECTIVES
The aim of this research is to implement a system that helps deaf people communicate with normal people. This aim can be subdivided into the following objectives:
1. To develop a standard database for MSL using Kinect 360.
2. To develop an iterative method for shadow removal.
3. To develop a hybrid feature extraction method for static signs.
4. To implement a new feature selection method for dynamic signs.
5. To develop an algorithm for static and dynamic MSL recognition.
6. To evaluate the overall performance of the SL recognition system.
1.4 RESEARCH SCOPE
Real-time recognition and textual representation of Malaysian Sign Language is a stand-alone application system that runs on Windows 7 and Windows 10 platforms. The project focuses on the recognition of static and dynamic signs and the textual presentation of each sign on the screen. Signing words are taken from the book entitled “Bahasa Isyarat Malaysia,” published by the Malaysian Federation of the Deaf (2000). This project consists of different stages to recognize MSL. An iterative method has been developed to remove noise from the depth frames, and a threshold algorithm has been defined for hand segmentation.
Different features for dynamic signs, such as locally linear embedding (LLE) and principal component analysis (PCA), are used for a variety of gestures to improve recognition. For static signs, the histogram of oriented gradients (HOG) has been implemented for feature extraction, along with a hybrid of geometric features (GA) and HOG. Finally, the system is designed to recognize static signs and dynamic signs using SVM and MLP respectively, to enable communication between impaired people and normal people.
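The HOG feature idea used here for static signs can be illustrated with a minimal sketch: compute per-pixel gradient magnitude and orientation, then accumulate a magnitude-weighted orientation histogram for each cell of the image. This is a simplified illustration only (no block normalization), and the cell size and bin count are illustrative assumptions, not the settings used in this thesis.

```python
import numpy as np

def hog_features(image, cell=8, bins=9):
    """Minimal HOG-style descriptor: one orientation histogram per cell.

    Simplified sketch of the idea (no block normalization); the
    parameters are illustrative, not those of the thesis implementation.
    """
    gy, gx = np.gradient(image.astype(float))
    mag = np.hypot(gx, gy)
    # Unsigned gradient orientation in [0, 180) degrees
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    h, w = image.shape
    features = []
    for i in range(0, h - cell + 1, cell):
        for j in range(0, w - cell + 1, cell):
            a = ang[i:i + cell, j:j + cell].ravel()
            m = mag[i:i + cell, j:j + cell].ravel()
            # Magnitude-weighted histogram of orientations in this cell
            hist, _ = np.histogram(a, bins=bins, range=(0, 180), weights=m)
            features.append(hist)
    return np.concatenate(features)

# Example: a 32x32 synthetic hand mask gives a 4*4*9 = 144-dim vector
img = np.zeros((32, 32))
img[8:24, 8:24] = 1.0
feat = hog_features(img)
print(feat.shape)  # (144,)
```

In the full pipeline, such per-sign feature vectors would be fed to a classifier such as SVM; library implementations additionally normalize histograms over overlapping blocks for illumination invariance.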
1.5 RESEARCH METHODOLOGY
Figure 1.3 explains the MSL system, which includes seven stages: literature review, SL database collection, denoising depth data, hand detection, feature extraction, training and recognition of SL, and testing and evaluation of the system.
Figure 1.3 Overall Stage of MSL Translator
Stage 1: Literature Review
In recent years, many methods have been developed to estimate the mapping between hand shape and the configuration of joints and palm orientation for sign language recognition. All sign language recognition approaches have advantages and disadvantages related to the database, denoising, hand detection, feature extraction and gesture recognition. The different methods used for sign language recognition are also explained.
Stage 2: SL Database Collection
The first step in sign language recognition is having a standard data set, but no standard database recorded with the Kinect device was available. Thus, we created a data set of 24 static signs and 10 dynamic signs.
Each sign was repeated 5 times by different students. During the recording process, the environment was not changed, so it closely matched a real environment. The signers were hearing/speech-impaired persons from the deaf school community.
Stage 3: Denoising Depth Data
Errors in depth data often cause problems in video and 3D images, reducing image quality. Such images contain broken objects, incomplete edges and holes, and therefore cannot provide good features for computer vision processing. In this research, an iterative method for removing shadow was introduced, and, based on distance, we removed the background to delete unwanted areas.
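The distance-based background removal described in this stage can be sketched as a simple depth threshold on the Kinect depth frame; the range bounds below are hypothetical values for a signer near the sensor, not the thresholds used in this research.

```python
import numpy as np

def remove_background(depth_mm, near=500, far=1500):
    """Keep only pixels whose depth falls inside the signer's range.

    depth_mm: 2-D array of Kinect depth values in millimetres, where 0
    marks a non-measured pixel. near/far are illustrative bounds
    (hypothetical, not the thesis settings) for a signer roughly
    0.5-1.5 m from the sensor.
    """
    mask = (depth_mm > near) & (depth_mm < far)
    cleaned = np.where(mask, depth_mm, 0)  # zero out background pixels
    return cleaned, mask

# Example on a toy 2x3 depth frame (mm): only in-range pixels survive
frame = np.array([[0, 800, 2500],
                  [700, 1200, 3000]])
cleaned, mask = remove_background(frame)
print(cleaned)  # [[0 800 0] [700 1200 0]]
```

On a real 640x480 Kinect frame this removes far-away background and non-measured pixels in one pass; the iterative shadow-removal step would then refine the remaining region.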
Stage 4: Hand Detection
Hand detection is a very crucial stage in sign language recognition because the hands move freely when impaired people sign. We tried to use depth data with