CHAPTER 6: SYSTEM TESTING 6.1 Verification Plan
6.3 Live Based Testing
In this section, classification is performed live. The experiment ran for 4 minutes and 33 seconds, during which the driver performed all of the driver actions inside the vehicle while the system classified them. Unlike Section 6.2, which classified every frame extracted from short video clips, this test classifies a frame only after processing of the previous frame has finished. In other words, no new frame is taken while a frame is being processed; once processing completes, the camera grabs a new frame showing the driver's current action. Figure 6.3a shows frames that were captured by the Raspberry Pi during live classification.
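The capture-then-process behaviour described above can be sketched as a simple blocking loop. This is a minimal illustration, not the project's actual code: `read_frame` and `classify` are hypothetical callables standing in for the camera grab and for the pose estimation plus classification step, and `clock` is injectable only so the loop can be exercised without real hardware.

```python
import time
from typing import Any, Callable, List, Tuple


def run_live(
    read_frame: Callable[[], Any],
    classify: Callable[[Any], str],
    duration_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> List[Tuple[float, str]]:
    """Classify one frame at a time until duration_s has elapsed.

    A new frame is requested only after the previous frame has been
    fully processed, so nothing is captured while classify() is busy.
    """
    results: List[Tuple[float, str]] = []
    start = clock()
    while clock() - start < duration_s:
        frame = read_frame()      # hypothetical: grab the driver's current action
        label = classify(frame)   # hypothetical: blocks until this frame is done
        results.append((clock() - start, label))
    return results
```

With this structure the number of classified frames depends directly on how long each call to `classify` takes, which is why a slow per-frame step yields only a few results over a run of several minutes.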
Figure 6.3a: Frames Obtained from Raspberry Pi
On the Raspberry Pi, a total of 35 frames were processed within the 4 minutes and 33 seconds, and the average time required to compute one frame was 2.25 seconds. The long processing time per frame is caused by the human pose estimation framework. A framework with fewer convolution layers might run faster, but its detection of human body parts would be much worse.
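As a quick arithmetic check on the figures above, the run length and frame count give the average end-to-end interval between classified frames. The sketch below uses only the reported numbers; the variable names are illustrative.

```python
# Figures reported for the live test.
run_seconds = 4 * 60 + 33        # 4 minutes 33 seconds = 273 s
frames_classified = 35

# Average wall-clock interval per classified frame, and the equivalent rate.
seconds_per_frame = run_seconds / frames_classified
frames_per_second = frames_classified / run_seconds

print(f"{seconds_per_frame:.2f} s per frame, {frames_per_second:.3f} frames per second")
```

The interval works out to about 7.8 seconds per classified frame, which is longer than the 2.25 seconds of average compute time quoted above; the text does not itemize what accounts for the remainder of each cycle.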
Table 6.3: Confusion Matrix on Classification Results
Classes (True / Actual): Safe Driving, Texting - Left, Phone - Left, Texting - Right, Phone - Right, Operating Radio, Drinking / Eating, Reaching Behind, Hair and Makeup, Talking to Passenger
Table 6.3 shows the confusion matrix produced after classifying all 35 frames. It shows that the safe driving and talking to passenger classes are very unstable: the pose frames generated from these two actions are very similar, so the two actions are repeatedly misclassified as each other. However, because talking to a passenger is not considered a dangerous task while driving, the buzzer does not sound for either class, so this confusion makes no difference to the driver while the system is operating. Texting with the right hand and hair and makeup are also repeatedly misclassified as each other because their pose frames are similar. Likewise, texting with the left hand and eating are confused because the position of the left hand is almost the same in both actions.
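For readers who want to reproduce this kind of analysis, the sketch below shows one way to tally a confusion matrix over the ten classes and list the class pairs that are confused with each other. It assumes rows are the actual class and columns the predicted class, which is an assumption about orientation, and the function names are illustrative. It contains no results from Table 6.3.

```python
from typing import List, Sequence, Tuple

CLASSES = [
    "Safe Driving", "Texting - Left", "Phone - Left", "Texting - Right",
    "Phone - Right", "Operating Radio", "Drinking / Eating",
    "Reaching Behind", "Hair and Makeup", "Talking to Passenger",
]


def confusion_matrix(actual: Sequence[str], predicted: Sequence[str],
                     classes: Sequence[str] = CLASSES) -> List[List[int]]:
    """Count (actual, predicted) pairs; rows = actual, columns = predicted."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def confused_pairs(matrix: List[List[int]],
                   classes: Sequence[str] = CLASSES) -> List[Tuple[int, str, str]]:
    """Return (count, actual, predicted) for every non-zero off-diagonal cell,
    largest count first."""
    pairs = [
        (matrix[i][j], classes[i], classes[j])
        for i in range(len(classes))
        for j in range(len(classes))
        if i != j and matrix[i][j] > 0
    ]
    return sorted(pairs, reverse=True)
```

Feeding the per-frame actual and predicted labels from a live run into `confusion_matrix` and then `confused_pairs` would surface pairs such as safe driving and talking to passenger at the top of the list whenever they are frequently mistaken for each other.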
CHAPTER 7: CONCLUSION
Malaysia has become one of the countries with the highest death rates on the road. According to LUM (2019), Malaysia had the third highest death rate from road accidents among Asian countries, behind Thailand and Vietnam. The death rate due to traffic accidents has been increasing every year. Hence, Malaysia's traffic-related death rate must be reduced for Malaysia to become a better and more successful country.
One cause of traffic accidents is secondary tasks performed by the driver. With the spread of modern technology, drivers are often absorbed in their smartphones, which draws their attention away from the primary driving task. Besides smartphone use, talking to passengers, eating, reaching behind, and applying makeup are also factors that lead to the driver not paying attention. According to a study by Choudhary and Velaga (2019), drivers face more traffic problems especially while texting. Hence, a driver action detection system, as an advanced driver assistance system, can help by alerting the driver when secondary tasks are performed during driving. The purpose of this system is to decrease the traffic accident rate on the road. According to Hafetz et al. (2010), drivers are less likely to be involved in a traffic accident if they do not use a smartphone while driving.
The objective of this project is to develop a driver monitoring system that monitors driver actions in real time using computer vision techniques. The developed system captures a frame of the driver's activity and then generates a human pose frame, which links human key points together to form a more meaningful input image for the classification model than the original image. The system then classifies the driver's current action and determines whether the driver is performing a secondary task, and hence whether to alert the driver. The system is implemented on a Raspberry Pi running the Raspbian operating system, and the program is written in Python (.py file format).
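The idea of linking key points into a pose frame can be illustrated with a short sketch. The key point names and the limb list below are hypothetical and chosen only for illustration; they are not the project's actual key point layout. The function returns the line segments that would be drawn, skipping any limb whose end point was not detected.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Point = Tuple[int, int]

# Hypothetical upper-body limb list; each pair names two key points to connect.
LIMBS: List[Tuple[str, str]] = [
    ("nose", "neck"),
    ("neck", "right_shoulder"),
    ("right_shoulder", "right_elbow"),
    ("right_elbow", "right_wrist"),
    ("neck", "left_shoulder"),
    ("left_shoulder", "left_elbow"),
    ("left_elbow", "left_wrist"),
]


def pose_segments(
    keypoints: Dict[str, Optional[Point]],
    limbs: Sequence[Tuple[str, str]] = LIMBS,
) -> List[Tuple[Point, Point]]:
    """Return one (start, end) segment per limb whose two key points exist."""
    segments: List[Tuple[Point, Point]] = []
    for a, b in limbs:
        pa, pb = keypoints.get(a), keypoints.get(b)
        if pa is not None and pb is not None:   # skip limbs with a missing joint
            segments.append((pa, pb))
    return segments
```

Each returned segment could then be drawn onto a blank image, for example with OpenCV's `cv2.line`, to produce the skeleton image that is passed to the classification model.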
Apart from the first and last classes, which are safe driving and talking to passengers, the other classes are considered secondary tasks that might increase the traffic accident rate. Hence, the buzzer connected to the Raspberry Pi buzzes when a dangerous act is performed during driving.
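The alert rule stated here reduces to a membership test on the predicted class. The sketch below uses the class names from Table 6.3; `set_buzzer` is a hypothetical callable standing in for whatever code drives the buzzer pin, so the logic can be shown without hardware.

```python
from typing import Callable

# The two classes that do not trigger an alert (first and last in Table 6.3).
SAFE_CLASSES = frozenset({"Safe Driving", "Talking to Passenger"})


def should_alert(label: str) -> bool:
    """True when the predicted class is treated as a secondary task."""
    return label not in SAFE_CLASSES


def handle_prediction(label: str, set_buzzer: Callable[[bool], None]) -> bool:
    """Switch the buzzer on or off for this prediction and report the decision."""
    alert = should_alert(label)
    set_buzzer(alert)   # hypothetical hook: True = buzz, False = silent
    return alert
```

Because both safe classes map to the same decision, a frame of safe driving that is misclassified as talking to a passenger (or the reverse) leaves the buzzer state unchanged.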
For future enhancement and development, the Raspberry Pi could be replaced by another small single-board computer, for example the Jetson Nano. The Jetson Nano contains an Nvidia GPU with CUDA support, so trained models can run on the GPU and achieve faster speed during live processing. In that case, instead of using only one frame for classification, several frames could be used to track driver behavior and better understand the driver's action. Because a GPU provides substantial processing power for deep learning models, the time taken to process one frame should be short enough to allow tracking. The Raspberry Pi, by contrast, computes on the CPU, which is slower and might take around 7 seconds or more for tracking, so it would fail to be a live processing framework. Besides that, a night vision camera could also be added to the system, so that the camera captures the driver's movements properly at night for better classification.
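One simple way to use several frames rather than one is a majority vote over the most recent predictions. The text does not specify a tracking method, so the sketch below is only an assumed example of the idea; the class name `MajorityVote` and the default window size are illustrative.

```python
from collections import Counter, deque
from typing import Deque


class MajorityVote:
    """Smooth per-frame labels by voting over the last `window` predictions."""

    def __init__(self, window: int = 5) -> None:
        self.recent: Deque[str] = deque(maxlen=window)

    def update(self, label: str) -> str:
        """Add the newest per-frame label and return the most frequent one."""
        self.recent.append(label)   # the oldest label drops out automatically
        return Counter(self.recent).most_common(1)[0][0]
```

A vote like this only reacts quickly if frames arrive quickly: at several seconds per frame a window of five frames would span a long stretch of driving, which is the practical reason the faster hardware is suggested first.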
The currently designed system uses only a vision-based technique to perform driver action detection. In future, the system could also implement vehicle-based measures, for example steering wheel detection to determine whether the driver's hands are on the steering wheel.
Moreover, by integrating with other frameworks such as driver drowsiness detection, the system could be improved to further reduce the rate of car accidents around the world.
In addition, it is suggested to add extra data so that the system can recognize when the driver is reverse parking. With the current framework, when the driver is reverse parking and turns their head to the back, the model might classify the action as reaching behind and give an alert. Future enhancements could include integrating the system with the vehicle so that it pauses while the driver is reverse parking, or including a dataset in which the driver is reverse parking.
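The first of these options, pausing alerts during reverse parking, could look like the sketch below. It assumes a boolean `reverse_gear_engaged` signal is available from the vehicle, which is a hypothetical input that the current system does not have; the class names are those of Table 6.3.

```python
from typing import FrozenSet

SAFE_CLASSES: FrozenSet[str] = frozenset({"Safe Driving", "Talking to Passenger"})


def alert_allowed(label: str, reverse_gear_engaged: bool,
                  safe_classes: FrozenSet[str] = SAFE_CLASSES) -> bool:
    """Decide whether to alert, suppressing all alerts while reversing.

    reverse_gear_engaged is a hypothetical vehicle signal; while it is True
    the system is treated as paused, so a head turn classified as
    "Reaching Behind" does not trigger the buzzer.
    """
    if reverse_gear_engaged:
        return False
    return label not in safe_classes
```

The second option, training on reverse parking data, would instead add a new class to the classifier and would not need any signal from the vehicle.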
B, J. and M Patil, C. (2018). Video Based Human Activity Detection, Recognition and Classification of actions using SVM. Transactions on Machine Learning and Artificial Intelligence, 6(6).
Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S. and Sheikh, Y. (2019). OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields. IEEE Transactions on Pattern Analysis and Machine Intelligence, pp.1-1.
Cdc.gov. (2018). Distracted Driving | Motor Vehicle Safety | CDC Injury Center. [online] Available at: <https://www.cdc.gov/motorvehiclesafety/distracted_driving/>.
Chollet, F. (2017). Xception: Deep Learning with Depthwise Separable Convolutions. In: Computer Vision and Pattern Recognition (CVPR). Honolulu: IEEE.
Cho Nilar, P., Thi Thi, Z. and Pyke, T. (2020). Skeleton Motion History based Human Action Recognition Using Deep Learning. In: 6th Global Conference on Consumer Electronics (GCCE 2017). Las Vegas: IEEE.
Choudhary, P. and Velaga, N. (2019). A comparative analysis of risk associated with eating, drinking and texting during driving at unsignalised intersections. Transportation Research Part F: Traffic Psychology and Behaviour, 63, pp.295-308.
D'Sa, A. and Prasad, B. (2019). An IoT Based Framework For Activity Recognition Using Deep Learning Technique.
Hafetz, J., Jacobsohn, L., García-España, J., Curry, A. and Winston, F. (2010). Adolescent drivers’ perceptions of the advantages and disadvantages of abstention from in-vehicle cell phone use. Accident Analysis & Prevention, 42(6), pp.1570-1576.
Huang, Y., Lai, S.H. and Tai, S.H. (2019). Human Action Recognition Based on Temporal Pose CNN and Multi-dimensional Fusion. In: Leal-Taixé, L. and Roth, S. (eds) Computer Vision – ECCV 2018 Workshops. ECCV 2018. Lecture Notes in
Jegham, I., Ben Khalifa, A., Alouani, I. and Ali Mahjoub, M. (2018). Safe Driving: Driver Action Recognition using SURF Keypoints. In: 2018 30th International Conference on Microelectronics (ICM). Tunisia: IEEE, pp.60-63.
Lawrence, N. (2018). Distracted driving, cellphones seen as factors in pedestrian deaths. [online] The Star Online. Available at: <https://www.thestar.com.my/tech/tech-news/2018/07/02/distracted-driving-cellphones-seen-as-factors-in-pedestrian-deaths>.
Lee, J., Lee, J., Bärgman, J., Lee, J. and Reimer, B. (2018). How safe is tuning a radio?: using the radio tuning task as a benchmark for distracted driving. Accident Analysis & Prevention, 110, pp.29-37.
LUM, D. (2019). We have the third highest death rate from road accidents. [online] The Star Online. Available at:
Malaymail.com. (2019). Traffic police: More than 280,000 road accidents nationwide in first half 2019 Malay Mail. [online] Available at:
Motus (2018). Car Accidents Increase 12.3 Percent with the Rise of the Always-Connected Mobile Workforce, Finds New Motus Distracted Driving Report. Boston: Business Wire.
Pang, Y., Syu, S., Huang, Y. and Chen, B. (2020). An Advanced Deep Framework for Recognition of Distracted Driving Behaviors. In: 7th Global Conference on Consumer Electronics (GCCE 2018). Las Vegas: IEEE.
Rolison, J., Regev, S., Moutari, S. and Feeney, A. (2018). What are the factors that contribute to road accidents? An assessment of law enforcement views, ordinary drivers’ opinions, and road accident records. Accident Analysis & Prevention, 115, pp.11-24.
Wang, L., Wang, Z., Xiong, Y. and Qiao, Y. (2016). Temporal Segment Networks: Towards Good Practices for Deep Action Recognition. In: The 14th European Conference on Computer Vision (ECCV). Amsterdam.
Who.int. (2018). Road traffic injuries. [online] Available at: <https://www.who.int/news-room/fact-sheets/detail/road-traffic-injuries>.
Yan, C., Coenen, F. and Zhang, B. (2016). Driving posture recognition by convolutional neural networks. IET Computer Vision, 10(2), pp.103-114. doi:10.1049/iet-cvi.2015.0175
Yan, S., Teng, Y., S.Smith, J. and Zhang, B. (2016). Driver Behavior Recognition Based on Deep Convolutional Neural Networks. In: 2016 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD). Changsha: IEEE, pp.636-641.
Zhang, Y., Li, J., Guo, Y., Xu, C., Bao, J. and Song, Y. (2019). Vehicle Driving Behavior Recognition Based on Multi-View Convolutional Neural Network with Joint Data Augmentation. IEEE Transactions on Vehicular Technology, 68(5), pp.4223-4234.