Systematic Review of Convolutional Neural Networks and Digital Twinning for Automotive Human-Machine Interaction via Hand Gesture Recognition
Contributors
Dr Neethu P S
Track
Engineering and Sciences
License
Copyright (c) 2026 Sustainable Global Societies Initiative

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Abstract
Distracted driving is aggravated by the growing complexity of in-vehicle technology interfaces. Hand gestures have been proposed as a less demanding way to control vehicle functions, and Convolutional Neural Networks (CNNs) and Digital Twin (DT) technology are both used to advance this idea. This study investigates what prior research has established about using CNNs and DT technology to support safer driving. Twenty-five research articles were analyzed, covering CNN-based hand gesture recognition (HGR) and DT applications in automobiles. The research methodology section outlines the objectives of the study and the results achieved. The findings indicate that CNNs recognize hand gestures with accuracy above 95%, which makes them highly effective for HGR. CNN-based HGR also reduces the time drivers spend looking away from the road by 30 to 40% compared with touch interfaces. DT frameworks are likewise effective: they enable real-time simulation, keep the system synchronized in less than 10 ms, and allow thorough testing under many different conditions. Limitations include the complexity of hybrid deep learning models, reliance on high-quality sensor data, and the difficulty of generalizing across non-standardized gesture sets and variation among drivers. Future research should focus on optimizing CNN architectures for embedded systems, developing standardized DT-based benchmarks, and conducting long-term, real-world studies of driver acceptance and habituation effects.
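For readers unfamiliar with the operation that underlies CNN-based HGR, the following is a minimal sketch of one convolution, activation, and pooling stage applied to a toy image. It is not taken from any of the reviewed studies: the function names, the 5x5 "image", and the edge-detecting kernel are illustrative assumptions, and real HGR systems learn many such kernels from data and stack many such stages.

```python
# Illustrative sketch only: one conv -> ReLU -> max-pool stage in pure Python.
# All names and values here are hypothetical, chosen for readability.

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation (what CNN 'convolution' layers compute)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Element-wise rectified linear unit: negative responses become 0."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool2x2(feature_map):
    """Non-overlapping 2x2 max pooling; a trailing odd row or column is dropped."""
    return [
        [
            max(feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1])
            for j in range(0, len(feature_map[0]) - 1, 2)
        ]
        for i in range(0, len(feature_map) - 1, 2)
    ]


# Toy 5x5 "image": dark on the left, bright on the right (a vertical edge).
toy_image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-written vertical-edge kernel; a trained CNN would learn its kernels.
edge_kernel = [[-1, 0, 1],
               [-1, 0, 1],
               [-1, 0, 1]]

features = max_pool2x2(relu(conv2d(toy_image, edge_kernel)))
print(features)
```

The kernel responds strongly where the brightness changes from left to right, which is the kind of low-level feature early CNN layers tend to pick up before later layers combine such features into gesture classes.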