A Deep Learning Framework for Skeleton-Based Human Recognition Using RGB-D Data
Contributors
G Vinoda Reddy
Track
Engineering, Sciences, Mathematics & Computations
License
Copyright (c) 2025 Sustainable Global Societies Initiative

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Abstract
Human Action Recognition (HAR) using 3D skeleton joint data has emerged as a key research area in Human–Computer Interaction (HCI) and visual surveillance applications. However, significant challenges arise due to viewpoint variations, which adversely affect the robustness of existing HAR systems. To address this limitation, a novel Skeleton Joint Descriptor (SJD) is proposed, which effectively captures and compensates for viewpoint changes. The proposed method leverages the stability of torso joints to transform all skeleton joints from the Cartesian coordinate system into a view-invariant coordinate framework. Furthermore, redundant joints are identified and eliminated by assigning weights based on their accumulated motion energy over an action sequence. The effectiveness of the proposed framework is validated through extensive experiments conducted on the benchmark NTU RGB+D dataset, comprising 60 distinct human actions. The proposed method achieves effective recognition accuracy under both the cross-view and cross-subject evaluation protocols. Comparative analysis with recent state-of-the-art methods clearly demonstrates the superiority of the proposed approach.
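The sketch below is a minimal illustration, not the authors' implementation, of the two ideas the abstract describes: re-expressing skeleton joints in a view-invariant frame anchored at stable torso joints, and weighting joints by their accumulated motion energy over an action sequence. The joint indices (hip, spine, shoulders) and the 25-joint layout are hypothetical placeholders chosen for the example.

```python
import numpy as np

# Assumed (hypothetical) joint indices for the torso anchor points.
HIP, SPINE, L_SHOULDER, R_SHOULDER = 0, 1, 4, 8

def view_invariant_frame(skeleton):
    """Build an orthonormal basis from torso joints of one frame.

    skeleton: (J, 3) array of joint coordinates in camera (Cartesian) space.
    Returns (R, origin), where the rows of R are the new basis axes.
    """
    origin = skeleton[HIP]
    up = skeleton[SPINE] - origin                         # torso "up" direction
    right = skeleton[R_SHOULDER] - skeleton[L_SHOULDER]   # shoulder line
    up /= np.linalg.norm(up)
    right -= np.dot(right, up) * up                       # orthogonalize against "up"
    right /= np.linalg.norm(right)
    forward = np.cross(up, right)                         # completes the right-handed basis
    return np.stack([right, up, forward]), origin

def transform_sequence(seq):
    """Map every frame of a (T, J, 3) sequence into the torso-anchored frame."""
    out = np.empty_like(seq)
    for t, frame in enumerate(seq):
        R, origin = view_invariant_frame(frame)
        out[t] = (frame - origin) @ R.T                    # express joints in the new axes
    return out

def motion_energy_weights(seq):
    """Weight joints by accumulated motion energy; near-static (redundant)
    joints receive small weights and can be suppressed or removed."""
    disp = np.diff(seq, axis=0)                            # (T-1, J, 3) frame-to-frame motion
    energy = np.sum(np.linalg.norm(disp, axis=2), axis=0)  # (J,) accumulated motion energy
    return energy / energy.sum()

# Example usage on a synthetic 60-frame, 25-joint sequence.
seq = np.random.rand(60, 25, 3)
vi_seq = transform_sequence(seq)
weights = motion_energy_weights(vi_seq)
```

In this sketch, the torso-derived basis makes the joint coordinates independent of the camera viewpoint, and the normalized motion-energy weights provide one plausible way to identify low-motion joints as redundant, in the spirit of the descriptor outlined above.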