© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Previous research on 3D skeleton-based human action recognition has frequently relied on sequence-wise viewpoint normalization, which adjusts the view direction of every segmented action sequence. This approach is typically robust to the viewpoint variations found in short-term videos, which are common in public datasets. However, our preliminary investigation of complex action sequences, such as discussions or smoking, reveals that it struggles to capture the intricacies of such actions. To address these view-dependency issues, we propose a straightforward yet effective sequence-wise augmentation technique. It rotates human key points around either the z-axis or the spine vector, effectively creating variations in viewing direction, and thereby makes action recognition models more robust, particularly to changes in viewing direction that occur mainly within the horizontal plane (azimuth). We evaluate the robustness of this approach against real-world viewpoint variations through extensive empirical studies on multiple public datasets and an additional set of custom action sequences. Despite the simplicity of our approach, it consistently improves action recognition accuracy in our experiments. Compared to sequence-wise viewpoint normalization used with deep learning models such as Conv1D, LSTM, and Transformer, our approach showed a relative increase in accuracy of 34.42% for the z-axis and 10.86% for the spine vector.
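
The rotation-based augmentation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; it rests on several assumptions that the abstract does not confirm: a sequence is stored as an array of shape (frames, joints, 3), one angle is applied to the whole sequence, the spine vector is taken from the first frame as the direction from a hip joint to a neck joint, and the names `rotate_about_axis`, `augment_sequence`, `hip_idx`, and `neck_idx` are hypothetical.

```python
import numpy as np


def rotate_about_axis(points, axis, angle_rad):
    """Rotate points of shape (..., 3) about an axis through the origin.

    The rotation matrix is built with Rodrigues' rotation formula.
    """
    k = np.asarray(axis, dtype=float)
    k = k / np.linalg.norm(k)  # unit rotation axis
    # Skew-symmetric cross-product matrix of k
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    R = np.eye(3) + np.sin(angle_rad) * K + (1.0 - np.cos(angle_rad)) * (K @ K)
    # Row-vector convention: p_rotated = R @ p for every point p
    return points @ R.T


def augment_sequence(seq, angle_rad, mode="z", hip_idx=0, neck_idx=1):
    """Apply one rotation to a whole skeleton sequence of shape (T, J, 3).

    mode="z":     rotate about the world z-axis through the origin.
    mode="spine": rotate about the first-frame hip-to-neck vector
                  (assumed definition of the spine vector; the joint
                  indices are hypothetical and dataset-dependent).
    """
    seq = np.asarray(seq, dtype=float)
    if mode == "z":
        origin = np.zeros(3)
        axis = np.array([0.0, 0.0, 1.0])
    elif mode == "spine":
        origin = seq[0, hip_idx]
        axis = seq[0, neck_idx] - seq[0, hip_idx]
    else:
        raise ValueError("mode must be 'z' or 'spine'")
    # Shift so the axis passes through the origin, rotate, shift back
    return rotate_about_axis(seq - origin, axis, angle_rad) + origin
```

Under these assumptions, a training pipeline could draw one random angle per sequence, for example `augment_sequence(seq, rng.uniform(-np.pi, np.pi), mode="spine")` with `rng = np.random.default_rng()`. The angle range is also an assumption here. Because the transform is a rigid rotation, bone lengths are preserved and only the apparent viewing direction changes.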

Details

Title
Enhancing Robustness of Viewpoint Changes in 3D Skeleton-Based Human Action Recognition
Author
Park, Jinyoon 1; Kim, Chulwoong 2; Seung-Chan, Kim 3

1 Machine Learning Systems Lab., Department of Sport Interaction Science, Sungkyunkwan University, Suwon 16419, Republic of Korea; TAIIPA (Taean AI Industry Promotion Agency), Taean 32154, Republic of Korea
2 TAIIPA (Taean AI Industry Promotion Agency), Taean 32154, Republic of Korea
3 Machine Learning Systems Lab., Department of Sport Interaction Science, Sungkyunkwan University, Suwon 16419, Republic of Korea
First page
3280
Publication year
2023
Publication date
2023
Publisher
MDPI AG
e-ISSN
2227-7390
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2849041742