© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

This paper proposes using the FASSD-Net model for semantic segmentation of human silhouettes. These silhouettes can later be used in applications that require specific characteristics of human interaction observed in video sequences, either to understand human activities or to identify people. Such applications are classified as high-level semantic understanding tasks. Semantic segmentation is one solution for human silhouette extraction, and convolutional neural networks (CNNs) have a clear advantage over traditional computer vision methods because they can learn feature representations appropriate to the segmentation task. In this work, the FASSD-Net model is used as a novel proposal that promises real-time segmentation of high-resolution images at more than 20 FPS. To evaluate the proposed scheme, we use the Cityscapes database, which consists of varied scenarios that represent human interaction with the environment; segmenting the people in these scenarios is difficult, which makes them well suited to evaluating our proposal. To adapt the FASSD-Net model to human silhouette semantic segmentation, the indexes of the 19 classes traditionally used for Cityscapes were modified, leaving only two labels: one for the class of interest, labeled person, and one for the background. The Cityscapes database includes the category “human”, composed of the “rider” and “person” classes; the rider class contains incomplete human silhouettes because of self-occlusions caused by the activity or the transport used. For this reason, we train the model using only the person class rather than the whole human category. The implementation of the FASSD-Net model with only two classes shows promising qualitative and quantitative results for the segmentation of human silhouettes.
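The label remapping described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' code. It assumes the ground truth is stored with the standard Cityscapes training IDs, where person is 11, rider is 12, and 255 marks ignored pixels. The function name `to_person_mask` is hypothetical, and keeping the ignore value unchanged is an assumption the abstract does not confirm.

```python
import numpy as np

# Assumption: standard Cityscapes training IDs (person = 11, ignore = 255).
PERSON_TRAIN_ID = 11
IGNORE_ID = 255


def to_person_mask(train_id_map, ignore_id=IGNORE_ID):
    """Collapse a 19-class Cityscapes label map into person vs. background.

    Person pixels become 1 and every other class, including rider,
    becomes 0. Ignored pixels keep their ignore value (an assumption).
    """
    labels = np.asarray(train_id_map)
    binary = np.zeros(labels.shape, dtype=np.uint8)  # background by default
    binary[labels == PERSON_TRAIN_ID] = 1            # class of interest
    binary[labels == ignore_id] = ignore_id          # leave void pixels out of the loss
    return binary


# Tiny hand-made label map: road (0), person (11), rider (12), void (255), bicycle (18).
example = np.array([[0, 11, 12],
                    [255, 11, 18]])
person_mask = to_person_mask(example)
```

In this sketch, rider pixels fall into the background, matching the decision to train on the person class only.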

Details

Title
FASSD-Net Model for Person Semantic Segmentation
Author
Garcia-Ortiz, Luis Brandon 1; Portillo-Portillo, Jose 1; Hernandez-Suarez, Aldo 1; Olivares-Mercado, Jesus 1; Sanchez-Perez, Gabriel 1; Toscano-Medina, Karina 1; Perez-Meana, Hector 1; Benitez-Garcia, Gibran 2

1 Instituto Politecnico Nacional, ESIME Culhuacan, Mexico City 04440, Mexico; [email protected] (J.P.-P.); [email protected] (A.H.-S.); [email protected] (J.O.-M.); [email protected] (G.S.-P.); [email protected] (K.T.-M.); [email protected] (H.P.-M.)
2 Department of Informatics, The University of Electro-Communications, Chofu-shi 182-8585, Japan; [email protected]
First page
1393
Publication year
2021
Publication date
2021
Publisher
MDPI AG
e-ISSN
20799292
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2544961034