
Abstract

The constant threat of malware makes studying its behavior an ongoing task. Malware identification and classification are better addressed by analyzing software behavior than by relying on conventional hash-based signatures. The sequence of API calls collected while a program executes represents its behavior. Using API sequences gathered while malware was executed under controlled conditions, this work addresses the problem of choosing influential APIs for malware classification. The proposed feature selection method, SelectAPI, uses TF-IDF API embeddings to select key features, i.e., significant APIs, that classify malware more accurately. To validate the importance of the chosen APIs, two machine learning models are trained and evaluated: Random Forest, which ensembles several estimators, and Support Vector Classifier, a standard non-linear model. SelectAPI achieves accuracy, macro-averaged precision, macro-averaged recall, and macro-averaged F1-score of 0.76, 0.77, 0.76, and 0.76, respectively, on MAL-API-2019, an open benchmark multiclass malware dataset of dynamic API sequences. These results surpass the previously best-known accuracy of 0.60 and reported F1-score of 0.61.
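The idea of weighting API calls by TF-IDF and keeping only the most influential ones can be illustrated with a minimal sketch. The abstract does not state the exact criterion SelectAPI uses, so the ranking rule below (spread of class-mean TF-IDF weights) is an assumption made purely for illustration, as are the function names `tfidf_vectors` and `top_k_apis`, the toy traces, and the class labels.

```python
import math
from collections import Counter


def tfidf_vectors(sequences):
    """Return (vocabulary, per-trace {api: tf-idf weight}) for API-call sequences."""
    n_docs = len(sequences)
    # Document frequency: number of traces in which each API appears at least once.
    df = Counter(api for seq in sequences for api in set(seq))
    vocab = sorted(df)
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    idf = {api: math.log((1 + n_docs) / (1 + df[api])) + 1.0 for api in vocab}
    vectors = []
    for seq in sequences:
        counts = Counter(seq)
        total = len(seq)
        # Term frequency is the share of calls in this trace made to the API.
        vectors.append({api: (c / total) * idf[api] for api, c in counts.items()})
    return vocab, vectors


def top_k_apis(sequences, labels, k):
    """Hypothetical ranking rule (not taken from the paper): score each API by
    how far apart its mean TF-IDF weight is across classes, then keep the top k."""
    vocab, vectors = tfidf_vectors(sequences)
    classes = sorted(set(labels))
    scores = {}
    for api in vocab:
        means = []
        for cls in classes:
            rows = [v.get(api, 0.0) for v, y in zip(vectors, labels) if y == cls]
            means.append(sum(rows) / len(rows))
        scores[api] = max(means) - min(means)
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(vocab, key=lambda a: (-scores[a], a))[:k]


# Toy traces and labels, invented for illustration only.
traces = [
    ["NtCreateFile", "NtWriteFile", "NtClose"],
    ["NtCreateFile", "NtWriteFile", "NtWriteFile", "NtClose"],
    ["InternetOpenA", "InternetReadFile", "NtClose"],
    ["InternetOpenA", "InternetReadFile", "InternetReadFile", "NtClose"],
]
labels = ["dropper", "dropper", "downloader", "downloader"]

selected = top_k_apis(traces, labels, k=4)
print(selected)
```

In this toy data `NtClose` appears in every trace with the same average weight in both classes, so it scores zero and is not selected. The reduced feature set could then be passed to a classifier such as a Random Forest or Support Vector Classifier, as the abstract describes.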

Details

Title
Behavioural Analysis of Malware by Selecting Influential API Through TF-IDF API Embeddings
Author
Volume
16
Issue
5
Publication year
2025
Publication date
2025
Publisher
Science and Information (SAI) Organization Limited
Place of publication
West Yorkshire
Country of publication
United Kingdom
ISSN
2158107X
e-ISSN
21565570
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
ProQuest document ID
3222641054
Document URL
https://www.proquest.com/scholarly-journals/behavioural-analysis-malware-selecting/docview/3222641054/se-2?accountid=208611
Copyright
© 2025. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-06-24
Database
ProQuest One Academic