Jordan Wylie, Zhiyuan Tan, Ahmed Al-Dubai, Jianzhen Wang. 2020, 19(4): 28-41.
Each Android malware instance belongs to a specific family whose members perform a similar set of actions and share key characteristics. Being able to identify Android malware families is critical to addressing the security challenges posed by malware and its damaging consequences. Ensemble learning is believed to improve on single models in solving computational intelligence problems by strategically combining the decisions of multiple machine learning models. This paper therefore presents a study of the application of ensemble learning to Android malware family identification/classification and of its effectiveness against single-model-based identification approaches. To conduct a fair evaluation, a prototype of an ensemble learning based Android malware classification system was developed for this work, in which a Weighted Majority Voting (WMV) approach determines the importance of individual models (i.e., Support Vector Machine, k-Nearest Neighbour, ExtraTrees, Multilayer Perceptron, and Logistic Regression) to the final decision. The results of the evaluation, conducted using publicly accessible malware datasets (i.e., Drebin and UpDroid) and recent samples from GitHub repositories, show that the ensemble learning approach does not always perform better than single-model learning approaches. The performance of ensemble learning based malware family classification is heavily influenced by several factors, in particular the features, the parameter values, and the weights assigned to individual models.
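
The role of the weights in Weighted Majority Voting can be illustrated with a minimal sketch. This is not the prototype described in the abstract: the function name `weighted_majority_vote`, the model keys, the weight values, and the example predictions are all hypothetical, chosen only to show how the same five votes can yield different family labels under different weightings.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label backed by the largest total model weight.

    predictions: dict mapping model name -> predicted family label
    weights:     dict mapping model name -> non-negative weight
    (Both structures are assumptions made for this illustration.)
    """
    totals = defaultdict(float)
    for model, label in predictions.items():
        totals[label] += weights[model]
    # Iterating over sorted labels makes ties resolve deterministically
    # in favour of the alphabetically first label.
    return max(sorted(totals), key=lambda label: totals[label])


# Hypothetical per-model predictions for one sample.
predictions = {
    "svm": "FakeInstaller",
    "knn": "DroidKungFu",
    "extra_trees": "FakeInstaller",
    "mlp": "DroidKungFu",
    "logreg": "DroidKungFu",
}

# Hypothetical weights that favour two of the five models.
weights = {"svm": 0.4, "knn": 0.1, "extra_trees": 0.3, "mlp": 0.1, "logreg": 0.1}

# Equal weights reduce the scheme to plain majority voting.
uniform = {model: 1.0 for model in predictions}

weighted_label = weighted_majority_vote(predictions, weights)
majority_label = weighted_majority_vote(predictions, uniform)
```

With these assumed values, the weighted vote selects the label backed by the two heavily weighted models, while the unweighted vote selects the label chosen by three of the five models. This is one way to see why the weights assigned to individual models can strongly influence the ensemble's final decision.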