King's College London

Research portal

EC2: Ensemble Clustering and Classification for Predicting Android Malware Families

Research output: Contribution to journal › Article › peer-review

Tanmoy Chakraborty, Fabio Pierazzi, V. S. Subrahmanian

Original language: English
Article number: 8013726
Pages (from-to): 262-277
Number of pages: 16
Journal: IEEE Transactions on Dependable and Secure Computing
Issue number: 2
Early online date: 10 Aug 2017
Accepted/In press: 2 Aug 2017
E-pub ahead of print: 10 Aug 2017
Published: 1 Mar 2020




As the most widely used mobile platform, Android is also the biggest target for mobile malware. Given the increasing number of Android malware variants, detecting malware families is crucial so that security analysts can identify situations where signatures of a known malware family can be adapted, as opposed to manually inspecting the behavior of all samples. We present EC2 (Ensemble Clustering and Classification), a novel algorithm for discovering Android malware families of varying sizes, ranging from very large to very small families (even if previously unseen). We present a performance comparison of several traditional classification and clustering algorithms for Android malware family identification on DREBIN, the largest public Android malware dataset with labeled families. We use the output of both supervised classifiers and unsupervised clustering to design EC2. Experimental results on both the DREBIN and the more recent Koodous malware datasets show that EC2 accurately detects both small and large families, outperforming several comparative baselines. Furthermore, we show how to automatically characterize and explain unique behaviors of specific malware families, such as FakeInstaller, MobileTx, and Geinimi. In short, EC2 provides an early warning system for emerging malware families, a robust predictor of the family to which a new malware sample belongs (when that family is not new), and novel strategies for data-driven understanding of malware behaviors.

