King's College London

Investigating Labelless Drift Adaptation for Malware Detection

Research output: Chapter in Book/Report/Conference proceeding > Conference paper > peer-review

Zeliang Kan, Feargus Pendlebury, Fabio Pierazzi, Lorenzo Cavallaro

Original language: English
Title of host publication: Proceedings of the 14th ACM Workshop on Artificial Intelligence and Security (AISec)
Publisher: ACM
Accepted/In press: 15 Sep 2021

Documents

  • deplusplus

    deplusplus.pdf, 2.53 MB, application/pdf

    Uploaded date: 30 Sep 2021

    Version: Accepted author manuscript

Abstract

The evolution of malware has long plagued machine learning-based detection systems, as malware authors develop innovative strategies to evade detection and chase profits. This induces concept drift as the test distribution diverges from the training, causing performance decay that requires constant monitoring and adaptation.

In this work, we analyze the adaptation strategy used by DroidEvolver, a state-of-the-art learning system that self-updates using pseudo-labels to avoid the high overhead associated with obtaining a new ground truth. After removing sources of experimental bias present in the original evaluation, we identify a number of flaws in the generation and integration of these pseudo-labels, leading to a rapid onset of performance degradation as the model poisons itself. We propose DroidEvolver++, a more robust variant of DroidEvolver, to address these issues and highlight the role of pseudo-labels in addressing concept drift. We test the tolerance of the adaptation strategy versus different degrees of pseudo-label noise and propose the adoption of methods to ensure only high-quality pseudo-labels are used for updates.

Ultimately, we conclude that the use of pseudo-labeling remains a promising solution to limitations on labeling capacity, but great care must be taken when designing update mechanisms to avoid negative feedback loops and self-poisoning which have catastrophic effects on performance.
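
The general idea of updating a model only on high-quality pseudo-labels can be illustrated with a minimal sketch. This is not the DroidEvolver or DroidEvolver++ implementation: the linear model, the simple additive update rule, and the names `score`, `self_update`, `threshold`, and `lr` are all assumptions chosen for brevity. The sketch shows a self-update loop that assigns each unlabeled sample a pseudo-label from the model's own prediction and skips any sample whose prediction is not confident enough.

```python
from typing import List, Sequence, Tuple


def score(weights: Sequence[float], x: Sequence[float]) -> float:
    """Signed decision score of a linear model; the sign is the predicted class."""
    return sum(w * xi for w, xi in zip(weights, x))


def self_update(
    weights: Sequence[float],
    unlabeled: Sequence[Sequence[float]],
    threshold: float = 1.0,  # hypothetical confidence cut-off
    lr: float = 0.1,         # hypothetical learning rate
) -> Tuple[List[float], int]:
    """Update the model on confident pseudo-labels only.

    Returns the new weights and the number of samples used for updating.
    """
    w = list(weights)
    used = 0
    for x in unlabeled:
        s = score(w, x)
        if abs(s) < threshold:
            # Low-confidence prediction: its pseudo-label is likely noisy,
            # so the sample is excluded from the update.
            continue
        pseudo_label = 1.0 if s > 0 else -1.0
        # Illustrative additive update toward the pseudo-label.
        w = [wi + lr * pseudo_label * xi for wi, xi in zip(w, x)]
        used += 1
    return w, used


if __name__ == "__main__":
    new_weights, n_used = self_update(
        [1.0, 0.0],
        [[2.0, 0.0], [0.5, 0.0], [-3.0, 1.0]],
    )
    print(new_weights, n_used)
```

Because every update reinforces the model's current belief, a wrong but confident prediction pushes the model further in the wrong direction; this is the self-poisoning risk the abstract describes, and the reason the choice of filtering criterion matters.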
