Prescience: Probabilistic Guidance on the Retraining Conundrum for Malware Detection

Amit Deo, Santanu Kumar Dash, Guillermo Suarez-Tangil, Vladimir Vovk, Lorenzo Cavallaro

Research output: Chapter in Book/Report/Conference proceeding > Conference paper > peer-review


Malware evolves perpetually and relies on increasingly sophisticated
attacks to supersede defense strategies. Data-driven
approaches to malware detection run the risk of becoming
rapidly antiquated. Keeping pace with malware
requires models that are periodically enriched with fresh
knowledge, commonly known as retraining. In this work,
we propose the use of Venn-Abers predictors for assessing
the quality of binary classification tasks as a first step towards
identifying antiquated models. One of the key benefits
behind the use of Venn-Abers predictors is that they are
automatically well calibrated and offer probabilistic guidance
on the identification of nonstationary populations of
malware. Our framework is agnostic to the underlying classification
algorithm and can then be used for building better
retraining strategies in the presence of concept drift. Results
obtained over a timeline-based evaluation with about 90K
samples show that our framework can identify when models
tend to become obsolete.
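
To make the idea of a Venn-Abers predictor more concrete, the following is a minimal sketch of the inductive variant, assuming it is built on isotonic regression over a held-out calibration set. The function names `pava` and `venn_abers` and the toy scores are hypothetical illustrations and are not taken from the paper. How the paper turns these outputs into a retraining signal is not shown here.

```python
def pava(values, weights):
    """Weighted pool-adjacent-violators: least-squares non-decreasing fit.

    `values` must already be ordered by increasing score.
    Returns one fitted value per input value.
    """
    blocks = []  # each block is [mean, total_weight, number_of_inputs_pooled]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # Pool the last two blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, n1 + n2])
    fitted = []
    for mean, _, n in blocks:
        fitted.extend([mean] * n)
    return fitted


def venn_abers(cal_scores, cal_labels, test_score):
    """Return (p0, p1), a pair of probabilities that the test object is positive.

    The test score is added to the calibration set twice, once with the
    hypothetical label 0 and once with 1, and isotonic regression is refit
    each time. The gap p1 - p0 can be read as a measure of uncertainty.
    """
    probs = []
    for hypothetical_label in (0, 1):
        pairs = list(zip(cal_scores, cal_labels)) + [(test_score, hypothetical_label)]
        # Group tied scores so each distinct score appears once, with a weight.
        sums, counts = {}, {}
        for s, y in pairs:
            sums[s] = sums.get(s, 0) + y
            counts[s] = counts.get(s, 0) + 1
        xs = sorted(sums)
        fitted = pava([sums[x] / counts[x] for x in xs], [counts[x] for x in xs])
        probs.append(fitted[xs.index(test_score)])
    return probs[0], probs[1]


# Toy calibration data: scores from any underlying classifier, 1 = malware.
cal_scores = [0.05, 0.2, 0.35, 0.4, 0.6, 0.65, 0.8, 0.95]
cal_labels = [0, 0, 1, 0, 1, 0, 1, 1]
p0, p1 = venn_abers(cal_scores, cal_labels, 0.5)
print(f"p0={p0:.3f}, p1={p1:.3f}, width={p1 - p0:.3f}")
```

The sketch only consumes scores, so it does not depend on which classifier produced them, which is in keeping with the abstract's claim that the framework is agnostic to the underlying algorithm. Under this reading, one plausible use is to track the pair (p0, p1) on newer samples over time, although the specific criterion is an assumption here.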
Original language: English
Title of host publication: ACM Workshop on Artificial Intelligence and Security (AISec)
Publication status: Published - 28 Oct 2016

