TY - CHAP
T1 - Enabling Fair ML Evaluations for Security
AU - Pendlebury, Feargus
AU - Pierazzi, Fabio
AU - Jordaney, Roberto
AU - Kinder, Johannes
AU - Cavallaro, Lorenzo
N1 - ACM Conference on Computer and Communications Security, CCS '18 ; Conference date: 15-10-2018 Through 19-10-2018
PY - 2018/10/15
Y1 - 2018/10/15
N2 - Machine learning is widely used in security research to classify malicious activity, ranging from malware to malicious URLs and network traffic. However, published performance numbers often seem to leave little room for improvement and, due to a wide range of datasets and configurations, cannot be used to directly compare alternative approaches; moreover, most evaluations have been found to suffer from experimental bias which inflates results. In this manuscript we discuss the implementation of Tesseract, an open-source tool to evaluate the performance of machine learning classifiers in a security setting mimicking a deployment with typical data feeds over an extended period of time. In particular, Tesseract allows for a fair comparison of different classifiers in a realistic scenario, without disadvantaging any given classifier. Tesseract is available as open source to give the academic community a way to report sound and comparable performance results, and to help practitioners decide which system to deploy under specific budget constraints.
KW - Malware
KW - Machine Learning
KW - Experimental Bias
U2 - 10.1145/3243734.3278505
DO - 10.1145/3243734.3278505
M3 - Conference paper
T3 - Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security
SP - 2264
EP - 2266
BT - Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security
ER -