Model validation using mutated training labels: An exploratory study

Jie Zhang, Mark Harman, Benjamin Guedj, Earl Barr

Research output: Contribution to journal › Article › peer-review


Abstract

We introduce an exploratory study of Mutation Validation (MV), a model validation method for supervised learning that uses mutated training labels. MV mutates the training data labels, retrains the model on the mutated data, and then uses the metamorphic relation capturing the consequent change in training performance to assess model fit. It uses neither a validation set nor a test set. The intuition underpinning MV is that overfitting models tend to fit noise in the training data. MV does not aim to replace out-of-sample validation; instead, we provide the first exploratory study of the possibility of using MV as a complement to out-of-sample validation. We explore 8 learning algorithms, 18 datasets, and 5 types of hyperparameter tuning tasks. Our results demonstrate that MV complements cross-validation and test accuracy well in model selection and hyperparameter tuning tasks. MV deserves more attention from developers when simplicity, sustainability, security (e.g., defending against training data attacks), and interpretability of the built models are required.
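The MV procedure described in the abstract can be sketched in a few lines. The following is a minimal illustration assuming a scikit-learn-style classifier; the mutation ratio, the use of training accuracy as the performance measure, and the mv_score helper are illustrative assumptions, not the paper's exact formulation.

    # Minimal sketch of Mutation Validation (hypothetical helper, not the
    # paper's implementation): mutate a fraction of training labels,
    # retrain, and measure the change in training accuracy.
    import numpy as np
    from sklearn.base import clone

    def mv_score(model, X, y, mutation_ratio=0.2, seed=0):
        """Drop in training accuracy after label mutation (assumed metric).

        Intuition from the abstract: an overfitting model also fits the
        injected label noise and so loses little training accuracy, while
        a well-fitting model loses accuracy roughly in proportion to the
        fraction of mutated labels.
        """
        rng = np.random.default_rng(seed)

        # Fit on the original labels and record training accuracy.
        acc_original = clone(model).fit(X, y).score(X, y)

        # Flip a fraction of labels to a different, randomly chosen class.
        y_mut = np.asarray(y).copy()
        classes = np.unique(y_mut)
        idx = rng.choice(len(y_mut), size=int(mutation_ratio * len(y_mut)),
                         replace=False)
        for i in idx:
            y_mut[i] = rng.choice(classes[classes != y_mut[i]])

        # Retrain on the mutated labels and measure the consequent change
        # in training performance (the metamorphic relation).
        acc_mutated = clone(model).fit(X, y_mut).score(X, y_mut)
        return acc_original - acc_mutated

For example, an unconstrained decision tree, which can memorise the mutated labels, would show a smaller accuracy drop than a depth-limited tree trained on the same data, flagging the former as a likely overfit. Note that no validation or test split is needed for this check.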

Original language: English
Article number: 126116
Journal: NEUROCOMPUTING
Volume: 539
Issue number: 6
DOIs
Publication status: Published - 15 Apr 2023
