King's College London

Research portal

ICAM: Interpretable Classification via Disentangled Representations and Feature Attribution Mapping

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review

Original language: English
Title of host publication: Advances in Neural Information Processing Systems
Published: 6 Dec 2020


King's Authors


Feature attribution (FA), or the assignment of class-relevance to different locations
in an image, is important for many classification problems but is particularly crucial
within the neuroscience domain, where accurate mechanistic models of behaviour
or disease require knowledge of all features discriminative of a trait. At the same
time, predicting class relevance from brain images is challenging as phenotypes
are typically heterogeneous, and changes occur against a background of significant
natural variation. Here, we present a novel framework for creating class-specific
FA maps through image-to-image translation. We propose the use of a VAE-GAN
to explicitly disentangle class relevance from background features for improved
interpretability properties, which results in meaningful FA maps. We validate our
method on 2D and 3D brain image datasets of dementia (ADNI dataset), ageing
(UK Biobank), and (simulated) lesion detection. We show that FA maps generated
by our method outperform baseline FA methods when validated against ground
truth. More significantly, our approach is the first to use latent space sampling to
support exploration of phenotype variation. Our code will be available online at

