King's College London

Research portal

MANet: A Two-stage Deep Learning Method for Classification of COVID-19 from Chest X-ray Images

Research output: Contribution to journal › Article › peer-review

Original language: English
Pages (from-to): 96-105
Number of pages: 10
Journal: Neurocomputing
Volume: 443
Early online date: 18 Mar 2021
E-pub ahead of print: 18 Mar 2021
Published: 5 Jul 2021

Bibliographical note

Funding Information: We would like to thank King's College London for their support, and the University of Cambridge Research Computing Service, funded by EPSRC Tier-2 capital grant EP/P020259/1, for providing computational resources. Publisher Copyright: © 2021 Elsevier B.V. All rights reserved.



Abstract

Early detection of infection is critical in the fight against the ongoing COVID-19 pandemic. Chest X-ray (CXR) imaging is an efficient screening technique for detecting lung infections. This paper aims to use CXR images to distinguish COVID-19-positive cases from four other classes: normal, tuberculosis (TB), bacterial pneumonia (BP), and viral pneumonia (VP). Existing research on COVID-19 classification has achieved some success with deep learning techniques but often lacks interpretability and generalization ability. Hence, we propose MANet, a two-stage classification method, to address these issues in computer-aided COVID-19 diagnosis. In the first stage, a segmentation model predicts lung masks for all CXR images to extract their lung regions. In the second stage, a classification CNN classifies the segmented CXR images into the five classes based only on the preserved lung regions. For this segmentation-based classification task, we propose the mask attention mechanism (MA), which uses the masks predicted in the first stage as spatial attention maps to adjust the features of the second-stage CNN. The MA spatial attention maps measure, for each feature, the percentage of masked pixels within its receptive field, and suppress feature values according to the overlap rate between their receptive fields and the segmented lung regions. In the evaluation, we segment the lung regions of all CXR images with a UNet with a ResNet backbone, and then classify the segmented images using four classic CNNs (ResNet34, ResNet50, VGG16, and Inception-v3), each with and without MA. The experimental results show that classification models with MA achieve higher classification accuracy, a more stable training process, and better interpretability and generalization ability than those without MA. Among the evaluated models, ResNet50 with MA achieves the highest average test accuracy over three runs, 96.32%, with a best single-run accuracy of 97.06%. Meanwhile, attention heat maps visualized with Grad-CAM indicate that models with MA base their predictions more reliably on the pathological patterns in lung regions. This further demonstrates the potential of MANet to provide clinicians with diagnosis assistance.
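To make the mask attention (MA) idea described above concrete, the following is a minimal PyTorch sketch written under our own assumptions: the record contains no code, so the module names (MaskAttention, MANetStage2), the average-pooling approximation of receptive-field overlap, and the choice to re-apply MA after every ResNet stage are illustrative guesses, not the authors' implementation.

# mask_attention_sketch.py
# Hypothetical sketch of the mask attention (MA) mechanism from the abstract.
# Assumptions (not from the paper's code): MA is implemented by downsampling
# the binary lung mask to each feature resolution with average pooling, and
# is applied after every ResNet stage of the second-stage classifier.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet34


class MaskAttention(nn.Module):
    """Scale a feature map by the fraction of lung-mask pixels that fall
    inside each feature location's (approximate) receptive field."""

    def forward(self, features: torch.Tensor, lung_mask: torch.Tensor) -> torch.Tensor:
        # features:  (N, C, H', W') CNN activations at some stage.
        # lung_mask: (N, 1, H, W) binary mask predicted by the stage-1 UNet.
        # Average-pooling a {0, 1} mask down to (H', W') yields, per location,
        # the percentage of masked pixels in the corresponding input window,
        # a cheap stand-in for the exact receptive-field overlap rate.
        attn = F.adaptive_avg_pool2d(lung_mask.float(), features.shape[-2:])
        return features * attn  # suppress activations outside the lung regions


class MANetStage2(nn.Module):
    """Hypothetical stage-2 classifier: a ResNet34 whose stage outputs are
    modulated by MA before the final 5-way prediction."""

    def __init__(self, num_classes: int = 5):
        super().__init__()
        backbone = resnet34(weights=None)
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1,
                                  backbone.relu, backbone.maxpool)
        self.stages = nn.ModuleList([backbone.layer1, backbone.layer2,
                                     backbone.layer3, backbone.layer4])
        self.ma = MaskAttention()
        self.head = nn.Linear(backbone.fc.in_features, num_classes)

    def forward(self, x: torch.Tensor, lung_mask: torch.Tensor) -> torch.Tensor:
        x = self.stem(x)
        for stage in self.stages:
            x = self.ma(stage(x), lung_mask)  # re-apply MA at each resolution
        x = F.adaptive_avg_pool2d(x, 1).flatten(1)
        return self.head(x)


if __name__ == "__main__":
    model = MANetStage2()
    cxr = torch.randn(2, 3, 224, 224)                   # segmented CXR images
    mask = (torch.rand(2, 1, 224, 224) > 0.5).float()   # stage-1 lung masks
    print(model(cxr, mask).shape)                       # torch.Size([2, 5])

The average-pooling step is the key approximation here: pooling a binary mask to the feature resolution gives, at each location, exactly the fraction of mask pixels in the corresponding pooling window, which matches the abstract's "percentage of masked pixels in the receptive field" without computing exact receptive fields.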


