King's College London

Research portal

MFNet: Multi-class Few-shot Segmentation Network with Pixel-wise Metric Learning

Research output: Contribution to journal · Article · peer-review

Miao Zhang, Miaojing Shi, Li Li

Original language: English
Journal: IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
Accepted/In press: 14 Jul 2022

Documents

  • MFNet-TCSVT

    final_sub.pdf, 5.73 MB, application/pdf

    Uploaded date: 13 Aug 2022

    Version: Accepted author manuscript

King's Authors

Abstract

In visual recognition tasks, few-shot learning requires the ability to learn object categories from few support examples. Its renewed popularity, in light of the development of deep learning, has mainly concerned image classification. This work focuses on few-shot semantic segmentation, which remains a largely unexplored field; the few recent advances are often restricted to single-class few-shot segmentation. In this paper, we first present a novel multi-way (class) encoding and decoding architecture that effectively fuses multi-scale query information and multi-class support information into a single query-support embedding. Multi-class segmentation is decoded directly from this embedding. For better feature fusion, a multi-level attention mechanism is proposed within the architecture, comprising attention for support feature modulation and attention for multi-scale combination. Lastly, to enhance the learning of the embedding space, an additional pixel-wise metric learning module is introduced, with a triplet loss formulated at the pixel level of the query-support embedding. Extensive experiments on the standard benchmarks PASCAL-5i and COCO-20i show clear benefits of our method over the state of the art in multi-class few-shot segmentation.
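The pixel-wise triplet loss described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function name, the random triplet sampling, and the Euclidean distance are assumptions for illustration; the abstract only states that a triplet loss is formulated at the pixel level of the query-support embedding.

```python
import numpy as np

def pixelwise_triplet_loss(emb, labels, margin=0.5, seed=0):
    """Illustrative sketch of a pixel-level triplet loss.

    emb:    (N, D) array of per-pixel embeddings (flattened query-support
            embedding); N pixels, D embedding dimensions.
    labels: (N,) array of per-pixel class indices.

    For each anchor pixel, sample one positive pixel (same class) and one
    negative pixel (different class), then apply the standard hinge:
        max(0, ||a - p|| - ||a - n|| + margin)
    and average over all valid anchors.
    """
    emb = np.asarray(emb, dtype=float)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    n = len(emb)
    losses = []
    for i in range(n):
        pos = np.flatnonzero((labels == labels[i]) & (np.arange(n) != i))
        neg = np.flatnonzero(labels != labels[i])
        if len(pos) == 0 or len(neg) == 0:
            continue  # anchor has no valid triplet
        p = emb[rng.choice(pos)]
        q = emb[rng.choice(neg)]
        d_ap = np.linalg.norm(emb[i] - p)  # anchor-positive distance
        d_an = np.linalg.norm(emb[i] - q)  # anchor-negative distance
        losses.append(max(0.0, d_ap - d_an + margin))
    return float(np.mean(losses)) if losses else 0.0
```

When the classes are well separated in the embedding space the hinge is inactive and the loss is zero; when pixels of different classes collapse to the same point the loss equals the margin, which is the behaviour a metric-learning module of this kind is meant to penalise.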

