King's College London Research Portal

Object-aware Multimodal Named Entity Recognition in Social Media Posts with Adversarial Learning

Research output: Contribution to journal › Article › peer-review

Changmeng Zheng, Zhiwei Wu, Tao Wang, Yi Cai, Qing Li

Original language: English
Journal: IEEE Transactions on Multimedia
Published: 3 Aug 2020


Abstract

Named Entity Recognition (NER) in social media posts is challenging because the texts are usually short and lack context. Most recent works show that visual information can boost NER performance, since images provide complementary contextual information for texts. However, image-level features ignore the mapping relations between fine-grained visual objects and textual entities, which leads to detection errors for entities of different types. To better exploit visual and textual information in NER, we propose an adversarial gated bilinear attention neural network (AGBAN). The model jointly extracts entity-related features from both visual objects and texts, and leverages adversarial training to map the two different representations into a shared representation. As a result, domain information contained in an image can be transferred and applied to extracting named entities in the text associated with that image. Experimental results on a Tweets dataset demonstrate that our model outperforms state-of-the-art methods. Moreover, we systematically evaluate the effectiveness of the proposed gated bilinear attention network in capturing the interactions between multimodal features, i.e., visual objects and textual words. Our results indicate that adversarial training can effectively exploit commonalities across heterogeneous data sources, leading to improved NER performance compared to models that exploit text data alone or combine image-level visual features.
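For readers who want a concrete picture of the two mechanisms the abstract describes, below is a minimal PyTorch sketch of (1) a gated bilinear attention that fuses object-level visual features with word-level textual features, and (2) adversarial training via a gradient-reversal layer that pushes both modalities toward a shared, modality-invariant space. All module names, dimensions, and the overall wiring are illustrative assumptions for exposition, not the authors' released implementation.

```python
# Minimal sketch of gated bilinear attention + gradient-reversal adversarial
# training, assuming word-level text features and object-level visual features
# are already extracted (e.g., by a sentence encoder and an object detector).
import torch
import torch.nn as nn

class GradientReversal(torch.autograd.Function):
    """Identity on the forward pass; negates gradients on the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class GatedBilinearAttention(nn.Module):
    """Bilinear attention from words to objects, with a sigmoid fusion gate."""
    def __init__(self, d_text, d_vis, d_hid):
        super().__init__()
        self.W = nn.Parameter(torch.randn(d_text, d_vis) * 0.01)  # bilinear weights
        self.proj_v = nn.Linear(d_vis, d_hid)   # project attended visual context
        self.proj_t = nn.Linear(d_text, d_hid)  # project textual features
        self.gate = nn.Linear(d_text + d_hid, d_hid)

    def forward(self, words, objects):
        # words:   (B, T, d_text)  word-level text features
        # objects: (B, O, d_vis)   object-level visual features
        scores = torch.einsum('btd,de,boe->bto', words, self.W, objects)
        attn = scores.softmax(dim=-1)            # each word attends over objects
        vis_ctx = self.proj_v(attn @ objects)    # (B, T, d_hid)
        g = torch.sigmoid(self.gate(torch.cat([words, vis_ctx], dim=-1)))
        return g * self.proj_t(words) + (1 - g) * vis_ctx  # gated fusion

class ModalityDiscriminator(nn.Module):
    """Predicts which modality a shared-space vector came from (0=text, 1=vision)."""
    def __init__(self, d_hid):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d_hid, d_hid), nn.ReLU(), nn.Linear(d_hid, 2))

    def forward(self, h, lambd=1.0):
        # Gradient reversal: the discriminator learns to classify the modality,
        # while reversed gradients train the encoder to make modalities indistinguishable.
        return self.net(GradientReversal.apply(h, lambd))

# Toy usage with hypothetical dimensions: fuse features, compute adversarial loss.
B, T, O = 2, 8, 4
words, objects = torch.randn(B, T, 256), torch.randn(B, O, 512)
fuse = GatedBilinearAttention(d_text=256, d_vis=512, d_hid=256)
disc = ModalityDiscriminator(d_hid=256)
shared_t = fuse(words, objects)                          # text tokens in shared space
logits = disc(shared_t.reshape(-1, 256))
labels = torch.zeros(logits.size(0), dtype=torch.long)   # these vectors came from text
adv_loss = nn.functional.cross_entropy(logits, labels)
adv_loss.backward()  # reversed gradients push the encoder to confuse the discriminator
```

In a complete model, the fused word representations would typically feed a sequence-labeling head (e.g., a CRF layer) whose NER loss is traded off against the adversarial loss; both are common choices in this line of work but are assumptions here rather than details taken from the paper.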

