UCL Discovery

D³Net: Dual-Branch Disturbance Disentangling Network for Facial Expression Recognition

Mo, R; Yan, Y; Xue, J-H; Chen, S; Wang, H; (2021) D³Net: Dual-Branch Disturbance Disentangling Network for Facial Expression Recognition. In: MM '21: Proceedings of the 29th ACM International Conference on Multimedia. (pp. 779-787). Association for Computing Machinery (ACM): New York, NY, USA. Green open access

RongyunMo-ACMMM2021-final.pdf - Accepted Version (971kB)

Abstract

One of the main challenges in facial expression recognition (FER) is to address the disturbance caused by various disturbing factors, including common ones (such as identity, pose, and illumination) and potential ones (such as hairstyle, accessory, and occlusion). Recently, a number of FER methods have been developed to explicitly or implicitly alleviate the disturbance involved in facial images. However, these methods either consider only a few common disturbing factors or neglect the prior information of these disturbing factors, thus resulting in inferior recognition performance. In this paper, we propose a novel Dual-branch Disturbance Disentangling Network (D³Net), mainly consisting of an expression branch and a disturbance branch, to perform effective FER. In the disturbance branch, a label-aware sub-branch (LAS) and a label-free sub-branch (LFS) are elaborately designed to cope with different types of disturbing factors. On the one hand, LAS explicitly captures the disturbance due to some common disturbing factors by transfer learning on a pretrained model. On the other hand, LFS implicitly encodes the information of potential disturbing factors in an unsupervised manner. In particular, we introduce an Indian buffet process (IBP) prior to model the distribution of potential disturbing factors in LFS. Moreover, we leverage adversarial training to increase the differences between disturbance features and expression features, thereby enhancing the disentanglement of disturbing factors. By disentangling the disturbance from facial images, we are able to extract discriminative expression features. Extensive experiments demonstrate that our proposed method performs favorably against several state-of-the-art FER methods on both in-the-lab and in-the-wild databases.
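
To make the architecture described in the abstract concrete, the sketch below is a minimal, hypothetical PyTorch-style rendering of the dual-branch idea: a shared backbone feeding an expression branch and a disturbance branch, a label-aware head supervised by auxiliary labels (standing in for LAS), and an adversarial expression classifier with gradient reversal that discourages the disturbance features from carrying expression cues. The label-free sub-branch and its IBP prior are omitted, and all module names, dimensions, and loss weights are illustrative assumptions rather than the authors' implementation.

# Hypothetical sketch (not the authors' code): dual-branch disentanglement with
# a shared backbone, an expression branch, a supervised disturbance head, and an
# adversarial expression classifier applied to the disturbance features.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; scaled, negated gradient in the backward pass."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None


class DualBranchSketch(nn.Module):
    def __init__(self, feat_dim=256, num_expr=7, num_aux=10):
        super().__init__()
        # Shared backbone (placeholder for a pretrained CNN).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        self.expr_enc = nn.Linear(feat_dim, feat_dim)  # expression branch
        self.dist_enc = nn.Linear(feat_dim, feat_dim)  # disturbance branch (LAS/LFS collapsed)
        self.expr_cls = nn.Linear(feat_dim, num_expr)  # expression classifier
        self.aux_cls = nn.Linear(feat_dim, num_aux)    # label-aware head (e.g. identity/pose labels)
        self.adv_cls = nn.Linear(feat_dim, num_expr)   # adversarial expression classifier

    def forward(self, x):
        h = self.backbone(x)
        return self.expr_enc(h), self.dist_enc(h)


def training_step(model, images, expr_labels, aux_labels, lambd=0.1):
    f_expr, f_dist = model(images)
    # Supervised objectives for the two branches.
    loss_expr = F.cross_entropy(model.expr_cls(f_expr), expr_labels)
    loss_aux = F.cross_entropy(model.aux_cls(f_dist), aux_labels)
    # Adversarial term: the classifier tries to read expressions from the
    # disturbance features, while the reversed gradient trains the encoder to
    # remove expression cues from them, pushing the two feature sets apart.
    adv_logits = model.adv_cls(GradReverse.apply(f_dist, lambd))
    loss_adv = F.cross_entropy(adv_logits, expr_labels)
    return loss_expr + loss_aux + loss_adv


# Toy usage with random data (7 expression classes, 10 auxiliary labels).
model = DualBranchSketch()
loss = training_step(
    model,
    torch.randn(4, 3, 64, 64),
    torch.randint(0, 7, (4,)),
    torch.randint(0, 10, (4,)),
)
loss.backward()

The gradient-reversal construction is one standard way to realise the adversarial objective the abstract describes; in the actual method the label-free sub-branch would additionally be regularised by the IBP prior rather than left as a plain projection.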

Type: Proceedings paper
Title: D³Net: Dual-Branch Disturbance Disentangling Network for Facial Expression Recognition
Event: MM '21: ACM Multimedia Conference
Open access status: An open access version is available from UCL Discovery
DOI: 10.1145/3474085.3475249
Publisher version: https://doi.org/10.1145/3474085.3475249
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences > Dept of Statistical Science
URI: https://discovery.ucl.ac.uk/id/eprint/10136962
Downloads since deposit: 195
