SOTAVerified

Weakly Supervised Text-Based Person Re-Identification

2021-01-01ICCV 2021Code Available0· sign in to hype

Shizhen Zhao, Changxin Gao, Yuanjie Shao, Wei-Shi Zheng, Nong Sang

Code Available — Be the first to reproduce this paper.

Reproduce

Code

Abstract

The conventional text-based person re-identification methods heavily rely on identity annotations. However, this labeling process is costly and time-consuming. In this paper, we consider a more practical setting called weakly supervised text-based person re-identification, where only the text-image pairs are available without the requirement of annotating identities during the training phase. To this end, we propose a Cross-Modal Mutual Training (CMMT) framework. Specifically, to alleviate the intra-class variations, a clustering method is utilized to generate pseudo labels for both visual and textual instances. To further refine the clustering results, CMMT provides a Mutual Pseudo Label Refinement module, which leverages the clustering results in one modality to refine that in the other modality constrained by the text-image pairwise relationship. Meanwhile, CMMT introduces a Text-IoU Guided Cross-Modal Projection Matching loss to resolve the cross-modal matching ambiguity problem. A Text-IoU Guided Hard Sample Mining method is also proposed for learning discriminative textual-visual joint embeddings. We conduct extensive experiments to demonstrate the effectiveness of the proposed CMMT, and the results show that CMMT performs favorably against existing text-based person re-identification methods.

Tasks

Reproductions