SOTAVerified

Focus on Local: Finding Reliable Discriminative Regions for Visual Place Recognition

2025-04-14 · Code Available

Changwei Wang, Shunpeng Chen, Yukun Song, Rongtao Xu, Zherui Zhang, Jiguang Zhang, Haoran Yang, Yu Zhang, Kexue Fu, Shide Du, Zhiwei Xu, Longxiang Gao, Li Guo, Shibiao Xu


Abstract

Visual Place Recognition (VPR) aims to predict the location of a query image by referencing a database of geo-tagged images. In VPR, a few discriminative local regions in an image often carry most of the useful information, while mundane background regions contribute little or even cause perceptual aliasing because they are easily confused across places. However, existing methods neither precisely model nor fully exploit these discriminative regions. In this paper, we propose the Focus on Local (FoL) approach, which improves both image retrieval and re-ranking in VPR by mining and exploiting reliable discriminative local regions and by introducing pseudo-correspondence supervision. First, we design two losses, the Extraction-Aggregation Spatial Alignment Loss (SAL) and the Foreground-Background Contrast Enhancement Loss (CEL), to explicitly model reliable discriminative local regions and use them to guide global representation generation and efficient re-ranking. Second, we introduce a weakly supervised local-feature training strategy based on pseudo-correspondences obtained by aggregating global features, which alleviates the lack of ground-truth local correspondences for the VPR task. Third, we propose a re-ranking pipeline that is both efficient and precise, guided by the mined discriminative regions. Finally, experimental results show that FoL achieves state-of-the-art results on multiple VPR benchmarks in both the image retrieval and re-ranking stages, and significantly outperforms existing two-stage VPR methods in computational efficiency. Code and models are available at https://github.com/chenshunpeng/FoL
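The two-stage design described in the abstract (global-descriptor retrieval followed by re-ranking with local features) can be sketched generically. This is an illustrative sketch only, not the authors' implementation: the function names, the mutual-nearest-neighbor scoring, and all variables here are assumptions, and FoL's actual re-ranking is additionally guided by its mined discriminative regions.

```python
import numpy as np

def two_stage_retrieve(q_global, db_global, q_locals, db_locals, top_k=3):
    """Stage 1: rank the database by global-descriptor similarity;
    Stage 2: re-rank the top-k candidates by counting mutual
    nearest-neighbor matches between local-feature sets.
    (Illustrative logic; not the FoL codebase.)"""
    # Stage 1: dot-product similarity (assumes L2-normalized globals)
    sims = db_global @ q_global
    cand = np.argsort(-sims)[:top_k]

    def match_count(a, b):
        # mutual nearest neighbors between two local-feature sets
        s = a @ b.T
        fwd = s.argmax(axis=1)   # best match in b for each row of a
        bwd = s.argmax(axis=0)   # best match in a for each row of b
        return int(np.sum(bwd[fwd] == np.arange(len(a))))

    # Stage 2: reorder the candidates by local-match score
    scores = np.array([match_count(q_locals, db_locals[i]) for i in cand])
    return cand[np.argsort(-scores, kind="stable")]
```

A candidate that ranks second globally can overtake the top global match when it has more consistent local matches, which is exactly the failure mode (perceptual aliasing of similar-looking backgrounds) that re-ranking is meant to correct.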

Tasks

Benchmark Results

| Dataset | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| AmsterTime | FoL | Recall@1 | 70.1 | — | Unverified |
| AmsterTime | FoL-global | Recall@1 | 64.6 | — | Unverified |
| Eynsham | FoL-global | Recall@1 | 91.7 | — | Unverified |
| Eynsham | FoL | Recall@1 | 92.4 | — | Unverified |
| Mapillary test | FoL-global | Recall@1 | 78.7 | — | Unverified |
| Mapillary test | FoL | Recall@1 | 80 | — | Unverified |
| Mapillary val | FoL | Recall@1 | 93.5 | — | Unverified |
| Mapillary val | FoL-global | Recall@1 | 93.1 | — | Unverified |
| Nordland | FoL-global | Recall@1 | 87.8 | — | Unverified |
| Nordland | FoL | Recall@1 | 92.6 | — | Unverified |
| Nordland* (2760 queries) | FoL-global | Recall@1 | 78.3 | — | Unverified |
| Nordland* (2760 queries) | FoL | Recall@1 | 85.5 | — | Unverified |
| Pittsburgh-250k-test | FoL | Recall@1 | 97 | — | Unverified |
| Pittsburgh-250k-test | FoL-global | Recall@1 | 96.5 | — | Unverified |
| Pittsburgh-30k-test | FoL | Recall@1 | 94.5 | — | Unverified |
| Pittsburgh-30k-test | FoL-global | Recall@1 | 93.9 | — | Unverified |
| SF-XL Night | FoL-global | Recall@1 | 53.4 | — | Unverified |
| SF-XL Night | FoL | Recall@1 | 60.5 | — | Unverified |
| SF-XL Occlusion | FoL-global | Recall@1 | 51.3 | — | Unverified |
| SF-XL Occlusion | FoL | Recall@1 | 61.8 | — | Unverified |
| SPED | FoL | Recall@1 | 91.8 | — | Unverified |
| SPED | FoL-global | Recall@1 | 92.1 | — | Unverified |
| St Lucia | FoL | Recall@1 | 99.9 | — | Unverified |
| St Lucia | FoL-global | Recall@1 | 99.9 | — | Unverified |
| SVOX | FoL | Recall@1 | 98.9 | — | Unverified |
| SVOX | FoL-global | Recall@1 | 98.4 | — | Unverified |
| SVOX Night | FoL | Recall@1 | 98.8 | — | Unverified |
| SVOX Night | FoL-global | Recall@1 | 98.3 | — | Unverified |
| SVOX Overcast | FoL | Recall@1 | 98.2 | — | Unverified |
| SVOX Overcast | FoL-global | Recall@1 | 97.9 | — | Unverified |
| SVOX Rain | FoL-global | Recall@1 | 96.5 | — | Unverified |
| SVOX Rain | FoL | Recall@1 | 98.2 | — | Unverified |
| SVOX Snow | FoL-global | Recall@1 | 99.1 | — | Unverified |
| SVOX Snow | FoL | Recall@1 | 99.3 | — | Unverified |
| SVOX Sun | FoL | Recall@1 | 98.8 | — | Unverified |
| SVOX Sun | FoL-global | Recall@1 | 98.1 | — | Unverified |
| Tokyo247 | FoL-global | Recall@1 | 96.2 | — | Unverified |
| Tokyo247 | FoL | Recall@1 | 98.4 | — | Unverified |
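Recall@1, the metric reported above, counts a query as correct when the single top-ranked database image is a true positive (typically, within a distance threshold of the query's ground-truth location). A minimal sketch of the computation; the function name and arguments are hypothetical and this is not the FoL evaluation code:

```python
import numpy as np

def recall_at_k(query_descs, db_descs, positives, k=1):
    """Fraction of queries whose top-k retrieved database images
    include at least one true positive.
    positives: mapping from query index to a list of positive
    database indices (hypothetical interface)."""
    # cosine similarity via L2-normalized descriptors
    q = query_descs / np.linalg.norm(query_descs, axis=1, keepdims=True)
    d = db_descs / np.linalg.norm(db_descs, axis=1, keepdims=True)
    sims = q @ d.T
    topk = np.argsort(-sims, axis=1)[:, :k]
    hits = sum(len(set(topk[i]) & set(positives[i])) > 0
               for i in range(len(query_descs)))
    return hits / len(query_descs)
```

For the two-stage results in the table, the same metric is applied after re-ranking: the FoL rows score the reordered candidate list, while the FoL-global rows score the retrieval stage alone.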

Reproductions