
A Lightweight and Effective Image Tampering Localization Network with Vision Mamba

2025-02-14

Kun Guo, Gang Cao, Zijie Lou, Xianglin Huang, Jiaoyun Liu

Abstract

Current image tampering localization methods primarily rely on Convolutional Neural Networks (CNNs) and Transformers. CNNs are constrained by limited local receptive fields, whereas Transformers offer global context modeling at the expense of quadratic computational complexity. Recently, the state space model Mamba has emerged as a competitive alternative, enabling global dependency modeling with linear complexity. Inspired by it, we propose a lightweight and effective FORensic network based on vision MAmba (ForMa) for blind image tampering localization. First, ForMa captures multi-scale global features, achieving efficient global dependency modeling with linear complexity. Then, a lightweight decoder generates the pixel-wise localization map, employing a parameter-free pixel shuffle layer for upsampling. Additionally, a noise-assisted decoding strategy is proposed to integrate complementary manipulation traces from tampered images, boosting decoder sensitivity to forgery cues. Experimental results on 10 standard datasets demonstrate that ForMa achieves state-of-the-art generalization ability and robustness while maintaining the lowest computational complexity. Code is available at https://github.com/multimediaFor/ForMa.
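The parameter-free pixel shuffle upsampling mentioned in the abstract can be illustrated with a minimal sketch. This is not the ForMa decoder itself: the function name `pixel_shuffle`, the nested-list tensor layout, and the toy input are assumptions chosen for illustration, and the channel ordering follows the common sub-pixel convention. The point it shows is that the operation only rearranges values from the channel dimension into space, so it introduces no learnable weights.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Illustrative only: values are moved, never multiplied by weights,
    which is why the layer adds no parameters.
    """
    channels_in = len(x)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels_in // (r * r)
    h, w = len(x[0]), len(x[0][0])
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):          # row offset inside each r x r block
            for j in range(r):      # column offset inside each r x r block
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out


# Toy input (hypothetical): 4 channels of size 1x2, upscale factor 2.
feat = [[[1, 2]], [[3, 4]], [[5, 6]], [[7, 8]]]
up = pixel_shuffle(feat, 2)
print(up)  # one channel of size 2x4
```

In a deep learning framework the same rearrangement would normally come from a built-in layer (for example `torch.nn.PixelShuffle` in PyTorch) rather than explicit loops; the loops here only make the index mapping visible.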
