SOTAVerified

Image-text Retrieval

Papers

Showing 71–80 of 248 papers

Title | Status | Hype
Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning | Code | 1
MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model | Code | 1
Mr. Right: Multimodal Retrieval on Representation of ImaGe witH Text | Code | 1
FETA: Towards Specializing Foundation Models for Expert Task Applications | Code | 1
Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment | Code | 1
MixGen: A New Multi-Modal Data Augmentation | Code | 1
Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone | Code | 1
Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training | Code | 1
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections | Code | 1
CCMB: A Large-scale Chinese Cross-modal Benchmark | Code | 1
Page 8 of 25

No leaderboard results yet.