Video Retrieval

The objective of video retrieval is as follows: given a text query and a pool of candidate videos, select the video that corresponds to the text query. Typically, the candidates are returned as a ranked list and the ranking is scored with document retrieval metrics such as recall at rank K (R@K).
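As a rough illustration of how such a ranked list might be scored, the sketch below computes recall at rank K from a query-by-video similarity matrix. It assumes, purely for illustration, that each query has exactly one correct video and that query `i` matches video `i`. The function name `recall_at_k` and the toy scores are hypothetical and not taken from any listed paper.

```python
import numpy as np

def recall_at_k(similarity, k):
    """Percentage of queries whose correct video appears in the top k.

    similarity[i, j] is the score of video j for text query i.
    Assumption: the correct video for query i is video i.
    """
    sim = np.asarray(similarity, dtype=float)
    # Score of the correct video for each query, kept as a column for broadcasting.
    correct = np.diag(sim)[:, None]
    # Zero-based rank: how many videos score strictly higher than the correct one.
    ranks = (sim > correct).sum(axis=1)
    return float((ranks < k).mean() * 100.0)

# Toy example: 3 text queries scored against 3 candidate videos.
sim = [
    [0.9, 0.1, 0.2],  # query 0: correct video ranked first
    [0.8, 0.3, 0.5],  # query 1: correct video ranked third
    [0.2, 0.7, 0.6],  # query 2: correct video ranked second
]
for k in (1, 2, 3):
    print(f"R@{k} = {recall_at_k(sim, k):.1f}")
```

Ties are resolved in favour of the correct video here, which is one of several possible conventions; published benchmarks may handle ties and multiple captions per video differently.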

Papers



Title	Date	Tasks	Status	Hype
Video retrieval based on deep convolutional neural network	Dec 1, 2017	Retrieval, Triplet	Unverified	0
Unsupervised Segmentation of Action Segments in Egocentric Videos using Gaze	Sep 30, 2017	Activity Recognition, Retrieval	Unverified	0
Learning from Video and Text via Large-Scale Discriminative Clustering	Jul 27, 2017	Action Recognition, Clustering	Code Available	0
An Improved Video Analysis using Context based Extension of LSH	May 10, 2017	Action Recognition, Retrieval	Unverified	0
Unified Embedding and Metric Learning for Zero-Exemplar Event Detection	May 5, 2017	Event Detection, Metric Learning	Unverified	0
Dense-Captioning Events in Videos	May 2, 2017	Dense Captioning, Retrieval	Code Available	1
Efficient Action Detection in Untrimmed Videos via Multi-Task Learning	Dec 22, 2016	Action Detection, Action Localization	Unverified	0
Binary Subspace Coding for Query-by-Image Video Retrieval	Dec 6, 2016	Retrieval, Video Retrieval	Unverified	0
Real-time analysis of cataract surgery videos using statistical models	Oct 18, 2016	Retrieval, Video Retrieval	Unverified	0
End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering	Oct 10, 2016	Language Modeling, Language Modelling	Unverified	0
Learning Language-Visual Embedding for Movie Understanding with Natural-Language	Sep 26, 2016	Multiple-choice, Retrieval	Unverified	0
Sharing Hash Codes for Multiple Purposes	Sep 11, 2016	Retrieval, Video Retrieval	Unverified	0
Learning Joint Representations of Videos and Sentences with Web Image Search	Aug 8, 2016	Image Retrieval, Natural Language Queries	Unverified	0
Large-Scale Query-by-Image Video Retrieval Using Bloom Filters	Jul 12, 2016	Retrieval, Video Retrieval	Unverified	0
De-Hashing: Server-Side Context-Aware Feature Reconstruction for Mobile Visual Search	Jun 29, 2016	Retrieval, Video Retrieval	Unverified	0
Strategies for Searching Video Content with Text Queries or Video Examples	Jun 17, 2016	Event Detection, Reranking	Unverified	0
Ego-Surfing: Person Localization in First-Person Videos Using Ego-Motion Signatures	Jun 15, 2016	Clustering, Retrieval	Unverified	0
Deep Learning Based Semantic Video Indexing and Retrieval	Jan 28, 2016	Deep Learning, Retrieval	Unverified	0
VRFP: On-the-fly Video Retrieval using Web Images and Fast Fisher Vector Products	Dec 10, 2015	Re-Ranking, Retrieval	Unverified	0
Semantic Video Entity Linking Based on Visual Content and Metadata	Dec 1, 2015	Entity Linking, Metric Learning	Unverified	0
Multimodal Skip-gram Using Convolutional Pseudowords	Nov 12, 2015	Object Recognition, Retrieval	Unverified	0
Circulant temporal encoding for video retrieval and temporal alignment	Jun 8, 2015	Retrieval, Video Retrieval	Code Available	0
Face Video Retrieval With Image Query via Hashing Across Euclidean Space and Riemannian Manifold	Jun 1, 2015	Retrieval, Video Retrieval	Unverified	0
Bag of Genres for Video Retrieval	May 30, 2015	Retrieval, Video Retrieval	Unverified	0
Visual Information Retrieval in Endoscopic Video Archives	Apr 29, 2015	Information Retrieval, Retrieval	Unverified	0
Discrete Wavelet Transform and Gradient Difference based approach for text localization in videos	Feb 24, 2015	Retrieval, Text Detection	Unverified	0
Advances in Human Action Recognition: A Survey	Jan 23, 2015	Action Recognition, Retrieval	Unverified	0
A Faster Method for Tracking and Scoring Videos Corresponding to Sentences	Nov 14, 2014	Retrieval, Sentence	Unverified	0
Analysis of Gait Pattern to Recognize the Human Activities	Jul 18, 2014	Activity Recognition, Human Activity Recognition	Unverified	0
Visual Semantic Search: Retrieving Videos via Complex Textual Queries	Jun 1, 2014	Autonomous Driving, Natural Language Queries	Unverified	0
KPCA Spatio-temporal trajectory point cloud classifier for recognizing human actions in a CBVR system	Mar 26, 2014	Action Recognition, Retrieval	Unverified	0
Classroom Video Assessment and Retrieval via Multiple Instance Learning	Mar 25, 2014	Multiple Instance Learning, Retrieval	Unverified	0
System Analysis And Design For Multimedia Retrieval Systems	Dec 31, 2013	Retrieval, Video Retrieval	Unverified	0
Multimodal Approach for Video Surveillance Indexing and Retrieval	Aug 6, 2013	Retrieval, Video Retrieval	Unverified	0
Learning Locally-Adaptive Decision Functions for Person Verification	Jun 1, 2013	Face Verification, Metric Learning	Unverified	0
Two-person interaction detection using body-pose features and multiple instance learning	Jul 16, 2012	Activity Recognition, Human Activity Recognition	Unverified	0



Datasets: MSR-VTT-1kA, DiDeMo, MSR-VTT, LSMDC, ActivityNet, MSVD, YouCook2, FIVR-200K, VATEX, QuerYD, SSv2-label retrieval, SSv2-template retrieval

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	OmniVec	text-to-video R@10	89.4	—	Unverified
2	CLIP4Clip	text-to-video R@10	81.6	—	Unverified
3	OmniVec (pretrained)	text-to-video R@10	78.6	—	Unverified
4	HunYuan_tvr (huge)	text-to-video R@1	62.9	—	Unverified
5	CLIP-ViP	text-to-video R@1	57.7	—	Unverified
6	PIDRo	text-to-video R@1	55.9	—	Unverified
7	DMAE (ViT-B/16)	text-to-video R@1	55.5	—	Unverified
8	HunYuan_tvr	text-to-video R@1	55	—	Unverified
9	MuLTI	text-to-video R@1	54.7	—	Unverified
10	EERCF	text-to-video R@1	54.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Aurora (ours, r=64)	text-to-video R@5	77.4	—	Unverified
2	InternVideo2-6B	text-to-video R@1	74.2	—	Unverified
3	vid-TLDR (UMT-L)	text-to-video R@1	72.3	—	Unverified
4	VAST	text-to-video R@1	72	—	Unverified
5	COSA	text-to-video R@1	70.5	—	Unverified
6	UMT-L (ViT-L/16)	text-to-video R@1	70.4	—	Unverified
7	GRAM	text-to-video R@1	67.3	—	Unverified
8	VALOR	text-to-video R@1	61.5	—	Unverified
9	TESTA (ViT-B/16)	text-to-video R@1	61.2	—	Unverified
10	VindLU	text-to-video R@1	61.2	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	GRAM	text-to-video R@1	64	—	Unverified
2	VAST	text-to-video R@1	63.9	—	Unverified
3	InternVideo2-6B	text-to-video R@1	62.8	—	Unverified
4	VALOR	text-to-video R@1	59.9	—	Unverified
5	UMT-L (ViT-L/16)	text-to-video R@1	58.8	—	Unverified
6	vid-TLDR (UMT-L)	text-to-video R@1	58.1	—	Unverified
7	COSA	text-to-video R@1	57.9	—	Unverified
8	InternVideo2-6B	text-to-video R@1	55.9	—	Unverified
9	InternVideo	text-to-video R@1	55.2	—	Unverified
10	VLAB	text-to-video R@1	55.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	EMCL-Net (Ours)++ LSMDC Rohrbach et al. (2015)	text-to-video R@10	53.7	—	Unverified
2	InternVideo2-6B	text-to-video R@1	46.4	—	Unverified
3	vid-TLDR (UMT-L)	text-to-video R@1	43.1	—	Unverified
4	UMT-L (ViT-L/16)	text-to-video R@1	43	—	Unverified
5	HunYuan_tvr (huge)	text-to-video R@1	40.4	—	Unverified
6	COSA	text-to-video R@1	39.4	—	Unverified
7	mPLUG-2	text-to-video R@1	34.4	—	Unverified
8	VALOR	text-to-video R@1	34.2	—	Unverified
9	InternVideo	text-to-video R@1	34	—	Unverified
10	InternVideo2-6B	text-to-video R@1	33.8	—	Unverified