SOTAVerified
All Tasks

4,818 tasks

Task | Area | Papers | Results
Video Summarization

Video Summarization aims to generate a short synopsis that s…

Multimodal & Vision-Language | 280 papers | 18 results
Medical Image Registration

Image registration, also known as image fusion or image matc…

Medical & Scientific | 198 papers | 18 results
Sentence Ordering

Sentence ordering task deals with finding the correct order …

Language & Reasoning | 50 papers | 18 results
Crop Classification

Foundations & Efficiency | 48 papers | 18 results
inverse tone mapping

For stack-based inverse tone mapping

Generative Models | 35 papers | 18 results
Music Modeling

( Image credit: [R-Transformer](https://arxiv.org/pdf/1907.0…

Audio & Speech | 34 papers | 18 results
Natural Language Visual Grounding

Multimodal & Vision-Language | 32 papers | 18 results
Image Matching

Computer Vision | 18 papers | 18 results
Human Instance Segmentation

Instance segmentation is the task of detecting and delineati…

Computer Vision | 16 papers | 18 results
Video-based Generative Performance Benchmarking (Correctness of Information)

The benchmark evaluates a generative Video Conversational Mo…

Generative Models | 15 papers | 18 results
Arabic Text Diacritization

Addition of diacritics for undiacritized Arabic texts for wo…

Language & Reasoning | 13 papers | 18 results
Human action generation

Yan et al. (2019) CSGN: "When the dancer is stepping, jumpin…

Generative Models | 13 papers | 18 results
Colorectal Gland Segmentation:

Medical & Scientific | 9 papers | 18 results
Meter Reading

Computer Vision | 9 papers | 18 results
Text-based Person Retrieval with Noisy Correspondence

This is a benchmark about text-based person retrieval with n…

Multimodal & Vision-Language | 6 papers | 18 results
Atomic number classification

Predict the atomic number of a node in a molecular/material/…

Foundations & Efficiency | 1 paper | 18 results
Contrastive Learning

Contrastive Learning is a deep learning technique for unsupe…

Foundations & Efficiency | 6,661 papers | 17 results
Information Retrieval

Information retrieval is the task of ranking a list of docum…

Recommendation & Retrieval | 4,740 papers | 17 results
Image Restoration

Image Restoration is a family of inverse problems for obtain…

Generative Models | 1,459 papers | 17 results
Community Detection

Community Detection is one of the fundamental problems in ne…

Graphs & Structured Data | 919 papers | 17 results
Knowledge Graph Completion

Knowledge graphs $G$ are represented as a collection of trip…

Graphs & Structured Data | 482 papers | 17 results
Scene Generation

make to t shirt an Ad with a little bit of action

Generative Models | 309 papers | 17 results
Change Point Detection

Change Point Detection is concerned with the accurate detect…

Computer Vision | 285 papers | 17 results
Keyphrase Extraction

A classic task to extract salient phrases that best summariz…

Language & Reasoning | 153 papers | 17 results
Zero Shot Segmentation

Computer Vision | 134 papers | 17 results
Point Cloud Generation

Generative Models | 117 papers | 17 results
Speech-to-Speech Translation

Speech-to-speech translation (S2ST) consists of translating …

Audio & Speech | 117 papers | 17 results
Sketch-Based Image Retrieval

Multimodal & Vision-Language | 110 papers | 17 results
Multimodal Machine Translation

Multimodal machine translation is the task of doing machine …

Multimodal & Vision-Language | 108 papers | 17 results
Stereo Depth Estimation

Computer Vision | 97 papers | 17 results
Single-Source Domain Generalization

In this task a model is trained in a single source domain an…

Computer Vision | 48 papers | 17 results
Nested Mention Recognition

Nested mention recognition is the task of correctly modeling…

Computer Vision | 11 papers | 17 results
Fine-Grained Urban Flow Inference

Fine-grained urban flow inference (FUFI) aims to infer the f…

Time Series & Forecasting | 5 papers | 17 results
Conversational Web Navigation

The problem of conversational web navigation is described as…

Reinforcement Learning & Robotics | 3 papers | 17 results
Multi-Task Learning

Multi-task learning aims to learn multiple different tasks s…

Foundations & Efficiency | 3,687 papers | 16 results
Human Activity Recognition

Classify various human activities

Computer Vision | 744 papers | 16 results
Sentence Classification

Language & Reasoning | 303 papers | 16 results
Pose Tracking

Pose Tracking is the task of estimating multi-person human p…

Computer Vision | 191 papers | 16 results
Mortality Prediction

( Image credit: [Early hospital mortality prediction using v…

Medical & Scientific | 189 papers | 16 results
Physical Simulations

Medical & Scientific | 100 papers | 16 results
Anomaly Classification

Anomaly Classification is the task of identifying and catego…

Foundations & Efficiency | 72 papers | 16 results
Surgical phase recognition

The first 40 videos are used for training, the last 40 video…

Medical & Scientific | 69 papers | 16 results
Temporal Sentence Grounding

Temporal sentence grounding (TSG) aims to locate a specific …

Time Series & Forecasting | 43 papers | 16 results
Distance regression

Prediction of the distance between connected nodes in molecu…

Foundations & Efficiency | 19 papers | 16 results
Graph Ranking

Graphs & Structured Data | 18 papers | 16 results
Breast Tumour Classification

Medical & Scientific | 13 papers | 16 results
Generalized Few-Shot Learning

Medical & Scientific | 13 papers | 16 results
Factual Inconsistency Detection in Chart Captioning

Detect factual inconsistency between charts and captions.

Computer Vision | 4 papers | 16 results
regression

Foundations & Efficiency | 9,424 papers | 15 results
Denoising

Denoising is a task in image processing and computer vision …

Generative Models | 7,282 papers | 15 results