SOTAVerified

All Tasks

4,818 tasks

Task | Area | Papers | Results
Class Incremental Learning | Foundations & Efficiency | 634 | 15
MRI Reconstruction

In its most basic form, MRI reconstruction consists in retri…

Medical & Scientific | 441 | 15
Saliency Detection

Saliency Detection is a preprocessing step in computer visio…

Computer Vision | 364 | 15
Scene Segmentation

Scene segmentation is the task of splitting a scene into its…

Computer Vision | 283 | 15
Speaker Identification | Audio & Speech | 248 | 15
Explanation Generation | Language & Reasoning | 235 | 15
COVID-19 Diagnosis

COVID-19 Diagnosis is the task of diagnosing the presence of…

Medical & Scientific | 211 | 15
Keyword Extraction

Keyword extraction is tasked with the automatic identificati…

Language & Reasoning | 172 | 15
Point Tracking

Point Tracking, often referred to as Tracking any Point (TAP…

Computer Vision | 151 | 15
Code Search

The goal of Code Search is to retrieve code fragments from a…

Language & Reasoning | 125 | 15
Talking Head Generation

Talking head generation is the task of generating a talking …

Generative Models | 119 | 15
Protein Function Prediction

For GO terms prediction, given the specific function predict…

Medical & Scientific | 79 | 15
Weakly-supervised instance segmentation | Computer Vision | 39 | 15
Table-based Fact Verification

Verifying facts given semi-structured data.

Language & Reasoning | 26 | 15
Beat Tracking

Determine the positions of all beats in a music recording.

Audio & Speech | 19 | 15
Dense Pixel Correspondence Estimation | Computer Vision | 17 | 15
Ad-hoc video search

The Ad-hoc search task ended a 3 year cycle from 2016-2018 w…

Multimodal & Vision-Language | 13 | 15
Image Retrieval with Multi-Modal Query

The problem of retrieving images from a database based on a …

Multimodal & Vision-Language | 10 | 15
GPS Embeddings

GPS Embeddings is the collective name for a set of feature-l…

Foundations & Efficiency | 1 | 15
Time Series Analysis

Time Series Analysis is a statistical technique used to anal…

Time Series & Forecasting | 6,748 | 14
Binary Classification | Foundations & Efficiency | 2,574 | 14
Model Compression

Model Compression is an actively pursued area of research ov…

Foundations & Efficiency | 1,356 | 14
3D Pose Estimation

Image credit: [GSNet: Joint Vehicle Pose and Shape Reconstru…

Computer Vision | 379 | 14
Task-Oriented Dialogue Systems

Achieving a pre-defined task through a dialog.

Language & Reasoning | 308 | 14
3D Object Classification

3D Object Classification is the task of predicting the class…

Computer Vision | 93 | 14
Multi-target Domain Adaptation

The idea of Multi-target Domain Adaptation is to adapt a mod…

Foundations & Efficiency | 39 | 14
Interpretability Techniques for Deep Learning | Foundations & Efficiency | 25 | 14
Audio Super-Resolution

Audio super-resolution, especially speech, refers to the pro…

Audio & Speech | 22 | 14
3D Multi-Person Pose Estimation (absolute)

This task aims to solve absolute 3D multi-person pose Estima…

Computer Vision | 20 | 14
Font Recognition

Font recognition (also called visual font recognition or opt…

Computer Vision | 19 | 14
Semi-Supervised Instance Segmentation | Computer Vision | 19 | 14
MMR total

Sum of all scores of the 11 distinct tasks involving texts, …

Foundations & Efficiency | 12 | 14
Story Continuation

The task involves providing an initial scene that can be obt…

Language & Reasoning | 10 | 14
Image/Document Clustering | Multimodal & Vision-Language | 8 | 14
Horizon Line Estimation | Computer Vision | 7 | 14
Core set discovery

A core set in machine learning is defined as the minimal set…

Foundations & Efficiency | 1 | 14
Meta-Learning

Meta-learning is a methodology concerned with "learning to …

Foundations & Efficiency | 3,569 | 13
Outlier Detection

Outlier Detection is the task of identifying a subset of a giv…

Computer Vision | 703 | 13
Event Extraction

Determine the extent of the events in a text. Other names: E…

Computer Vision | 446 | 13
Multimodal Reasoning

Reasoning over multimodal inputs.

Multimodal & Vision-Language | 302 | 13
Saliency Prediction

A saliency map is a model that predicts eye fixations on a v…

Foundations & Efficiency | 268 | 13
ECG Classification | Medical & Scientific | 116 | 13
Aspect Extraction

Aspect extraction is the task of identifying and extracting …

Computer Vision | 92 | 13
3D Action Recognition

Image: [Rahmani et al](https://www.cv-foundation.org/openacc…

Computer Vision | 91 | 13
Instrument Recognition | Computer Vision | 39 | 13
Video deraining | Generative Models | 30 | 13
Binary text classification | Language & Reasoning | 20 | 13
Grounded Situation Recognition

Grounded Situation Recognition aims to produce the structure…

Computer Vision | 15 | 13
Situation Recognition

Situation Recognition aims to produce the structured image s…

Computer Vision | 12 | 13
Downbeat Tracking

Determine the positions of all downbeats in a music recordin…

Audio & Speech | 11 | 13