SOTAVerified

All Tasks

4,818 tasks

Task | Area | Papers | Results
Click-Through Rate Prediction

Click-through rate prediction is the task of predicting the …

Recommendation & Retrieval | 391 | 127
3D Object Tracking

3D Object Tracking is a computer vision task dedicated to mo…

Computer Vision | 67 | 127
Video Prediction

Script for Amee Marketing & Trading Company Short Video (Dur…

Generative Models | 394 | 126
Zero-Shot Video Retrieval

Zero-shot video retrieval is the task of retrieving relevant…

Multimodal & Vision-Language | 40 | 126
Temporal Action Localization

Temporal Action Localization aims to detect activities in th…

Time Series & Forecasting | 1,477 | 125
Open Vocabulary Semantic Segmentation
Computer Vision | 113 | 124
Motion Synthesis

Creating a video where people in the images move (such as bl…

Generative Models | 282 | 123
Metric Learning

The goal of Metric Learning is to learn a representation fun…

Foundations & Efficiency | 1,648 | 122
Speech Enhancement

Speech Enhancement is a signal processing task that involves…

Audio & Speech | 982 | 122
Data-to-Text Generation

A classic problem in natural-language generation (NLG) invol…

Language & Reasoning | 219 | 122
Action Segmentation

Action Segmentation is a challenging problem in high-level v…

Computer Vision | 219 | 120
Image Dehazing

( Image credit: [Densely Connected Pyramid Dehazing Network]…

Generative Models | 295 | 117
Continuous Control

Continuous control in the context of playing games, especial…

Reinforcement Learning & Robotics | 1,161 | 116
Incremental Learning

Incremental learning aims to develop artificially intelligen…

Foundations & Efficiency | 1,371 | 112
Cross-Modal Retrieval

Cross-Modal Retrieval (CMR) is a task of retrieving items ac…

Multimodal & Vision-Language | 522 | 111
Few-Shot Object Detection

Few-Shot Object Detection is a computer vision task that inv…

Computer Vision | 179 | 111
Image Denoising

Image Denoising is a computer vision task that involves remo…

Generative Models | 1,220 | 110
Slot Filling

The goal of Slot Filling is to identify from a running dialo…

Language & Reasoning | 458 | 110
Semi-Supervised Object Detection

Semi-supervised object detection uses both labeled data and …

Computer Vision | 115 | 110
Open-Domain Question Answering

Open-domain question answering is the task of question answe…

Language & Reasoning | 494 | 109
Age Estimation

Age Estimation is the task of estimating the age of a person…

Computer Vision | 254 | 109
Open Information Extraction

In natural language processing, open information extraction …

Language & Reasoning | 207 | 108
Weakly Supervised Action Localization

In this task, the training data consists of videos with a li…

Computer Vision | 55 | 107
Visual Dialog

Visual Dialog requires an AI agent to hold a meaningful dial…

Multimodal & Vision-Language | 118 | 106
Image Manipulation Detection

The task of detecting images or image parts that have been t…

Generative Models | 73 | 106
Novel View Synthesis

Synthesize a target image with an arbitrary target camera po…

Generative Models | 1,441 | 103
Human-Object Interaction Detection

Human-Object Interaction (HOI) detection is a task of identi…

Computer Vision | 449 | 103
Zero-Shot Action Recognition
Computer Vision | 83 | 103
3D Hand Pose Estimation

Image: [Zimmerman et al.](https://arxiv.org/pdf/1705.01389v3.…

Computer Vision | 178 | 102
Sign Language Recognition

Sign Language Recognition is a computer vision and natural l…

Multimodal & Vision-Language | 297 | 101
SMAC

Benchmarks for Efficient Exploration of Completion of Multi-s…

Reinforcement Learning & Robotics | 121 | 101
Abstractive Text Summarization

Abstractive Text Summarization is the task of generating a s…

Language & Reasoning | 846 | 99
Entity Alignment

Entity Alignment is the task of finding entities in two know…

Language & Reasoning | 190 | 97
Unsupervised Semantic Segmentation

Models that learn to segment each image (i.e. assign a class…

Computer Vision | 95 | 97
Drug Discovery

Drug discovery is the task of applying machine learning to d…

Medical & Scientific | 1,337 | 96
Video Object Segmentation

Video object segmentation is a binary labeling problem aimin…

Computer Vision | 551 | 96
Zero-Shot Transfer Image Classification
Computer Vision | 19 | 95
3D Face Reconstruction

3D Face Reconstruction is a computer vision task that involv…

Generative Models | 211 | 94
Multi-Person Pose Estimation

Multi-person pose estimation is the task of estimating the p…

Computer Vision | 151 | 94
Lipreading

Lipreading is a process of extracting speech by watching lip…

Multimodal & Vision-Language | 103 | 94
Pedestrian Detection

Pedestrian detection is the task of detecting pedestrians fr…

Computer Vision | 438 | 92
Reading Comprehension

Most current question answering datasets frame the task as r…

Language & Reasoning | 1,760 | 91
Knowledge Distillation

Knowledge distillation is the process of transferring knowle…

Foundations & Efficiency | 4,240 | 90
Conversational Response Selection

Conversational response selection refers to the task of iden…

Language & Reasoning | 46 | 90
Sentence Completion
Language & Reasoning | 91 | 89
Face Recognition

Facial Recognition is the task of making a positive identifi…

Computer Vision | 2,329 | 88
Image Manipulation Localization

The task of segmenting parts of images or image parts that h…

Generative Models | 31 | 87
Video Captioning

Video Captioning is the task of automatically captioning a video b…

Multimodal & Vision-Language | 473 | 86
Text-To-SQL

Text-to-SQL is a task in natural language processing (NLP) w…

Language & Reasoning | 424 | 86
Zero-Shot Learning

Zero-shot learning (ZSL) is a model's ability to detect clas…

Foundations & Efficiency | 1,864 | 84