| Task | Description | Area | Papers | Results |
|---|---|---|---|---|
| Document Summarization | Automatic Document Summarization is the task of rewriting a … | Language & Reasoning | 760 | 50 |
| Unsupervised Anomaly Detection | The objective of Unsupervised Anomaly Detection is to detect… | Time Series & Forecasting | 506 | 50 |
| Multi-Label Text Classification | According to Wikipedia, "In machine learning, multi-label cla… | Language & Reasoning | 171 | 50 |
| Text Spotting | Scene Text Spotting is the combination of Scene Text Detecti… | Language & Reasoning | 112 | 50 |
| Medical Code Prediction | Prediction of medical codes from clinical notes is … | Medical & Scientific | 27 | 50 |
| Speaker Verification | Speaker verification is the task of verifying the identity of a pers… | Audio & Speech | 746 | 49 |
| Semantic correspondence | The task of semantic correspondence aims to establish reliab… | Computer Vision | 175 | 49 |
| Document Image Classification | Document image classification is the task of classifying doc… | Multimodal & Vision-Language | 50 | 49 |
| Part-Of-Speech Tagging | Part-of-speech tagging (POS tagging) is the task of tagging … | Language & Reasoning | 990 | 48 |
| Image Inpainting | | Generative Models | 708 | 48 |
| Hate Speech Detection | Hate speech detection is the task of detecting if communicat… | Language & Reasoning | 507 | 48 |
| Speech Emotion Recognition | Speech Emotion Recognition is a task of speech processing an… | Audio & Speech | 431 | 48 |
| 2D Human Pose Estimation | Human pose estimation is the … | Computer Vision | 118 | 47 |
| Salient Object Detection | | Computer Vision | 527 | 46 |
| Handwritten Text Recognition | Handwritten Text Recognition (HTR) is the task of automatica… | Language & Reasoning | 139 | 46 |
| Fraud Detection | Fraud Detection is a vital topic that applies to many indust… | Computer Vision | 547 | 45 |
| Robot Navigation | The fundamental objective of mobile Robot Navigation is to a… | Reinforcement Learning & Robotics | 542 | 45 |
| OpenAI Gym | An open-source toolkit from OpenAI that implements several R… | Reinforcement Learning & Robotics | 382 | 45 |
| Image Matting | Image Matting is the process of accurately estimating the fo… | Generative Models | 225 | 45 |
| Action Recognition In Videos | Action Recognition in Videos is a task in computer vision an… | Computer Vision | 124 | 45 |
| Action Quality Assessment | Assessing/analyzing/quantifying how well an action was perfo… | Computer Vision | 52 | 45 |
| Full reference image quality assessment | The goal is to calculate an objective quality score for a gi… | Computer Vision | 50 | 45 |
| 3D Reconstruction | 3D Reconstruction is the task of creating a 3D model or repr… | Computer Vision | 2,326 | 44 |
| Object Tracking | Object tracking is the task of taking an initial set of obje… | Computer Vision | 1,966 | 44 |
| Paraphrase Identification | The goal of Paraphrase Identification is to determine whethe… | Language & Reasoning | 172 | 44 |
| Multivariate Time Series Forecasting | | Time Series & Forecasting | 245 | 43 |
| Music Source Separation | Music source separation is the task of decomposing music int… | Audio & Speech | 107 | 43 |
| Natural Language Understanding | Natural Language Understanding is an important field of Natu… | Language & Reasoning | 1,978 | 42 |
| Question Generation | The goal of Question Generation is to generate a valid and f… | Language & Reasoning | 664 | 42 |
| 2D Semantic Segmentation | | Computer Vision | 79 | 42 |
| Zero-Shot Semantic Segmentation | | Computer Vision | 60 | 42 |
| Text based Person Retrieval | | Multimodal & Vision-Language | 49 | 42 |
| Mathematical Reasoning | | Language & Reasoning | 805 | 41 |
| Fact Verification | Fact verification, also called "fact checking", is a process… | Language & Reasoning | 216 | 41 |
| Cross-Lingual Document Classification | Cross-lingual document classification refers to the task of … | Language & Reasoning | 25 | 41 |
| Semantic Role Labeling | Semantic role labeling aims to model the predicate-argument … | Language & Reasoning | 620 | 40 |
| Speaker Diarization | Speaker Diarization is the task of segmenting and co-indexin… | Audio & Speech | 328 | 40 |
| Visual Navigation | Visual Navigation is the problem of navigating an agent, e.g… | Computer Vision | 316 | 40 |
| Constituency Parsing | Constituency parsing aims to extract a constituency-based pa… | Language & Reasoning | 204 | 40 |
| Visual Prompt Tuning | Visual Prompt Tuning (VPT) only introduces a small amount of … | Multimodal & Vision-Language | 70 | 40 |
| Action Anticipation | Next action anticipation is defined as observing 1, ... , T … | Computer Vision | 110 | 39 |
| Long-Context Understanding | | Language & Reasoning | 81 | 39 |
| Blind Face Restoration | Blind face restoration aims at recovering high-quality faces… | Generative Models | 55 | 39 |
| Molecule Captioning | Molecular description generation entails the creation of a d… | Multimodal & Vision-Language | 25 | 39 |
| Facial Landmark Detection | Facial Landmark Detection is a computer vision task that inv… | Computer Vision | 139 | 38 |
| Chart Question Answering | Question answering on chart images | Multimodal & Vision-Language | 50 | 38 |
| Animal Pose Estimation | Animal pose estimation is the task of identifying the pose o… | Computer Vision | 48 | 38 |
| Multispectral Object Detection | Only using RGB cameras for automatic outdoor scene analysis … | Computer Vision | 39 | 38 |
| Quantization | Quantization is a promising technique to reduce the computat… | Foundations & Efficiency | 4,925 | 37 |
| Video Inpainting | The goal of Video Inpainting is to fill in missing regions o… | Generative Models | 130 | 37 |