| Active Learning Active Learning is a paradigm in supervised machine learning… | Foundations & Efficiency | 3,073 | 7 |
| Intrusion Detection Intrusion Detection is the process of dynamically monitoring… | Computer Vision | 800 | 7 |
| Language Identification Language identification is the task of determining the langu… | Language & Reasoning | 794 | 7 |
| Trajectory Planning Trajectory planning for industrial robots consists of moving… | Reinforcement Learning & Robotics | 324 | 7 |
| Camera Pose Estimation Camera pose estimation is a crucial task in computer vision … | Computer Vision | 304 | 7 |
| Inverse Rendering Inverse Rendering is the task of recovering the properties o… | Generative Models | 271 | 7 |
| Human Parsing Human parsing is the task of segmenting a human image into d… | Language & Reasoning | 125 | 7 |
| Unsupervised Image-To-Image Translation Unsupervised image-to-image translation is the task of doing… | Multimodal & Vision-Language | 124 | 7 |
| Person Identification | Computer Vision | 107 | 7 |
| Semi-supervised Anomaly Detection | Time Series & Forecasting | 76 | 7 |
| Sound Event Localization and Detection Given multichannel audio input, a sound event detection and … | Audio & Speech | 65 | 7 |
| Dynamic Reconstruction | Generative Models | 64 | 7 |
| Personalized Image Generation Utilizes single or multiple images that contain the same sub… | Generative Models | 58 | 7 |
| Table Detection | Computer Vision | 58 | 7 |
| Room Layout Estimation | Computer Vision | 44 | 7 |
| Point Cloud Quality Assessment A large and dense collection of points in thr… | Computer Vision | 40 | 7 |
| Patient Phenotyping | Medical & Scientific | 15 | 7 |
| Sketch-to-Image Translation | Multimodal & Vision-Language | 15 | 7 |
| Affordance Recognition Affordance recognition from Human-Object Interaction | Computer Vision | 11 | 7 |
| Motion Captioning Generating textual description for human motion. | Multimodal & Vision-Language | 11 | 7 |
| 4D Panoptic Segmentation 4D Panoptic Segmentation is a computer vision task that exte… | Computer Vision | 9 | 7 |
| Error Understanding Discover what causes the model’s prediction errors. | Language & Reasoning | 9 | 7 |
| Traffic Accident Detection | Reinforcement Learning & Robotics | 8 | 7 |
| Human Judgment Correlation A task where an algorithm should generate the judgment score… | Computer Vision | 5 | 7 |
| Phrase Ranking This task aims to evaluate the “global” rank list of phrases… | Language & Reasoning | 3 | 7 |
| EditCompletion Given a code snippet that is partially edited, the goal is t… | Language & Reasoning | 1 | 7 |
| Segmentation | Computer Vision | 13,072 | 6 |
| Prediction | Foundations & Efficiency | 8,760 | 6 |
| Dimensionality Reduction Dimensionality reduction is the task of reducing the dimensi… | Foundations & Efficiency | 3,304 | 6 |
| Cross-Lingual Transfer Cross-lingual transfer refers to transfer learning using dat… | Language & Reasoning | 782 | 6 |
| Graph Generation Graph Generation is an important research area with signific… | Graphs & Structured Data | 712 | 6 |
| Compressive Sensing Compressive Sensing is a new signal processing framework for… | Foundations & Efficiency | 597 | 6 |
| Rain Removal | Generative Models | 317 | 6 |
| Text to 3D Task involves generating 3D objects based on the text prompt… | Multimodal & Vision-Language | 314 | 6 |
| Handwriting Recognition | Computer Vision | 173 | 6 |
| Sentence Compression Sentence Compression is the task of reducing the length of t… | Language & Reasoning | 149 | 6 |
| 3D Shape Reconstruction | Generative Models | 139 | 6 |
| Acoustic Scene Classification The goal of acoustic scene classification is to classify a t… | Audio & Speech | 132 | 6 |
| Video Enhancement | Generative Models | 110 | 6 |
| Semantic Retrieval | Recommendation & Retrieval | 86 | 6 |
| Weakly-supervised Temporal Action Localization | Time Series & Forecasting | 76 | 6 |
| Humor Detection Humor detection is the task of identifying comical or amusin… | Language & Reasoning | 64 | 6 |
| Traffic Sign Detection | Reinforcement Learning & Robotics | 56 | 6 |
| Robot Task Planning | Reinforcement Learning & Robotics | 48 | 6 |
| Image Outpainting Predicting the visual context of an image beyond its boundar… | Generative Models | 47 | 6 |
| Multi-modal Classification | Multimodal & Vision-Language | 31 | 6 |
| Blood pressure estimation | Medical & Scientific | 29 | 6 |
| Kidney Function Continuous prediction of urine production in the next 2h as … | Medical & Scientific | 29 | 6 |
| Audio-visual Question Answering | Multimodal & Vision-Language | 27 | 6 |
| Link Sign Prediction | Foundations & Efficiency | 21 | 6 |