| SensorLLM: Human-Intuitive Alignment of Multivariate Sensor Data with LLMs for Activity Recognition | Oct 14, 2024 | Activity Recognition, Descriptive | Code Available | 2 |
| MambaMorph: a Mamba-based Framework for Medical MR-CT Deformable Registration | Jan 25, 2024 | Computed Tomography (CT), Image Registration | Code Available | 2 |
| Common Objects in 3D: Large-Scale Learning and Evaluation of Real-life 3D Category Reconstruction | Sep 1, 2021 | 3D Reconstruction, Neural Rendering | Code Available | 2 |
| Human Pose as Compositional Tokens | Mar 21, 2023 | Decoder, Pose Estimation | Code Available | 2 |
| Dense Distinct Query for End-to-End Object Detection | Mar 22, 2023 | Object, object-detection | Code Available | 2 |
| Deduplicating Training Data Makes Language Models Better | Jul 14, 2021 | Language Modeling, Language Modelling | Code Available | 2 |
| Approximate Convex Decomposition for 3D Meshes with Collision-Aware Concavity and Tree Search | May 5, 2022 | | Code Available | 2 |
| Autonomous GIS: the next-generation AI-powered GIS | May 10, 2023 | Code Generation, Information Retrieval | Code Available | 2 |
| The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning | Jun 2, 2025 | Math, Mathematical Reasoning | Code Available | 2 |
| DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data | Jun 6, 2024 | 3D Generation, Text to 3D | Code Available | 2 |
| TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation | Sep 19, 2024 | Vision-Language-Action | Code Available | 2 |
| Graph Neural Network Surrogates to leverage Mechanistic Expert Knowledge towards Reliable and Immediate Pandemic Response | Nov 10, 2024 | Decision Making, Graph Neural Network | Code Available | 2 |
| UniHDSA: A Unified Relation Prediction Approach for Hierarchical Document Structure Analysis | Mar 20, 2025 | Document Layout Analysis, Document Summarization | Code Available | 2 |
| LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching | Jun 20, 2023 | Brain Tumor Classification, Contrastive Learning | Code Available | 2 |
| ST-P3: End-to-end Vision-based Autonomous Driving via Spatial-Temporal Feature Learning | Jul 15, 2022 | Autonomous Driving, Bird's-Eye View Semantic Segmentation | Code Available | 2 |
| PoseFormerV2: Exploring Frequency Domain for Efficient and Robust 3D Human Pose Estimation | Mar 30, 2023 | 3D Human Pose Estimation, Classification | Code Available | 2 |
| SIDA: Social Media Image Deepfake Detection, Localization and Explanation with Large Multimodal Model | Dec 5, 2024 | DeepFake Detection, Face Swapping | Code Available | 2 |
| Bracketing Image Restoration and Enhancement with High-Low Frequency Decomposition | Apr 21, 2024 | Image Restoration | Code Available | 2 |
| LLM4EDA: Emerging Progress in Large Language Models for Electronic Design Automation | Dec 28, 2023 | Answer Generation, Chatbot | Code Available | 2 |
| Overview of the PromptCBLUE Shared Task in CHIP2023 | Dec 29, 2023 | In-Context Learning | Code Available | 2 |
| DebugBench: Evaluating Debugging Capability of Large Language Models | Jan 9, 2024 | Code Generation | Code Available | 2 |
| SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning | Dec 14, 2022 | Multi-agent Reinforcement Learning, reinforcement-learning | Code Available | 2 |
| Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs | Apr 22, 2024 | Misinformation | Code Available | 2 |
| PMFSNet: Polarized Multi-scale Feature Self-attention Network For Lightweight Medical Image Segmentation | Jan 15, 2024 | Image Segmentation, Medical Image Segmentation | Code Available | 2 |
| CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion | Feb 8, 2024 | Computational Efficiency, Multimodal Reasoning | Code Available | 2 |
| VLFM: Vision-Language Frontier Maps for Zero-Shot Semantic Navigation | Dec 6, 2023 | Language Modelling, Navigate | Code Available | 2 |
| STEVE-1: A Generative Model for Text-to-Behavior in Minecraft | Jun 1, 2023 | Decision Making, Image Generation | Code Available | 2 |
| An Efficient and Mixed Heterogeneous Model for Image Restoration | Apr 15, 2025 | Image Restoration, Mamba | Code Available | 2 |
| Open-LLM-Leaderboard: From Multi-choice to Open-style Questions for LLMs Evaluation, Benchmark, and Arena | Jun 11, 2024 | Multiple-choice, Selection bias | Code Available | 2 |
| DreamLIP: Language-Image Pre-training with Long Captions | Mar 25, 2024 | Contrastive Learning, Image-text Retrieval | Code Available | 2 |
| ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning | Mar 29, 2024 | Continual Learning, Continual Panoptic Segmentation | Code Available | 2 |
| Therapeutics Data Commons: Machine Learning Datasets and Tasks for Drug Discovery and Development | Feb 18, 2021 | BIG-bench Machine Learning, Drug Discovery | Code Available | 2 |
| Unleashing the Power of Multi-Task Learning: A Comprehensive Survey Spanning Traditional, Deep, and Pretrained Foundation Model Eras | Apr 29, 2024 | Multi-Task Learning, Prognosis | Code Available | 2 |
| 2nd Place Winning Solution for the CVPR2023 Visual Anomaly and Novelty Detection Challenge: Multimodal Prompting for Data-centric Anomaly Detection | Jun 15, 2023 | Anomaly Detection, Anomaly Localization | Code Available | 2 |
| TeCH: Text-guided Reconstruction of Lifelike Clothed Humans | Aug 16, 2023 | Descriptive, Question Answering | Code Available | 2 |
| BMFM-RNA: An Open Framework for Building and Evaluating Transcriptomic Foundation Models | Jun 17, 2025 | Benchmarking, Language Modeling | Code Available | 2 |
| LeapVAD: A Leap in Autonomous Driving via Cognitive Perception and Dual-Process Thinking | Jan 14, 2025 | Autonomous Driving, Decision Making | Code Available | 2 |
| Bottleneck Transformers for Visual Recognition | Jan 27, 2021 | image-classification, Image Classification | Code Available | 2 |
| HMANet: Hybrid Multi-Axis Aggregation Network for Image Super-Resolution | May 8, 2024 | Image Super-Resolution | Code Available | 2 |
| EHRMamba: Towards Generalizable and Scalable Foundation Models for Electronic Health Records | May 23, 2024 | Mamba | Code Available | 2 |
| Multi-Modal Self-Supervised Learning for Recommendation | Feb 21, 2023 | Contrastive Learning, Data Augmentation | Code Available | 2 |
| Self-supervised Learning of LiDAR 3D Point Clouds via 2D-3D Neural Calibration | Jan 23, 2024 | 3D Semantic Segmentation, Autonomous Driving | Code Available | 2 |
| MAPLM: A Real-World Large-Scale Vision-Language Benchmark for Map and Traffic Scene Understanding | Jan 1, 2024 | Autonomous Driving, Instruction Following | Code Available | 2 |
| ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation | Jun 14, 2024 | Code Generation | Code Available | 2 |
| MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression | Jun 21, 2024 | GPU, Language Modeling | Code Available | 2 |
| iLLM-TSC: Integration reinforcement learning and large language model for traffic signal control policy improvement | Jul 8, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multilingual Text Spotting | May 31, 2023 | Decoder, Scene Text Detection | Code Available | 2 |
| Slice-Consistent 3D Volumetric Brain CT-to-MRI Translation with 2D Brownian Bridge Diffusion Model | Jul 6, 2024 | Image-to-Image Translation, Translation | Code Available | 2 |
| PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer | Jul 10, 2024 | Decoder, Handwritten Mathematical Expression Recognition | Code Available | 2 |
| Exposing the Deception: Uncovering More Forgery Clues for Deepfake Detection | Mar 4, 2024 | DeepFake Detection, Face Swapping | Code Available | 2 |