| REEF: Representation Encoding Fingerprints for Large Language Models | Oct 18, 2024 | | Code Available | 2 |
| Modeling the Label Distributions for Weakly-Supervised Semantic Segmentation | Mar 20, 2024 | Semantic Segmentation, Weakly supervised Semantic Segmentation | Code Available | 2 |
| MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model | Aug 31, 2022 | Denoising, Motion Generation | Code Available | 2 |
| Large language models surpass human experts in predicting neuroscience results | Mar 4, 2024 | | Code Available | 2 |
| Owl-1: Omni World Model for Consistent Long Video Generation | Dec 12, 2024 | Video Generation | Code Available | 2 |
| Diving Deeper Into Pedestrian Behavior Understanding: Intention Estimation, Action Prediction, and Event Risk Assessment | Jun 29, 2024 | Prediction | Code Available | 2 |
| K2: A Foundation Language Model for Geoscience Knowledge Understanding and Utilization | Jun 8, 2023 | Language Modeling, Language Modelling | Code Available | 2 |
| GenSim: A General Social Simulation Platform with Large Language Model based Agents | Oct 6, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Metric Flow Matching for Smooth Interpolations on the Data Manifold | May 23, 2024 | Trajectory Prediction | Code Available | 2 |
| Harmonizer: Learning to Perform White-Box Image and Video Harmonization | Jul 4, 2022 | Image Harmonization, Video Harmonization | Code Available | 2 |
| Android in the Zoo: Chain-of-Action-Thought for GUI Agents | Mar 5, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Knowledge Circuits in Pretrained Transformers | May 28, 2024 | In-Context Learning, knowledge editing | Code Available | 2 |
| PyMIC: A deep learning toolkit for annotation-efficient medical image segmentation | Aug 19, 2022 | Deep Learning, Image Segmentation | Code Available | 2 |
| PHemoNet: A Multimodal Network for Physiological Signals | Sep 13, 2024 | Brain Computer Interface, EEG | Code Available | 2 |
| From Sparse to Soft Mixtures of Experts | Aug 2, 2023 | | Code Available | 2 |
| ColorizeDiffusion: Adjustable Sketch Colorization with Reference Image and Text | Jan 2, 2024 | Colorization, Sketch Colorization | Code Available | 2 |
| Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset | Jun 10, 2024 | Instance Segmentation, Salient Object Detection | Code Available | 2 |
| DifIISR: A Diffusion Model with Gradient Guidance for Infrared Image Super-Resolution | Mar 3, 2025 | Autonomous Driving, Image Super-Resolution | Code Available | 2 |
| nuScenes: A multimodal dataset for autonomous driving | Mar 26, 2019 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection | Jun 10, 2024 | Backdoor Attack, Code Completion | Code Available | 2 |
| Shape, Light, and Material Decomposition from Images using Monte Carlo Rendering and Denoising | Jun 7, 2022 | 3D Reconstruction, Denoising | Code Available | 2 |
| Video Prediction Transformers without Recurrence or Convolution | Oct 7, 2024 | Decoder, Prediction | Code Available | 2 |
| TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning | Apr 13, 2025 | Question Answering, reinforcement-learning | Code Available | 2 |
| DeepFilterNet: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering | Oct 11, 2021 | Speech Enhancement | Code Available | 2 |
| PoseScript: Linking 3D Human Poses and Natural Language | Oct 21, 2022 | Cross-Modal Retrieval, Image Captioning | Code Available | 2 |
| SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations | Aug 2, 2021 | Denoising, Image Generation | Code Available | 2 |
| Satellite Image Time Series Semantic Change Detection: Novel Architecture and Analysis of Domain Shift | Jul 10, 2024 | Change Detection, Disaster Response | Code Available | 2 |
| LLaMEA: A Large Language Model Evolutionary Algorithm for Automatically Generating Metaheuristics | May 30, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Unsupervised Universal Image Segmentation | Dec 28, 2023 | Image Segmentation, Instance Segmentation | Code Available | 2 |
| VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models | May 29, 2025 | Self-Supervised Learning, Video Generation | Code Available | 2 |
| Collaborative Gym: A Framework for Enabling and Evaluating Human-Agent Collaboration | Dec 20, 2024 | Human Agent Collaboration | Code Available | 2 |
| HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras | Apr 3, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| MedM-VL: What Makes a Good Medical LVLM? | Apr 6, 2025 | Medical Image Analysis, Question Answering | Code Available | 2 |
| Self-Explore: Enhancing Mathematical Reasoning in Language Models with Fine-grained Rewards | Apr 16, 2024 | GSM8K, Math | Code Available | 2 |
| MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark | May 20, 2024 | College Mathematics, GSM8K | Code Available | 2 |
| ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models | Oct 11, 2023 | Image Generation | Code Available | 2 |
| All for One and One for All: Improving Music Separation by Bridging Networks | Oct 8, 2020 | All, Music Source Separation | Code Available | 2 |
| Swin2SR: SwinV2 Transformer for Compressed Image Super-Resolution and Restoration | Sep 22, 2022 | Compressed Image Super-resolution, Image Restoration | Code Available | 2 |
| MAS-GPT: Training LLMs to Build LLM-based Multi-Agent Systems | Mar 5, 2025 | | Code Available | 2 |
| Mixture of LoRA Experts | Apr 21, 2024 | | Code Available | 2 |
| Neighboring Autoregressive Modeling for Efficient Visual Generation | Mar 12, 2025 | Image Generation, Text to Image Generation | Code Available | 2 |
| The Calysto Scheme Project | Oct 16, 2023 | | Code Available | 2 |
| ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models | Jul 5, 2024 | Hallucination, Long Form Question Answering | Code Available | 2 |
| Exploring Plain Vision Transformer Backbones for Object Detection | Mar 30, 2022 | Cross-Domain Few-Shot Object Detection, Instance Segmentation | Code Available | 2 |
| Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging | Jun 17, 2024 | | Code Available | 2 |
| Hidden Biases of End-to-End Driving Models | Jun 13, 2023 | Autonomous Driving, Bench2Drive | Code Available | 2 |
| LaserMix for Semi-Supervised LiDAR Semantic Segmentation | Jun 30, 2022 | LIDAR Semantic Segmentation, Segmentation | Code Available | 2 |
| IPDnet: A Universal Direct-Path IPD Estimation Network for Sound Source Localization | May 11, 2024 | Sound Source Localization | Code Available | 2 |
| GestureLSM: Latent Shortcut based Co-Speech Gesture Generation with Spatial-Temporal Modeling | Jan 31, 2025 | Denoising, Gesture Generation | Code Available | 2 |
| Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering | Oct 21, 2024 | Open-Domain Question Answering, Question Answering | Code Available | 2 |