| MER 2024: Semi-Supervised Learning, Noise Robustness, and Open-Vocabulary Multimodal Emotion Recognition | Apr 26, 2024 | Emotion Recognition, Multi-Label Learning | Code Available | 3 |
| DDColor: Towards Photo-Realistic Image Colorization via Dual Decoders | Dec 22, 2022 | Colorization, Decoder | Code Available | 3 |
| PCDCNet: A Surrogate Model for Air Quality Forecasting with Physical-Chemical Dynamics and Constraints | May 26, 2025 | Deep Learning | Code Available | 3 |
| MACE: Mass Concept Erasure in Diffusion Models | Mar 10, 2024 | Text-to-Image Generation | Code Available | 3 |
| MAPE-PPI: Towards Effective and Efficient Protein-Protein Interaction Prediction via Microenvironment-Aware Protein Embedding | Feb 22, 2024 | Computational Efficiency, Prediction | Code Available | 3 |
| TopoTune: A Framework for Generalized Combinatorial Complex Neural Networks | Oct 9, 2024 | Graph Neural Network | Code Available | 3 |
| FlipSketch: Flipping Static Drawings to Text-Guided Sketch Animations | Nov 16, 2024 | Visual Storytelling | Code Available | 3 |
| DoWhy: An End-to-End Library for Causal Inference | Nov 9, 2020 | Causal Inference, valid | Code Available | 3 |
| Relative Pose Estimation through Affine Corrections of Monocular Depth Priors | Jan 9, 2025 | Depth Estimation, Monocular Depth Estimation | Code Available | 3 |
| DistiLLM: Towards Streamlined Distillation for Large Language Models | Feb 6, 2024 | Instruction Following, Knowledge Distillation | Code Available | 3 |
| TimeChat-Online: 80% Visual Tokens are Naturally Redundant in Streaming Videos | Apr 24, 2025 | MME, Video MME | Code Available | 3 |
| R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization | Mar 17, 2025 | | Code Available | 3 |
| Music2Latent: Consistency Autoencoders for Latent Audio Compression | Aug 12, 2024 | Audio Compression, Information Retrieval | Code Available | 3 |
| Advanced Video Inpainting Using Optical Flow-Guided Efficient Diffusion | Dec 1, 2024 | Denoising, Optical Flow Estimation | Code Available | 3 |
| MambaAD: Exploring State Space Models for Multi-class Unsupervised Anomaly Detection | Apr 9, 2024 | Anomaly Detection, Decoder | Code Available | 3 |
| A Survey on the Memory Mechanism of Large Language Model based Agents | Apr 21, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| ACEGEN: Reinforcement learning of generative chemical agents for drug discovery | May 7, 2024 | Benchmarking, Decision Making | Code Available | 3 |
| Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning | Oct 11, 2022 | reinforcement-learning, Reinforcement Learning | Code Available | 3 |
| RiNALMo: General-Purpose RNA Language Models Can Generalize Well on Structure Prediction Tasks | Feb 29, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| Embodied Understanding of Driving Scenarios | Mar 7, 2024 | Autonomous Driving, Language Modeling | Code Available | 3 |
| Personalized Image Generation with Deep Generative Models: A Decade Survey | Feb 18, 2025 | Image Generation, Personalized Image Generation | Code Available | 3 |
| R1-ShareVL: Incentivizing Reasoning Capability of Multimodal Large Language Models via Share-GRPO | May 22, 2025 | Reinforcement Learning (RL) | Code Available | 3 |
| Datasheet for the Pile | Jan 13, 2022 | Language Modeling, Language Modelling | Code Available | 3 |
| UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition | Apr 23, 2024 | Decoder, Diversity | Code Available | 3 |
| imitation: Clean Imitation Learning Implementations | Nov 22, 2022 | Imitation Learning, reinforcement-learning | Code Available | 3 |
| Efficient Video Action Detection with Token Dropout and Context Refinement | Apr 17, 2023 | Action Detection, Decoder | Code Available | 3 |
| Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization | May 23, 2024 | | Code Available | 3 |
| LLM-Pruner: On the Structural Pruning of Large Language Models | May 19, 2023 | Text Generation, zero-shot-classification | Code Available | 3 |
| BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model | Sep 20, 2023 | 8k, Language Modeling | Code Available | 3 |
| HI-SLAM2: Geometry-Aware Gaussian SLAM for Fast Monocular Scene Reconstruction | Nov 27, 2024 | 3DGS | Code Available | 3 |
| EfficientTrain++: Generalized Curriculum Learning for Efficient Visual Backbone Training | May 14, 2024 | Data Augmentation, Self-Supervised Learning | Code Available | 3 |
| White-Box Transformers via Sparse Rate Reduction | Jun 1, 2023 | Representation Learning | Code Available | 3 |
| SCSegamba: Lightweight Structure-Aware Vision Mamba for Crack Segmentation in Structures | Mar 3, 2025 | Crack Segmentation, Mamba | Code Available | 3 |
| Fine-Tuning Language Models from Human Preferences | Sep 18, 2019 | Descriptive, Language Modelling | Code Available | 3 |
| GuardT2I: Defending Text-to-Image Models from Adversarial Prompts | Mar 3, 2024 | Binary Classification, Language Modeling | Code Available | 3 |
| Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model | Jul 24, 2024 | Image Inpainting, Object | Code Available | 3 |
| Beyond Specialization: Assessing the Capabilities of MLLMs in Age and Gender Estimation | Mar 4, 2024 | Age And Gender Classification, Age and Gender Estimation | Code Available | 3 |
| EvoTorch: Scalable Evolutionary Computation in Python | Feb 24, 2023 | GPU, reinforcement-learning | Code Available | 3 |
| Leveraging Vision-Centric Multi-Modal Expertise for 3D Object Detection | Oct 24, 2023 | 3D Object Detection, object-detection | Code Available | 3 |
| Are We Done with MMLU? | Jun 6, 2024 | MMLU, Virology | Code Available | 3 |
| Does End-to-End Autonomous Driving Really Need Perception Tasks? | Sep 26, 2024 | Autonomous Driving | Code Available | 3 |
| Towards Realistic Scene Generation with LiDAR Diffusion Models | Mar 31, 2024 | 3D geometry, Image Generation | Code Available | 3 |
| Improved Modelling of Federated Datasets using Mixtures-of-Dirichlet-Multinomials | Jun 4, 2024 | Federated Learning | Code Available | 3 |
| LightGBM: A Highly Efficient Gradient Boosting Decision Tree | Dec 1, 2017 | | Code Available | 3 |
| Faster Diffusion via Temporal Attention Decomposition | Apr 3, 2024 | | Code Available | 3 |
| A Comprehensive Survey on Composed Image Retrieval | Feb 19, 2025 | Attribute, Image Retrieval | Code Available | 3 |
| Animatable and Relightable Gaussians for High-fidelity Human Avatar Modeling | Nov 27, 2023 | NeRF | Code Available | 3 |
| AISafetyLab: A Comprehensive Framework for AI Safety Evaluation and Improvement | Feb 24, 2025 | | Code Available | 3 |
| RefChecker: Reference-based Fine-grained Hallucination Checker and Benchmark for Large Language Models | May 23, 2024 | Hallucination, Sentence | Code Available | 3 |
| TextBox 2.0: A Text Generation Library with Pre-trained Language Models | Dec 26, 2022 | Abstractive Text Summarization, Data-to-Text Generation | Code Available | 3 |