| StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models | Sep 4, 2024 | Denoising, Image Generation | Code Available | 2 | 5 |
| Drone-assisted Road Gaussian Splatting with Cross-view Uncertainty | Aug 27, 2024 | Autonomous Driving, Neural Rendering | Code Available | 2 | 5 |
| Follow-Your-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation | Sep 2, 2024 | GPU | Code Available | 2 | 5 |
| CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models | Sep 4, 2024 | GSM8K, Math | Code Available | 2 | 5 |
| Uncertainty Modelling and Robust Observer Synthesis using the Koopman Operator | Oct 1, 2024 | | Code Available | 2 | 5 |
| VideoGen-of-Thought: A Collaborative Framework for Multi-Shot Video Generation | Dec 3, 2024 | Script Generation, Video Generation | Code Available | 2 | 5 |
| Multimodal RewardBench: Holistic Evaluation of Reward Models for Vision Language Models | Feb 20, 2025 | Question Answering, Visual Question Answering | Code Available | 2 | 5 |
| Autoregressive Action Sequence Learning for Robotic Manipulation | Oct 4, 2024 | Chunking, Language Modeling | Code Available | 2 | 5 |
| ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality | Dec 5, 2024 | Image Generation | Code Available | 2 | 5 |
| GNSS/GPS Spoofing and Jamming Identification Using Machine Learning and Deep Learning | Jan 4, 2025 | Deep Learning | Code Available | 2 | 5 |
| Towards Vision-Language Geo-Foundation Model: A Survey | Jun 13, 2024 | Earth Observation, Image Captioning | Code Available | 2 | 5 |
| PoseMamba: Monocular 3D Human Pose Estimation with Bidirectional Global-Local Spatio-Temporal State Space Model | Aug 7, 2024 | 3D Human Pose Estimation, Long-range modeling | Code Available | 2 | 5 |
| Deep Learning for Cross-Domain Data Fusion in Urban Computing: Taxonomy, Advances, and Outlook | Feb 29, 2024 | Deep Learning | Code Available | 2 | 5 |
| DRO: A Python Library for Distributionally Robust Optimization in Machine Learning | May 29, 2025 | | Code Available | 2 | 5 |
| Model-Preserving Adaptive Rounding | May 29, 2025 | model, Quantization | Code Available | 2 | 5 |
| Learning Trajectory-Aware Transformer for Video Super-Resolution | Apr 8, 2022 | Super-Resolution, Video deraining | Code Available | 2 | 5 |
| A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting | Jan 18, 2024 | Instance Segmentation, Interactive Segmentation | Code Available | 2 | 5 |
| MWFormer: Multi-Weather Image Restoration Using Degradation-Aware Transformers | Nov 26, 2024 | Contrastive Learning, Image Restoration | Code Available | 2 | 5 |
| Reasoning to Attend: Try to Understand How <SEG> Token Works | Dec 23, 2024 | Semantic Similarity, Semantic Textual Similarity | Code Available | 2 | 5 |
| PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision | Mar 29, 2022 | 3D Human Pose Estimation, Hallucination | Code Available | 2 | 5 |
| Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs | Jun 12, 2025 | Diversity | Code Available | 2 | 5 |
| TableBank: A Benchmark Dataset for Table Detection and Recognition | Mar 5, 2019 | Table Detection | Code Available | 2 | 5 |
| No Language Left Behind: Scaling Human-Centered Machine Translation | Jul 11, 2022 | Machine Translation, Mixture-of-Experts | Code Available | 2 | 5 |
| EraRAG: Efficient and Incremental Retrieval Augmented Generation for Growing Corpora | Jun 26, 2025 | Graph Reconstruction, RAG | Code Available | 2 | 5 |
| SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development | May 22, 2025 | Bug fixing, Chatbot | Code Available | 2 | 5 |
| AvatarPoser: Articulated Full-Body Pose Tracking from Sparse Motion Sensing | Jul 27, 2022 | Mixed Reality, Pose Estimation | Code Available | 2 | 5 |
| FlipAttack: Jailbreak LLMs via Flipping | Oct 2, 2024 | | Code Available | 2 | 5 |
| PEDANTS: Cheap but Effective and Interpretable Answer Equivalence | Feb 17, 2024 | Benchmarking, Form | Code Available | 2 | 5 |
| SchNetPack 2.0: A neural network toolbox for atomistic machine learning | Dec 11, 2022 | | Code Available | 2 | 5 |
| Closed-Form Factorization of Latent Semantics in GANs | Jul 13, 2020 | Attribute, Form | Code Available | 2 | 5 |
| Character-Adapter: Prompt-Guided Region Control for High-Fidelity Character Customization | Jun 24, 2024 | Consistent Character Generation, Image Generation | Code Available | 2 | 5 |
| OptiChat: Bridging Optimization Models and Practitioners with Large Language Models | Jan 14, 2025 | Code Generation, counterfactual | Code Available | 2 | 5 |
| CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion | Mar 12, 2024 | Code Completion, Safety Alignment | Code Available | 2 | 5 |
| VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning | Mar 19, 2024 | Benchmarking, Image Captioning | Code Available | 2 | 5 |
| ESM All-Atom: Multi-scale Protein Language Model for Unified Molecular Modeling | Mar 5, 2024 | All, Language Modeling | Code Available | 2 | 5 |
| Multi-Scale VMamba: Hierarchy in Hierarchy Visual State Space Model | May 23, 2024 | Mamba, State Space Models | Code Available | 2 | 5 |
| TableRAG: A Retrieval Augmented Generation Framework for Heterogeneous Document Reasoning | Jun 12, 2025 | Answer Generation, Chunking | Code Available | 2 | 5 |
| Controllable 3D Outdoor Scene Generation via Scene Graphs | Mar 10, 2025 | Autonomous Driving, Scene Generation | Code Available | 2 | 5 |
| DDSP: Differentiable Digital Signal Processing | Jan 14, 2020 | Audio Generation, Audio Synthesis | Code Available | 2 | 5 |
| Coswara: A website application enabling COVID-19 screening by analysing respiratory sound samples and health symptoms | Jun 9, 2022 | COVID-19 Diagnosis | Code Available | 2 | 5 |
| Diffusion Explainer: Visual Explanation for Text-to-image Stable Diffusion | May 4, 2023 | Image Generation | Code Available | 2 | 5 |
| RetroGFN: Diverse and Feasible Retrosynthesis using GFlowNets | Jun 26, 2024 | Retrosynthesis, Single-step retrosynthesis | Code Available | 2 | 5 |
| Reevaluating Adversarial Examples in Natural Language | Apr 25, 2020 | Sentence | Code Available | 2 | 5 |
| CTR-Driven Advertising Image Generation with Multimodal Large Language Models | Feb 5, 2025 | Image Generation, Reinforcement Learning (RL) | Code Available | 2 | 5 |
| Learning Few-Step Diffusion Models by Trajectory Distribution Matching | Mar 9, 2025 | Image Generation, Text to Image Generation | Code Available | 2 | 5 |
| T2S: High-resolution Time Series Generation with Text-to-Series Diffusion Models | May 5, 2025 | Time Series, Time Series Generation | Code Available | 2 | 5 |
| RM-R1: Reward Modeling as Reasoning | May 5, 2025 | Math, Reinforcement Learning (RL) | Code Available | 2 | 5 |
| OBELiX: A Curated Dataset of Crystal Structures and Experimentally Measured Ionic Conductivities for Lithium Solid-State Electrolytes | Feb 20, 2025 | | Code Available | 2 | 5 |
| pyKT: A Python Library to Benchmark Deep Learning based Knowledge Tracing Models | Jun 23, 2022 | Knowledge Tracing, valid | Code Available | 2 | 5 |
| Lemur: Harmonizing Natural Language and Code for Language Agents | Oct 10, 2023 | | Code Available | 2 | 5 |