| Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis | Jun 15, 2023 | Image Generation, Preference Mapping | Code Available | 2 |
| GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech | May 15, 2022 | Speech Synthesis, Style Transfer | Code Available | 2 |
| Investigating the Role of Image Retrieval for Visual Localization -- An exhaustive benchmark | May 31, 2022 | Autonomous Driving, Camera Pose Estimation | Code Available | 2 |
| Piloting Structure-Based Drug Design via Modality-Specific Optimal Schedule | May 12, 2025 | Drug Design, Scheduling | Code Available | 2 |
| Autonomous Catheterization with Open-source Simulator and Expert Trajectory | Jan 17, 2024 | | Code Available | 2 |
| Data-Centric Foundation Models in Computational Healthcare: A Survey | Jan 4, 2024 | Ethics, Survey | Code Available | 2 |
| BAPLe: Backdoor Attacks on Medical Foundational Models using Prompt Learning | Aug 14, 2024 | Backdoor Attack, Prompt Learning | Code Available | 2 |
| LRM-Zero: Training Large Reconstruction Models with Synthesized Data | Jun 13, 2024 | 3D Reconstruction | Code Available | 2 |
| DiffuSeq-v2: Bridging Discrete and Continuous Text Spaces for Accelerated Seq2Seq Diffusion Models | Oct 9, 2023 | | Code Available | 2 |
| Regional Tiny Stories: Using Small Models to Compare Language Learning and Tokenizer Performance | Apr 7, 2025 | | Code Available | 2 |
| UVEB: A Large-scale Benchmark and Baseline Towards Real-World Underwater Video Enhancement | Apr 22, 2024 | 4k, Image Enhancement | Code Available | 2 |
| TRACE: Temporal Grounding Video LLM via Causal Event Modeling | Oct 8, 2024 | Text Generation, Video Understanding | Code Available | 2 |
| DiffIR2VR-Zero: Zero-Shot Video Restoration with Diffusion-based Image Restoration Models | Jul 1, 2024 | Denoising, Image Restoration | Code Available | 2 |
| ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention | Jan 1, 2024 | Blocking | Code Available | 2 |
| A Survey on Large Language Models for Code Generation | Jun 1, 2024 | Code Generation, HumanEval | Code Available | 2 |
| Rethinking Optimization and Architecture for Tiny Language Models | Feb 5, 2024 | Language Modelling | Code Available | 2 |
| FusionMamba: Efficient Remote Sensing Image Fusion with State Space Model | Apr 11, 2024 | Mamba | Code Available | 2 |
| Most Language Models can be Poets too: An AI Writing Assistant and Constrained Text Generation Studio | Jun 28, 2023 | Language Modelling, Text Generation | Code Available | 2 |
| VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models | May 27, 2024 | Object | Code Available | 2 |
| THEMIS: Towards Practical Intellectual Property Protection for Post-Deployment On-Device Deep Learning Models | Mar 31, 2025 | GPU | Code Available | 2 |
| BaryIR: Learning Multi-Source Unified Representation in Continuous Barycenter Space for Generalizable All-in-One Image Restoration | May 27, 2025 | All, Image Restoration | Code Available | 2 |
| DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models | Mar 4, 2025 | Diversity, GPU | Code Available | 2 |
| GAUCHE: A Library for Gaussian Processes in Chemistry | Dec 6, 2022 | Bayesian Optimisation, Gaussian Processes | Code Available | 2 |
| NMS Threshold matters for Ego4D Moment Queries -- 2nd place solution to the Ego4D Moment Queries Challenge 2023 | Jul 5, 2023 | Action Localization, Moment Queries | Code Available | 2 |
| MUTR3D: A Multi-camera Tracking Framework via 3D-to-2D Queries | May 2, 2022 | Autonomous Driving, Depth Estimation | Code Available | 2 |
| MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models | Mar 14, 2024 | 3D Face Animation, Diversity | Code Available | 2 |
| DistMLIP: A Distributed Inference Platform for Machine Learning Interatomic Potentials | May 28, 2025 | Drug Discovery, graph partitioning | Code Available | 2 |
| CausalPFN: Amortized Causal Effect Estimation via In-Context Learning | Jun 9, 2025 | Decision Making, Heterogeneous Treatment Effect Estimation | Code Available | 2 |
| Automated Capability Discovery via Model Self-Exploration | Feb 11, 2025 | model | Code Available | 2 |
| DiffDet4SAR: Diffusion-based Aircraft Target Detection Network for SAR Images | Apr 4, 2024 | Denoising | Code Available | 2 |
| HiDiff: Hybrid Diffusion Framework for Medical Image Segmentation | Jul 3, 2024 | Image Segmentation, Medical Image Segmentation | Code Available | 2 |
| Accessing Vision Foundation Models at ImageNet-level Costs | Jul 15, 2024 | Knowledge Distillation, Transfer Learning | Code Available | 2 |
| SMILE: Single-turn to Multi-turn Inclusive Language Expansion via ChatGPT for Mental Health Support | Apr 30, 2023 | Chatbot | Code Available | 2 |
| Efficient Reasoning with Hidden Thinking | Jan 31, 2025 | Decoder, Multimodal Reasoning | Code Available | 2 |
| Machine Unlearning: Solutions and Challenges | Aug 14, 2023 | Machine Unlearning | Code Available | 2 |
| Text2Human: Text-Driven Controllable Human Image Generation | May 31, 2022 | Diversity, Human Parsing | Code Available | 2 |
| SIFT: Grounding LLM Reasoning in Contexts via Stickers | Feb 19, 2025 | GSM8K, Math | Code Available | 2 |
| Measuring Mathematical Problem Solving With the MATH Dataset | Mar 5, 2021 | Math, Mathematical Problem-Solving | Code Available | 2 |
| ProcessBench: Identifying Process Errors in Mathematical Reasoning | Dec 9, 2024 | GSM8K, Math | Code Available | 2 |
| Towards Learning a Generalist Model for Embodied Navigation | Dec 4, 2023 | 3D Question Answering (3D-QA), Embodied Question Answering | Code Available | 2 |
| RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing | Jun 20, 2023 | Cross-Modal Retrieval, Image Retrieval | Code Available | 2 |
| REST: Retrieval-Based Speculative Decoding | Nov 14, 2023 | Language Modeling, Language Modelling | Code Available | 2 |
| S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning | Feb 18, 2025 | Math | Code Available | 2 |
| Whole-Song Hierarchical Generation of Symbolic Music Using Cascaded Diffusion Models | May 16, 2024 | Music Generation | Code Available | 2 |
| OpenStreetView-5M: The Many Roads to Global Visual Geolocation | Apr 29, 2024 | Photo geolocation estimation | Code Available | 2 |
| A Multi-objective Optimization Benchmark Test Suite for Real-time Semantic Segmentation | Apr 25, 2024 | Autonomous Driving, Evolutionary Algorithms | Code Available | 2 |
| Unsupervised Learning for Joint Beamforming Design in RIS-aided ISAC Systems | Mar 26, 2024 | Integrated sensing and communication, ISAC | Code Available | 2 |
| Feedback Efficient Online Fine-Tuning of Diffusion Models | Feb 26, 2024 | reinforcement-learning, Reinforcement Learning | Code Available | 2 |
| Physics-informed active learning for accelerating quantum chemical simulations | Apr 18, 2024 | Active Learning, Uncertainty Quantification | Code Available | 2 |
| Enabling Large Language Models to Generate Text with Citations | May 24, 2023 | Hallucination, Retrieval | Code Available | 2 |