| SPD Learning for Covariance-Based Neuroimaging Analysis: Perspectives, Methods, and Challenges | Apr 26, 2025 | | Code Available | 2 |
| X-Ray: A Sequential 3D Representation For Generation | Apr 22, 2024 | 3D Generation, Object | Code Available | 2 |
| BridgeData V2: A Dataset for Robot Learning at Scale | Aug 24, 2023 | Imitation Learning, Multi-Task Learning | Code Available | 2 |
| Controlled Text Generation via Language Model Arithmetic | Nov 24, 2023 | Language Modeling, Language Modelling | Code Available | 2 |
| Diff-BGM: A Diffusion Model for Video Background Music Generation | May 20, 2024 | Diversity, Music Generation | Code Available | 2 |
| Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent | Jul 31, 2024 | Translation, valid | Code Available | 2 |
| DVMSR: Distillated Vision Mamba for Efficient Super-Resolution | May 5, 2024 | Image Super-Resolution, Long-range modeling | Code Available | 2 |
| Open-Set Domain Adaptation for Semantic Segmentation | May 30, 2024 | Domain Adaptation, Semantic Segmentation | Code Available | 2 |
| EMIFF: Enhanced Multi-scale Image Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection | Feb 23, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| Learning from Committee: Reasoning Distillation from a Mixture of Teachers with Peer-Review | Oct 4, 2024 | Knowledge Distillation, Logical Reasoning | Code Available | 2 |
| GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction | Feb 25, 2024 | 3D Reconstruction, Active 3D Reconstruction | Code Available | 2 |
| R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing | May 27, 2025 | Math | Code Available | 2 |
| MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning | Sep 11, 2023 | Math, Mathematical Reasoning | Code Available | 2 |
| PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking | Jul 27, 2023 | Diversity, Point Tracking | Code Available | 2 |
| ChinaTravel: A Real-World Benchmark for Language Agents in Chinese Travel Planning | Dec 18, 2024 | | Code Available | 2 |
| Decoding speech perception from non-invasive brain recordings | Aug 25, 2022 | Contrastive Learning, EEG | Code Available | 2 |
| Towards Explanation for Unsupervised Graph-Level Representation Learning | May 20, 2022 | Decision Making, Graph Classification | Code Available | 2 |
| Any-step Dynamics Model Improves Future Predictions for Online and Offline Reinforcement Learning | May 27, 2024 | Gym halfcheetah-medium, Gym halfcheetah-medium-expert | Code Available | 2 |
| Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach | Jan 28, 2024 | Image Outpainting | Code Available | 2 |
| Pretraining Codomain Attention Neural Operators for Solving Multiphysics PDEs | Mar 19, 2024 | Few-Shot Learning, Self-Supervised Learning | Code Available | 2 |
| Rank1: Test-Time Compute for Reranking in Information Retrieval | Feb 25, 2025 | Information Retrieval, Instruction Following | Code Available | 2 |
| Generative AI Enables Medical Image Segmentation in Ultra Low-Data Regimes | Aug 30, 2024 | Deep Learning, Image Segmentation | Code Available | 2 |
| FlexMatch: Boosting Semi-Supervised Learning with Curriculum Pseudo Labeling | Oct 15, 2021 | Semi-Supervised Image Classification | Code Available | 2 |
| SuperNOVA: Design Strategies and Opportunities for Interactive Visualization in Computational Notebooks | May 4, 2023 | | Code Available | 2 |
| InstantAvatar: Learning Avatars from Monocular Video in 60 Seconds | Dec 20, 2022 | | Code Available | 2 |
| SelfGNN: Self-Supervised Graph Neural Networks for Sequential Recommendation | May 31, 2024 | Graph Neural Network, Recommendation Systems | Code Available | 2 |
| Large Language Models to Enhance Bayesian Optimization | Feb 6, 2024 | Bayesian Optimization, Few-Shot Learning | Code Available | 2 |
| Fourier Priors-Guided Diffusion for Zero-Shot Joint Low-Light Enhancement and Deblurring | Jan 1, 2024 | Deblurring | Code Available | 2 |
| A Survey on Federated Fine-tuning of Large Language Models | Mar 15, 2025 | Federated Learning, parameter-efficient fine-tuning | Code Available | 2 |
| Large-Scale Data Selection for Instruction Tuning | Mar 3, 2025 | | Code Available | 2 |
| AIN: The Arabic INclusive Large Multimodal Model | Jan 31, 2025 | document understanding, model | Code Available | 2 |
| MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action | Mar 20, 2023 | Multimodal Reasoning, Visual Question Answering | Code Available | 2 |
| Oceanship: A Large-Scale Dataset for Underwater Audio Target Recognition | Jan 4, 2024 | Attribute, Audio Classification | Code Available | 2 |
| Scaling Data-Constrained Language Models | May 25, 2023 | | Code Available | 2 |
| Residual Denoising Diffusion Models | Aug 25, 2023 | Denoising, Diversity | Code Available | 2 |
| Empower Structure-Based Molecule Optimization with Gradient Guided Bayesian Flow Networks | Nov 20, 2024 | Bayesian Inference, Drug Design | Code Available | 2 |
| One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models | Mar 4, 2024 | Adversarial Attack, Adversarial Robustness | Code Available | 2 |
| SchurVINS: Schur Complement-Based Lightweight Visual Inertial Navigation System | Dec 4, 2023 | Computational Efficiency | Code Available | 2 |
| Exploring What Why and How: A Multifaceted Benchmark for Causation Understanding of Video Anomaly | Dec 10, 2024 | | Code Available | 2 |
| MMA: Multi-Modal Adapter for Vision-Language Models | Jan 1, 2024 | Domain Generalization, General Knowledge | Code Available | 2 |
| Blur-aware Spatio-temporal Sparse Transformer for Video Deblurring | Jun 11, 2024 | Deblurring, Optical Flow Estimation | Code Available | 2 |
| Learn From Zoom: Decoupled Supervised Contrastive Learning For WCE Image Classification | Jan 11, 2024 | Contrastive Learning, image-classification | Code Available | 2 |
| Clifford-Steerable Convolutional Neural Networks | Feb 22, 2024 | | Code Available | 2 |
| Optimizing Anytime Reasoning via Budget Relative Policy Optimization | May 19, 2025 | Mathematical Reasoning, Reinforcement Learning (RL) | Code Available | 2 |
| UXsim: An open source macroscopic and mesoscopic traffic simulator in Python -- a technical overview | Sep 29, 2023 | | Code Available | 2 |
| Elysium: Exploring Object-level Perception in Videos via MLLM | Mar 25, 2024 | Object, Object Tracking | Code Available | 2 |
| LGS: A Light-weight 4D Gaussian Splatting for Efficient Surgical Scene Reconstruction | Jun 23, 2024 | | Code Available | 2 |
| Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models | Jun 12, 2024 | Audio captioning, Hallucination | Code Available | 2 |
| Think Twice Before You Act: Enhancing Agent Behavioral Safety with Thought Correction | May 16, 2025 | Contrastive Learning, Safety Alignment | Code Available | 2 |
| Deep Learning and Foundation Models for Weather Prediction: A Survey | Jan 12, 2025 | Deep Learning, Prediction | Code Available | 2 |