| LARGE: Legal Retrieval Augmented Generation Evaluation Tool | Apr 2, 2025 | RAG, Retrieval | Code Available | 2 |
| Luminance-GS: Adapting 3D Gaussian Splatting to Challenging Lighting Conditions with View-Adaptive Curve Adjustment | Apr 2, 2025 | 3DGS, NeRF | Code Available | 2 |
| MCP Safety Audit: LLMs with the Model Context Protocol Allow Major Security Exploits | Apr 2, 2025 | | Code Available | 2 |
| Efficient Federated Learning Tiny Language Models for Mobile Network Feature Prediction | Apr 2, 2025 | Federated Learning | Code Available | 2 |
| An Illusion of Progress? Assessing the Current State of Web Agents | Apr 2, 2025 | | Code Available | 2 |
| shapr: Explaining Machine Learning Models with Conditional Shapley Values in R and Python | Apr 2, 2025 | | Code Available | 2 |
| FortisAVQA and MAVEN: a Benchmark Dataset and Debiasing Framework for Robust Multimodal Reasoning | Apr 1, 2025 | Audio-visual Question Answering, Audio-Visual Question Answering (AVQA) | Code Available | 2 |
| Z1: Efficient Test-time Scaling with Code | Apr 1, 2025 | | Code Available | 2 |
| Learned Image Compression with Dictionary-based Entropy Model | Apr 1, 2025 | Image Compression, model | Code Available | 2 |
| A Decade of Deep Learning for Remote Sensing Spatiotemporal Fusion: Advances, Challenges, and Opportunities | Apr 1, 2025 | | Code Available | 2 |
| Less is More: Efficient Black-box Attribution via Minimal Interpretable Subset Selection | Apr 1, 2025 | | Code Available | 2 |
| CrackSQL: A Hybrid SQL Dialect Translation System Powered by Large Language Models | Apr 1, 2025 | Large Language Model, Translation | Code Available | 2 |
| OpenFACADES: An Open Framework for Architectural Caption and Attribute Data Enrichment via Street View Imagery | Apr 1, 2025 | Attribute | Code Available | 2 |
| Effective Cloud Removal for Remote Sensing Images by an Improved Mean-Reverting Denoising Model with Elucidated Design Space | Mar 31, 2025 | Cloud Removal, Denoising | Code Available | 2 |
| Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs | Mar 31, 2025 | Large Language Model, Video Chaptering | Code Available | 2 |
| Every Painting Awakened: A Training-free Framework for Painting-to-Animation Generation | Mar 31, 2025 | Image to Video Generation | Code Available | 2 |
| Force-Free Molecular Dynamics Through Autoregressive Equivariant Networks | Mar 31, 2025 | Numerical Integration | Code Available | 2 |
| Training-Free Text-Guided Image Editing with Visual Autoregressive Model | Mar 31, 2025 | text-guided-image-editing | Code Available | 2 |
| SALT: A Flexible Semi-Automatic Labeling Tool for General LiDAR Point Clouds with Cross-Scene Adaptability and 4D Consistency | Mar 31, 2025 | Zero-Shot Learning | Code Available | 2 |
| Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead | Mar 31, 2025 | Math, Spatial Reasoning | Code Available | 2 |
| Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 | Mar 31, 2025 | Logical Reasoning, Multiple-choice | Code Available | 2 |
| AutoEval: Autonomous Evaluation of Generalist Robot Manipulation Policies in the Real World | Mar 31, 2025 | Robot Manipulation, Scheduling | Code Available | 2 |
| On-device Sora: Enabling Training-Free Diffusion-based Text-to-Video Generation for Mobile Devices | Mar 31, 2025 | Denoising, Model Optimization | Code Available | 2 |
| A Survey on Test-Time Scaling in Large Language Models: What, How, Where, and How Well? | Mar 31, 2025 | | Code Available | 2 |
| TeleAntiFraud-28k: An Audio-Text Slow-Thinking Dataset for Telecom Fraud Detection | Mar 31, 2025 | Fraud Detection, Large Language Model | Code Available | 2 |
| Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models | Mar 31, 2025 | | Code Available | 2 |
| THEMIS: Towards Practical Intellectual Property Protection for Post-Deployment On-Device Deep Learning Models | Mar 31, 2025 | GPU | Code Available | 2 |
| Rec-R1: Bridging Generative Large Language Models and User-Centric Recommendation Systems via Reinforcement Learning | Mar 31, 2025 | General Reinforcement Learning, Instruction Following | Code Available | 2 |
| Dynamic Parametric Retrieval Augmented Generation for Test-time Knowledge Enhancement | Mar 31, 2025 | Hallucination, RAG | Code Available | 2 |
| Optimal Invariant Bases for Atomistic Machine Learning | Mar 30, 2025 | | Code Available | 2 |
| RARE: Retrieval-Augmented Reasoning Modeling | Mar 30, 2025 | Hallucination, Memorization | Code Available | 2 |
| TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes | Mar 30, 2025 | 2k, Image Generation | Code Available | 2 |
| FastVAR: Linear Visual Autoregressive Modeling via Cached Token Pruning | Mar 30, 2025 | 2k, GPU | Code Available | 2 |
| Graph ODEs and Beyond: A Comprehensive Survey on Integrating Differential Equations with Graph Neural Networks | Mar 29, 2025 | Survey, Traffic Prediction | Code Available | 2 |
| OncoReg: Medical Image Registration for Oncological Challenges | Mar 29, 2025 | Image Registration, Medical Image Registration | Code Available | 2 |
| From Flatland to Space: Teaching Vision-Language Models to Perceive and Reason in 3D | Mar 29, 2025 | Spatial Reasoning | Code Available | 2 |
| Large Self-Supervised Models Bridge the Gap in Domain Adaptive Object Detection | Mar 29, 2025 | object-detection, Object Detection | Code Available | 2 |
| FreeSplat++: Generalizable 3D Gaussian Splatting for Efficient Indoor Scene Reconstruction | Mar 29, 2025 | 3DGS, Indoor Scene Reconstruction | Code Available | 2 |
| CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models | Mar 28, 2025 | GPU, GSM8K | Code Available | 2 |
| Q-Insight: Understanding Image Quality via Visual Reinforcement Learning | Mar 28, 2025 | Descriptive, Image Quality Assessment | Code Available | 2 |
| Unicorn: Text-Only Data Synthesis for Vision Language Model Training | Mar 28, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| Quamba2: A Robust and Scalable Post-training Quantization Framework for Selective State Space Models | Mar 28, 2025 | MMLU, Quantization | Code Available | 2 |
| Learning to Reason for Long-Form Story Generation | Mar 28, 2025 | Form, Math | Code Available | 2 |
| A Survey on Remote Sensing Foundation Models: From Vision to Multimodality | Mar 28, 2025 | Change Detection, Land Cover Classification | Code Available | 2 |
| Harmonizing Visual Representations for Unified Multimodal Understanding and Generation | Mar 27, 2025 | Image Generation, Quantization | Code Available | 2 |
| A Unified Image-Dense Annotation Generation Model for Underwater Scenes | Mar 27, 2025 | Depth Estimation, Prediction | Code Available | 2 |
| Mobile-VideoGPT: Fast and Accurate Video Understanding Language Model | Mar 27, 2025 | EgoSchema, Language Modeling | Code Available | 2 |
| UI-R1: Enhancing Efficient Action Prediction of GUI Agents by Reinforcement Learning | Mar 27, 2025 | Model Optimization, Reinforcement Learning (RL) | Code Available | 2 |
| Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks | Mar 27, 2025 | Imitation Learning, Mathematical Reasoning | Code Available | 2 |
| Datasets for Depression Modeling in Social Media: An Overview | Mar 27, 2025 | | Code Available | 2 |