| Benchmarking Vision, Language, & Action Models in Procedurally Generated, Open Ended Action Environments | May 8, 2025 | Benchmarking, Prompt Engineering | Code Available | 1 |
| Crosslingual Reasoning through Test-Time Scaling | May 8, 2025 | Mathematical Reasoning | Code Available | 1 |
| CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory | May 8, 2025 | Large Language Model, Navigate | Code Available | 1 |
| EquiHGNN: Scalable Rotationally Equivariant Hypergraph Neural Networks | May 8, 2025 | | Code Available | 1 |
| HiBayES: A Hierarchical Bayesian Modeling Framework for AI Evaluation Statistics | May 8, 2025 | parameter estimation, Uncertainty Quantification | Code Available | 1 |
| Physics-Assisted and Topology-Informed Deep Learning for Weather Prediction | May 8, 2025 | Deep Learning, Graph Neural Network | Code Available | 1 |
| Augmented Deep Contexts for Spatially Embedded Video Coding | May 8, 2025 | | Code Available | 1 |
| X-Transfer Attacks: Towards Super Transferable Adversarial Attacks on CLIP | May 8, 2025 | | Code Available | 1 |
| PyTDC: A multimodal machine learning training, evaluation, and inference platform for biomedical foundation models | May 8, 2025 | Benchmarking, Graph Representation Learning | Code Available | 1 |
| Griffin: Towards a Graph-Centric Relational Database Foundation Model | May 8, 2025 | Decoder, Diversity | Code Available | 1 |
| Enhancing Cooperative Multi-Agent Reinforcement Learning with State Modelling and Adversarial Exploration | May 8, 2025 | Deep Reinforcement Learning, Multi-agent Reinforcement Learning | Code Available | 1 |
| scDrugMap: Benchmarking Large Foundation Models for Drug Response Prediction | May 8, 2025 | Benchmarking, Drug Discovery | Code Available | 1 |
| The City that Never Settles: Simulation-based LiDAR Dataset for Long-Term Place Recognition Under Extreme Structural Changes | May 8, 2025 | | Code Available | 1 |
| A Preliminary Study for GPT-4o on Image Restoration | May 8, 2025 | Image Dehazing, Image Generation | Code Available | 1 |
| A Simple Detector with Frame Dynamics is a Strong Tracker | May 8, 2025 | Object, object-detection | Code Available | 1 |
| Hearing and Seeing Through CLIP: A Framework for Self-Supervised Sound Source Localization | May 8, 2025 | Scene Understanding, Sound Source Localization | Code Available | 1 |
| ArrayDPS: Unsupervised Blind Speech Separation with a Diffusion Prior | May 8, 2025 | Room Impulse Response (RIR), Speech Separation | Code Available | 1 |
| Adaptive Markup Language Generation for Contextually-Grounded Visual Document Understanding | May 8, 2025 | document understanding, Instruction Following | Code Available | 1 |
| UncertainSAM: Fast and Efficient Uncertainty Quantification of the Segment Anything Model | May 8, 2025 | Semantic Segmentation, Uncertainty Quantification | Code Available | 1 |
| KG-HTC: Integrating Knowledge Graphs into LLMs for Effective Zero-shot Hierarchical Text Classification | May 8, 2025 | Knowledge Graphs, RAG | Code Available | 1 |
| Scalable Chain of Thoughts via Elastic Reasoning | May 8, 2025 | | Code Available | 1 |
| FilterTS: Comprehensive Frequency Filtering for Multivariate Time Series Forecasting | May 7, 2025 | Computational Efficiency, Multivariate Time Series Forecasting | Code Available | 1 |
| VideoPath-LLaVA: Pathology Diagnostic Reasoning Through Video Instruction Tuning | May 7, 2025 | Decision Making, Diagnostic | Code Available | 1 |
| TS-Diff: Two-Stage Diffusion Model for Low-Light RAW Image Enhancement | May 7, 2025 | Denoising, Image Enhancement | Code Available | 1 |
| Componential Prompt-Knowledge Alignment for Domain Incremental Learning | May 7, 2025 | Incremental Learning, Transfer Learning | Code Available | 1 |
| Histo-Miner: Deep Learning based Tissue Features Extraction Pipeline from H&E Whole Slide Images of Cutaneous Squamous Cell Carcinoma | May 7, 2025 | Segmentation, whole slide images | Code Available | 1 |
| TrajEvo: Designing Trajectory Prediction Heuristics via LLM-driven Evolution | May 7, 2025 | Diversity, Prediction | Code Available | 1 |
| WDMamba: When Wavelet Degradation Prior Meets Vision Mamba for Image Dehazing | May 7, 2025 | Image Dehazing, Mamba | Code Available | 1 |
| RGB-Event Fusion with Self-Attention for Collision Prediction | May 7, 2025 | Benchmarking, Computational Efficiency | Code Available | 1 |
| EvEnhancer: Empowering Effectiveness, Efficiency and Generalizability for Continuous Space-Time Video Super-Resolution with Events | May 7, 2025 | Space-time Video Super-resolution, Super-Resolution | Code Available | 1 |
| Lightweight RGB-D Salient Object Detection from a Speed-Accuracy Tradeoff Perspective | May 7, 2025 | object-detection, Object Detection | Code Available | 1 |
| LLAMAPIE: Proactive In-Ear Conversation Assistants | May 7, 2025 | | Code Available | 1 |
| Retrieval Augmented Time Series Forecasting | May 7, 2025 | Retrieval, Time Series | Code Available | 1 |
| Registration of 3D Point Sets Using Exponential-based Similarity Matrix | May 7, 2025 | Point Cloud Registration | Code Available | 1 |
| Image Restoration via Multi-domain Learning | May 7, 2025 | Cloud Removal, Deblurring | Code Available | 1 |
| ABKD: Pursuing a Proper Allocation of the Probability Mass in Knowledge Distillation via α-β-Divergence | May 7, 2025 | Knowledge Distillation | Code Available | 1 |
| Nature's Insight: A Novel Framework and Comprehensive Analysis of Agentic Reasoning Through the Lens of Neuroscience | May 7, 2025 | | Code Available | 1 |
| Reward-SQL: Boosting Text-to-SQL via Stepwise Reasoning and Process-Supervised Rewards | May 7, 2025 | Text to SQL, Text-To-SQL | Code Available | 1 |
| Vision Graph Prompting via Semantic Low-Rank Decomposition | May 7, 2025 | parameter-efficient fine-tuning, Visual Prompting | Code Available | 1 |
| Benchmarking LLMs' Swarm intelligence | May 7, 2025 | Benchmarking | Code Available | 1 |
| DFVO: Learning Darkness-free Visible and Infrared Image Disentanglement and Fusion All at Once | May 7, 2025 | All, Autonomous Driving | Code Available | 1 |
| Object-Shot Enhanced Grounding Network for Egocentric Video | May 7, 2025 | Video Grounding | Code Available | 1 |
| GAPrompt: Geometry-Aware Point Cloud Prompt for 3D Vision Model | May 7, 2025 | parameter-efficient fine-tuning | Code Available | 1 |
| Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards | May 7, 2025 | Benchmarking, Hallucination | Code Available | 1 |
| Token Communication-Driven Multimodal Large Models in Resource-Constrained Multiuser Networks | May 6, 2025 | | Code Available | 1 |
| Learning-based Homothetic Tube MPC | May 6, 2025 | Model Predictive Control | Code Available | 1 |
| WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch | May 6, 2025 | | Code Available | 1 |
| IndicSQuAD: A Comprehensive Multilingual Question Answering Dataset for Indic Languages | May 6, 2025 | Question Answering | Code Available | 1 |
| OSUniverse: Benchmark for Multimodal GUI-navigation AI Agents | May 6, 2025 | | Code Available | 1 |
| 1st Place Solution of WWW 2025 EReL@MIR Workshop Multimodal CTR Prediction Challenge | May 6, 2025 | Click-Through Rate Prediction, Recommendation Systems | Code Available | 1 |