| SpaCE-10: A Comprehensive Benchmark for Multimodal Large Language Models in Compositional Spatial Intelligence | Jun 9, 2025 | | Code Available | 1 |
| Premise Selection for a Lean Hammer | Jun 9, 2025 | | Code Available | 1 |
| WeThink: Toward General-purpose Vision-Language Reasoning via Reinforcement Learning | Jun 9, 2025 | Math, Mathematical Reasoning | Code Available | 1 |
| SEED: Enhancing Text-to-SQL Performance and Practical Usability Through Automatic Evidence Generation | Jun 9, 2025 | Natural Language Queries, Text to SQL | Code Available | 1 |
| FreeGave: 3D Physics Learning from Dynamic Videos by Gaussian Velocity | Jun 9, 2025 | Motion Segmentation | Code Available | 1 |
| MedChat: A Multi-Agent Framework for Multimodal Diagnosis with Large Language Models | Jun 9, 2025 | Diagnostic, Hallucination | Code Available | 1 |
| LogoSP: Local-global Grouping of Superpoints for Unsupervised Semantic Segmentation of 3D Point Clouds | Jun 9, 2025 | 3D Semantic Segmentation, Segmentation | Code Available | 1 |
| Diffuse Everything: Multimodal Diffusion Models on Arbitrary State Spaces | Jun 9, 2025 | Image Generation, Text Generation | Code Available | 1 |
| From Debate to Equilibrium: Belief-Driven Multi-Agent LLM Reasoning via Bayesian Nash Equilibrium | Jun 9, 2025 | Hierarchical Reinforcement Learning | Code Available | 1 |
| Adversarial Paraphrasing: A Universal Attack for Humanizing AI-Generated Text | Jun 8, 2025 | Instruction Following | Code Available | 1 |
| Certified Unlearning for Neural Networks | Jun 8, 2025 | Machine Unlearning | Code Available | 1 |
| Learning Compact Vision Tokens for Efficient Large Multimodal Models | Jun 8, 2025 | Multimodal Reasoning, Token Reduction | Code Available | 1 |
| Towards Universal Offline Black-Box Optimization via Learning Language Model Embeddings | Jun 8, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| AlphaSteer: Learning Refusal Steering with Principled Null-Space Constraint | Jun 8, 2025 | | Code Available | 1 |
| Multi-Step Visual Reasoning with Visual Tokens Scaling and Verification | Jun 8, 2025 | Question Answering, Visual Question Answering | Code Available | 1 |
| SAFE: Finding Sparse and Flat Minima to Improve Pruning | Jun 7, 2025 | image-classification, Image Classification | Code Available | 1 |
| Depth-Optimal Quantum Layout Synthesis as SAT | Jun 7, 2025 | | Code Available | 1 |
| DAM: Dynamic Attention Mask for Long-Context Large Language Model Inference Acceleration | Jun 6, 2025 | Computational Efficiency, Language Modeling | Code Available | 1 |
| STSBench: A Spatio-temporal Scenario Benchmark for Multi-modal Large Language Models in Autonomous Driving | Jun 6, 2025 | Autonomous Driving, Autonomous Vehicles | Code Available | 1 |
| Eigenspectrum Analysis of Neural Networks without Aspect Ratio Bias | Jun 6, 2025 | image-classification, Image Classification | Code Available | 1 |
| 3DFlowAction: Learning Cross-Embodiment Manipulation from 3D Flow World Model | Jun 6, 2025 | Optical Flow Estimation, Robot Manipulation | Code Available | 1 |
| LETS Forecast: Learning Embedology for Time Series Forecasting | Jun 6, 2025 | Future prediction, Time Series | Code Available | 1 |
| Revealing hidden correlations from complex spatial distributions: Adjacent Correlation Analysis | Jun 6, 2025 | | Code Available | 1 |
| Mapping correlations and coherence: adjacency-based approach to data visualization and regularity discovery | Jun 6, 2025 | Data Visualization | Code Available | 1 |
| SDS-Net: Shallow-Deep Synergism-detection Network for infrared small target detection | Jun 6, 2025 | Computational Efficiency | Code Available | 1 |
| Towards an Explainable Comparison and Alignment of Feature Embeddings | Jun 6, 2025 | | Code Available | 1 |
| FinanceReasoning: Benchmarking Financial Numerical Reasoning More Credible, Comprehensive and Challenging | Jun 6, 2025 | Benchmarking | Code Available | 1 |
| Dynamic Mixture of Progressive Parameter-Efficient Expert Library for Lifelong Robot Learning | Jun 6, 2025 | Lifelong learning, parameter-efficient fine-tuning | Code Available | 1 |
| DesignBench: A Comprehensive Benchmark for MLLM-based Front-end Code Generation | Jun 6, 2025 | Code Generation | Code Available | 1 |
| Topology of Reasoning: Understanding Large Reasoning Models through Reasoning Graph Properties | Jun 6, 2025 | GSM8K | Code Available | 1 |
| KramaBench: A Benchmark for AI Systems on Data-to-Insight Pipelines over Data Lakes | Jun 6, 2025 | Code Generation, Data Integration | Code Available | 1 |
| NILMFormer: Non-Intrusive Load Monitoring that Accounts for Non-Stationarity | Jun 6, 2025 | Non-Intrusive Load Monitoring | Code Available | 1 |
| Joint-GCG: Unified Gradient-Based Poisoning Attacks on Retrieval-Augmented Generation Systems | Jun 6, 2025 | RAG, Retrieval | Code Available | 1 |
| Token Signature: Predicting Chain-of-Thought Gains with Token Decoding Feature in Large Language Models | Jun 6, 2025 | | Code Available | 1 |
| FADE: Frequency-Aware Diffusion Model Factorization for Video Editing | Jun 6, 2025 | Video Editing | Code Available | 1 |
| AMPED: Adaptive Multi-objective Projection for balancing Exploration and skill Diversification | Jun 6, 2025 | Diversity | Code Available | 1 |
| Unleashing the Potential of Consistency Learning for Detecting and Grounding Multi-Modal Media Manipulation | Jun 6, 2025 | Decoder | Code Available | 1 |
| Joint Evaluation of Answer and Reasoning Consistency for Hallucination Detection in Large Reasoning Models | Jun 5, 2025 | Diagnostic, Hallucination | Code Available | 1 |
| MTPNet: Multi-Grained Target Perception for Unified Activity Cliff Prediction | Jun 5, 2025 | Drug Discovery, Prediction | Code Available | 1 |
| Progressive Tempering Sampler with Diffusion | Jun 5, 2025 | | Code Available | 1 |
| OGGSplat: Open Gaussian Growing for Generalizable Reconstruction with Expanded Field-of-View | Jun 5, 2025 | 3D Reconstruction | Code Available | 1 |
| Advancing Tool-Augmented Large Language Models via Meta-Verification and Reflection Learning | Jun 5, 2025 | Imitation Learning | Code Available | 1 |
| Diagonal Batching Unlocks Parallelism in Recurrent Memory Transformers for Long Contexts | Jun 5, 2025 | GPU, Scheduling | Code Available | 1 |
| Unfolding Spatial Cognition: Evaluating Multimodal Models on Visual Simulations | Jun 5, 2025 | 4k, Spatial Reasoning | Code Available | 1 |
| OpenMaskDINO3D: Reasoning 3D Segmentation via Large Language Model | Jun 5, 2025 | Instance Segmentation, Language Modeling | Code Available | 1 |
| FEAT: Full-Dimensional Efficient Attention Transformer for Medical Video Generation | Jun 5, 2025 | Denoising, Video Generation | Code Available | 1 |
| Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification | Jun 5, 2025 | Automated Theorem Proving, Hallucination | Code Available | 1 |
| Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay | Jun 5, 2025 | Reinforcement Learning (RL) | Code Available | 1 |
| Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia Games | Jun 5, 2025 | Action Generation, Asynchronous Group Communication | Code Available | 1 |
| Tuning the Right Foundation Models is What you Need for Partial Label Learning | Jun 5, 2025 | Model Selection, Partial Label Learning | Code Available | 1 |