| RealMath: A Continuous Benchmark for Evaluating Language Models on Research-Level Mathematics | May 18, 2025 | Mathematical Reasoning | Code Available | 1 |
| DragLoRA: Online Optimization of LoRA Adapters for Drag-based Image Editing in Diffusion Model | May 18, 2025 | Computational Efficiency, Denoising | Code Available | 1 |
| Relation Extraction or Pattern Matching? Unravelling the Generalisation Limits of Language Models for Biographical RE | May 18, 2025 | In-Context Learning, Relation | Code Available | 1 |
| Video-GPT via Next Clip Diffusion | May 18, 2025 | Denoising, Image Animation | Code Available | 1 |
| MedAgentBoard: Benchmarking Multi-Agent Collaboration with Conventional Methods for Diverse Medical Tasks | May 18, 2025 | Benchmarking, Medical Visual Question Answering | Code Available | 1 |
| BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs | May 18, 2025 | Logical Reasoning | Code Available | 1 |
| Towards Reliable and Interpretable Traffic Crash Pattern Prediction and Safety Interventions Using Customized Large Language Models | May 18, 2025 | | Code Available | 1 |
| ProMi: An Efficient Prototype-Mixture Baseline for Few-Shot Segmentation with Bounding-Box Annotations | May 18, 2025 | | Code Available | 1 |
| Spectral-Spatial Self-Supervised Learning for Few-Shot Hyperspectral Image Classification | May 18, 2025 | Classification Of Hyperspectral Images, Diversity | Code Available | 1 |
| RoboFAC: A Comprehensive Framework for Robotic Failure Analysis and Correction | May 18, 2025 | Vision-Language-Action | Code Available | 1 |
| Is Artificial Intelligence Generated Image Detection a Solved Problem? | May 18, 2025 | Data Augmentation, Image Generation | Code Available | 1 |
| Towards Visuospatial Cognition via Hierarchical Fusion of Visual Experts | May 18, 2025 | Spatial Reasoning | Code Available | 1 |
| GATES: Cost-aware Dynamic Workflow Scheduling via Graph Attention Networks and Evolution Strategy | May 18, 2025 | Cloud Computing, Deep Reinforcement Learning | Code Available | 1 |
| LLM-DSE: Searching Accelerator Parameters with LLM Agents | May 18, 2025 | High-Level Synthesis | Code Available | 1 |
| LogicOCR: Do Your Large Multimodal Models Excel at Logical Reasoning on Text-Rich Images? | May 18, 2025 | Logical Reasoning, Multimodal Reasoning | Code Available | 1 |
| What are they talking about? Benchmarking Large Language Models for Knowledge-Grounded Discussion Summarization | May 18, 2025 | Benchmarking | Code Available | 1 |
| Visuospatial Cognitive Assistant | May 18, 2025 | Spatial Reasoning | Code Available | 1 |
| Temporal-Spectral-Spatial Unified Remote Sensing Dense Prediction | May 18, 2025 | Change Detection, Prediction | Code Available | 1 |
| Efficient RL Training for Reasoning Models via Length-Aware Optimization | May 18, 2025 | Math | Code Available | 1 |
| Data Whisperer: Efficient Data Selection for Task-Specific LLM Fine-Tuning via Few-Shot In-Context Learning | May 18, 2025 | GSM8K, In-Context Learning | Code Available | 1 |
| Always Clear Depth: Robust Monocular Depth Estimation under Adverse Weather | May 18, 2025 | Autonomous Driving, Depth Estimation | Code Available | 1 |
| MedVKAN: Efficient Feature Extraction with Mamba and KAN for Medical Image Segmentation | May 17, 2025 | Image Segmentation, Mamba | Code Available | 1 |
| Self-Learning Hyperspectral and Multispectral Image Fusion via Adaptive Residual Guided Subspace Diffusion Model | May 17, 2025 | Computational Efficiency, Self-Learning | Code Available | 1 |
| SepPrune: Structured Pruning for Efficient Deep Speech Separation | May 17, 2025 | channel selection, Computational Efficiency | Code Available | 1 |
| Tiny QA Benchmark++: Ultra-Lightweight, Synthetic Multilingual Dataset Generation & Smoke-Tests for Continuous LLM Evaluation | May 17, 2025 | Dataset Generation, GPU | Code Available | 1 |
| ELITE: Embedding-Less retrieval with Iterative Text Exploration | May 17, 2025 | graph construction, RAG | Code Available | 1 |
| LOVE: Benchmarking and Evaluating Text-to-Video Generation and Video-to-Text Interpretation | May 17, 2025 | Benchmarking, Question Answering | Code Available | 1 |
| VeriReason: Reinforcement Learning with Testbench Feedback for Reasoning-Enhanced Verilog Generation | May 17, 2025 | Code Generation | Code Available | 1 |
| HALO: Hierarchical Autonomous Logic-Oriented Orchestration for Multi-Agent LLM Systems | May 17, 2025 | Arithmetic Reasoning, Code Generation | Code Available | 1 |
| ChartEdit: How Far Are MLLMs From Automating Chart Analysis? Evaluating MLLMs' Capability via Chart Editing | May 17, 2025 | Chart Understanding | Code Available | 1 |
| FastCar: Cache Attentive Replay for Fast Auto-Regressive Video Generation on the Edge | May 17, 2025 | Image Generation, Scheduling | Code Available | 1 |
| DC-Seg: Disentangled Contrastive Learning for Brain Tumor Segmentation with Missing Modalities | May 17, 2025 | Brain Tumor Segmentation, Contrastive Learning | Code Available | 1 |
| VenusX: Unlocking Fine-Grained Functional Understanding of Proteins | May 17, 2025 | Binary Classification, Multi-class Classification | Code Available | 1 |
| BINAQUAL: A Full-Reference Objective Localization Similarity Metric for Binaural Audio | May 17, 2025 | | Code Available | 1 |
| Neuro-Symbolic Query Compiler | May 17, 2025 | RAG, Response Generation | Code Available | 1 |
| Multimodal Cancer Survival Analysis via Hypergraph Learning with Cross-Modality Rebalance | May 17, 2025 | Survival Analysis, Survival Prediction | Code Available | 1 |
| InfantAgent-Next: A Multimodal Generalist Agent for Automated Computer Interaction | May 16, 2025 | | Code Available | 1 |
| Finetune-RAG: Fine-Tuning Language Models to Resist Hallucination in Retrieval-Augmented Generation | May 16, 2025 | Hallucination, RAG | Code Available | 1 |
| IRLBench: A Multi-modal, Culturally Grounded, Parallel Irish-English Benchmark for Open-Ended LLM Reasoning Evaluation | May 16, 2025 | Multiple-choice | Code Available | 1 |
| Sample Efficient Reinforcement Learning via Large Vision Language Model Distillation | May 16, 2025 | Decision Making, Language Modeling | Code Available | 1 |
| Accurate KV Cache Quantization with Outlier Tokens Tracing | May 16, 2025 | Quantization | Code Available | 1 |
| Rethinking the Role of Prompting Strategies in LLM Test-Time Scaling: A Perspective of Probability Theory | May 16, 2025 | | Code Available | 1 |
| Breaking the Batch Barrier (B3) of Contrastive Learning via Smart Batch Mining | May 16, 2025 | Community Detection, Contrastive Learning | Code Available | 1 |
| X2C: A Dataset Featuring Nuanced Facial Expressions for Realistic Humanoid Imitation | May 16, 2025 | Diversity | Code Available | 1 |
| msf-CNN: Patch-based Multi-Stage Fusion with Convolutional Neural Networks for TinyML | May 16, 2025 | | Code Available | 1 |
| One Image is Worth a Thousand Words: A Usability Preservable Text-Image Collaborative Erasing Framework | May 16, 2025 | Attribute, Image Generation | Code Available | 1 |
| Unifying Segment Anything in Microscopy with Multimodal Large Language Model | May 16, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| MedCaseReasoning: Evaluating and learning diagnostic reasoning from clinical case reports | May 16, 2025 | Diagnostic, Math | Code Available | 1 |
| DecompileBench: A Comprehensive Benchmark for Evaluating Decompilers in Real-World Scenarios | May 16, 2025 | Malware Analysis | Code Available | 1 |
| PoE-World: Compositional World Modeling with Products of Programmatic Experts | May 16, 2025 | Montezuma's Revenge, Program Synthesis | Code Available | 1 |