| Breaking the Data Barrier -- Building GUI Agents Through Task Generalization | Apr 14, 2025 | Mathematical Reasoning, Multimodal Reasoning | Code Available | 1 |
| BO-SA-PINNs: Self-adaptive physics-informed neural networks based on Bayesian optimization for automatically designing PDE solvers | Apr 14, 2025 | Bayesian Optimization | Code Available | 1 |
| MultiLoKo: a multilingual local knowledge benchmark for LLMs spanning 31 languages | Apr 14, 2025 | Transfer Learning | Code Available | 1 |
| The Jailbreak Tax: How Useful are Your Jailbreak Outputs? | Apr 14, 2025 | Math | Code Available | 1 |
| Beyond Degradation Redundancy: Contrastive Prompt Learning for All-in-One Image Restoration | Apr 14, 2025 | All, Image Restoration | Code Available | 1 |
| LEMUR Neural Network Dataset: Towards Seamless AutoML | Apr 14, 2025 | AutoML, Benchmarking | Code Available | 1 |
| Unveiling Contrastive Learning's Capability of Neighborhood Aggregation for Collaborative Filtering | Apr 14, 2025 | Collaborative Filtering, Contrastive Learning | Code Available | 1 |
| ReasonDrive: Efficient Visual Question Answering for Autonomous Vehicles with Reasoning-Enhanced Small Vision-Language Models | Apr 14, 2025 | Autonomous Driving, Autonomous Vehicles | Code Available | 1 |
| Invariance Matters: Empowering Social Recommendation via Graph Invariant Learning | Apr 14, 2025 | Denoising, Recommendation Systems | Code Available | 1 |
| Hearing Anywhere in Any Environment | Apr 14, 2025 | Mixed Reality, Room Impulse Response (RIR) | Code Available | 1 |
| SilVar-Med: A Speech-Driven Visual Language Model for Explainable Abnormality Detection in Medical Imaging | Apr 14, 2025 | Anomaly Detection, Diagnostic | Code Available | 1 |
| Omni-Dish: Photorealistic and Faithful Image Generation and Editing for Arbitrary Chinese Dishes | Apr 14, 2025 | Image Generation, Large Language Model | Code Available | 1 |
| GenTe: Generative Real-world Terrains for General Legged Robot Locomotion Control | Apr 14, 2025 | | Code Available | 1 |
| Enhanced Semantic Extraction and Guidance for UGC Image Super Resolution | Apr 14, 2025 | Image Super-Resolution, Super-Resolution | Code Available | 1 |
| The Mirage of Performance Gains: Why Contrastive Decoding Fails to Address Multimodal Hallucination | Apr 14, 2025 | Hallucination | Code Available | 1 |
| GeoUni: A Unified Model for Generating Geometry Diagrams, Problems and Problem Solutions | Apr 14, 2025 | Image Generation | Code Available | 1 |
| Better Estimation of the KL Divergence Between Language Models | Apr 14, 2025 | Knowledge Distillation | Code Available | 1 |
| SemiETS: Integrating Spatial and Content Consistencies for Semi-Supervised End-to-end Text Spotting | Apr 14, 2025 | Domain Adaptation, Text Detection | Code Available | 1 |
| CHARM: Calibrating Reward Models With Chatbot Arena Scores | Apr 14, 2025 | Chatbot | Code Available | 1 |
| DUMP: Automated Distribution-Level Curriculum Learning for RL-based LLM Post-training | Apr 13, 2025 | Reinforcement Learning (RL) | Code Available | 1 |
| Can LLM feedback enhance review quality? A randomized study of 20K reviews at ICLR 2025 | Apr 13, 2025 | | Code Available | 1 |
| Uncertainty Guided Refinement for Fine-Grained Salient Object Detection | Apr 13, 2025 | object-detection, Object Detection | Code Available | 1 |
| A Survey on Efficient Vision-Language Models | Apr 13, 2025 | Image Captioning, Question Answering | Code Available | 1 |
| GRPO-LEAD: A Difficulty-Aware Reinforcement Learning Approach for Concise Mathematical Reasoning in Language Models | Apr 13, 2025 | Mathematical Reasoning | Code Available | 1 |
| Rethinking the generalization of drug target affinity prediction algorithms via similarity aware evaluation | Apr 13, 2025 | Drug Discovery | Code Available | 1 |
| AdaSteer: Your Aligned LLM is Inherently an Adaptive Jailbreak Defender | Apr 13, 2025 | Safety Alignment | Code Available | 1 |
| SPICE: A Synergistic, Precise, Iterative, and Customizable Image Editing Workflow | Apr 13, 2025 | | Code Available | 1 |
| EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety | Apr 13, 2025 | | Code Available | 1 |
| Integrating Textual Embeddings from Contrastive Learning with Generative Recommender for Enhanced Personalization | Apr 13, 2025 | Contrastive Learning, Recommendation Systems | Code Available | 1 |
| Fine-tuning a Large Language Model for Automating Computational Fluid Dynamics Simulations | Apr 13, 2025 | Computational Efficiency, Language Modeling | Code Available | 1 |
| CMCRD: Cross-Modal Contrastive Representation Distillation for Emotion Recognition | Apr 12, 2025 | EEG, Emotion Recognition | Code Available | 1 |
| BioChemInsight: An Open-Source Toolkit for Automated Identification and Recognition of Optical Chemical Structures and Activity Data in Scientific Publications | Apr 12, 2025 | Articles, Drug Design | Code Available | 1 |
| NetTAG: A Multimodal RTL-and-Layout-Aligned Netlist Foundation Model via Text-Attributed Graph | Apr 12, 2025 | Graph Learning, Representation Learning | Code Available | 1 |
| Beyond Degradation Conditions: All-in-One Image Restoration via HOG Transformers | Apr 12, 2025 | All, Image Restoration | Code Available | 1 |
| Speculative Thinking: Enhancing Small-Model Reasoning with Large Model Guidance at Inference Time | Apr 12, 2025 | model | Code Available | 1 |
| Parameterized Synthetic Text Generation with SimpleStories | Apr 12, 2025 | Diversity, Language Modeling | Code Available | 1 |
| Pneuma: Leveraging LLMs for Tabular Data Representation and Retrieval in an End-to-End System | Apr 12, 2025 | Information Retrieval, RAG | Code Available | 1 |
| RT-DATR: Real-time Unsupervised Domain Adaptive Detection Transformer with Adversarial Feature Learning | Apr 12, 2025 | Domain Adaptation, Domain Generalization | Code Available | 1 |
| On Oversquashing in Graph Neural Networks Through the Lens of Dynamical Systems | Apr 11, 2025 | | Code Available | 1 |
| SN-LiDAR: Semantic Neural Fields for Novel Space-time View LiDAR Synthesis | Apr 11, 2025 | Autonomous Driving, Novel View Synthesis | Code Available | 1 |
| Mimic In-Context Learning for Multimodal Tasks | Apr 11, 2025 | In-Context Learning, Visual Question Answering (VQA) | Code Available | 1 |
| PMNI: Pose-free Multi-view Normal Integration for Reflective and Textureless Surface Reconstruction | Apr 11, 2025 | Surface Reconstruction | Code Available | 1 |
| F^3Set: Towards Analyzing Fast, Frequent, and Fine-grained Events from Videos | Apr 11, 2025 | Action Understanding, Event Detection | Code Available | 1 |
| Task Memory Engine (TME): A Structured Memory Framework with Graph-Aware Extensions for Multi-Step LLM Agent Tasks | Apr 11, 2025 | | Code Available | 1 |
| Towards generalizable single-cell perturbation modeling via the Conditional Monge Gap | Apr 11, 2025 | | Code Available | 1 |
| Single View Garment Reconstruction Using Diffusion Mapping Via Pattern Coordinates | Apr 11, 2025 | 3D geometry, Garment Reconstruction | Code Available | 1 |
| MooseAgent: A LLM Based Multi-agent Framework for Automating Moose Simulation | Apr 11, 2025 | | Code Available | 1 |
| Boosting the Class-Incremental Learning in 3D Point Clouds via Zero-Collection-Cost Basic Shape Pre-Training | Apr 11, 2025 | 3D geometry, class-incremental learning | Code Available | 1 |
| LMM4LMM: Benchmarking and Evaluating Large-multimodal Image Generation with LMMs | Apr 11, 2025 | Benchmarking, Image Generation | Code Available | 1 |
| Latent Diffusion Autoencoders: Toward Efficient and Meaningful Unsupervised Representation Learning in Medical Imaging | Apr 11, 2025 | Attribute, Computational Efficiency | Code Available | 1 |