| Improving Medical Reasoning with Curriculum-Aware Reinforcement Learning | May 25, 2025 | Out-of-Distribution Generalization, reinforcement-learning | Unverified | 0 |
| DriveX: Omni Scene Modeling for Learning Generalizable World Knowledge in Autonomous Driving | May 25, 2025 | Autonomous Driving, Image Generation | Unverified | 0 |
| Alchemist: Turning Public Text-to-Image Data into Generative Gold | May 25, 2025 | World Knowledge | Unverified | 0 |
| GRE Suite: Geo-localization Inference via Fine-Tuned Vision-Language Models and Enhanced Reasoning Chains | May 24, 2025 | geo-localization, Visual Reasoning | Code Available | 1 |
| Align Beyond Prompts: Evaluating World Knowledge Alignment in Text-to-Image Generation | May 24, 2025 | Image Generation, Text to Image Generation | Code Available | 0 |
| Do BERT-Like Bidirectional Models Still Perform Better on Text Classification in the Era of LLMs? | May 23, 2025 | text-classification, Text Classification | Unverified | 0 |
| DeepRec: Towards a Deep Dive Into the Item Space with Large Language Model Based Recommendation | May 22, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| O^2-Searcher: A Searching-based Agent Model for Open-Domain Open-Ended Question Answering | May 22, 2025 | Answer Generation, Open-Ended Question Answering | Code Available | 1 |
| TimeCausality: Evaluating the Causal Ability in Time Dimension for Vision Language Models | May 21, 2025 | Human Aging, Question Answering | Code Available | 0 |
| Robo2VLM: Visual Question Answering from Large-Scale In-the-Wild Robot Manipulation Datasets | May 21, 2025 | Dataset Generation, Descriptive | Unverified | 0 |
| UniErase: Unlearning Token as a Universal Erasure Primitive for Language Models | May 21, 2025 | Machine Unlearning, Model Editing | Code Available | 0 |
| Table Foundation Models: on knowledge pre-training for tabular learning | May 20, 2025 | World Knowledge | Unverified | 0 |
| Bidirectional LMs are Better Knowledge Memorizers? A Benchmark for Real-world Knowledge Injection | May 18, 2025 | Memorization, World Knowledge | Code Available | 0 |
| Benchmarking Spatiotemporal Reasoning in LLMs and Reasoning Models: Capabilities and Challenges | May 16, 2025 | Benchmarking, State Estimation | Code Available | 0 |
| Who You Are Matters: Bridging Topics and Social Roles via LLM-Enhanced Logical Recommendation | May 16, 2025 | General Knowledge, Large Language Model | Unverified | 0 |
| LODGE: Joint Hierarchical Task Planning and Learning of Domain Models with Grounded Execution | May 15, 2025 | Robot Manipulation, Task Planning | Unverified | 0 |
| LLM4CD: Leveraging Large Language Models for Open-World Knowledge Augmented Cognitive Diagnosis | May 14, 2025 | cognitive diagnosis, World Knowledge | Code Available | 0 |
| Enhancing Cache-Augmented Generation (CAG) with Adaptive Contextual Compression for Scalable Knowledge Integration | May 13, 2025 | RAG, Retrieval | Unverified | 0 |
| Advancing and Benchmarking Personalized Tool Invocation for LLMs | May 7, 2025 | Benchmarking, World Knowledge | Code Available | 0 |
| Evaluating Contrastive Feedback for Effective User Simulations | May 5, 2025 | Information Retrieval, Prompt Engineering | Code Available | 0 |
| WorldGenBench: A World-Knowledge-Integrated Benchmark for Reasoning-Driven Text-to-Image Generation | May 2, 2025 | Image Generation, Text to Image Generation | Unverified | 0 |
| Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers | Apr 29, 2025 | Data Augmentation, Knowledge Graphs | Unverified | 0 |
| Towards Automated Scoping of AI for Social Good Projects | Apr 28, 2025 | World Knowledge | Unverified | 0 |
| Doxing via the Lens: Revealing Location-related Privacy Leakage on Multi-modal Large Reasoning Models | Apr 27, 2025 | Visual Reasoning, World Knowledge | Unverified | 0 |
| WeatherGen: A Unified Diverse Weather Generator for LiDAR Point Clouds via Spider Mamba Diffusion | Apr 18, 2025 | Contrastive Learning, Denoising | Code Available | 1 |