| EventVAD: Training-Free Event-Aware Video Anomaly Detection | Apr 17, 2025 | Anomaly Detection, Boundary Detection | Unverified | 0 |
| Can GPT tell us why these images are synthesized? Empowering Multimodal Large Language Models for Forensics | Apr 16, 2025 | Few-Shot Learning, Image Manipulation | Unverified | 0 |
| Rethinking LLM-Based Recommendations: A Query Generation-Based, Training-Free Approach | Apr 16, 2025 | Diversity, Language Modeling | Unverified | 0 |
| Coding-Prior Guided Diffusion Network for Video Deblurring | Apr 16, 2025 | Deblurring, Video Deblurring | Unverified | 0 |
| ARise: Towards Knowledge-Augmented Reasoning via Risk-Adaptive Search | Apr 15, 2025 | RAG, Retrieval-augmented Generation | Unverified | 0 |
| Enhancing LLM-based Recommendation through Semantic-Aligned Collaborative Knowledge | Apr 14, 2025 | Collaborative Filtering, Transfer Learning | Unverified | 0 |
| Large Language Model Empowered Recommendation Meets All-domain Continual Pre-Training | Apr 11, 2025 | All, Language Modeling | Unverified | 0 |
| ConceptFormer: Towards Efficient Use of Knowledge-Graph Embeddings in Large Language Models | Apr 10, 2025 | Knowledge Graph Embeddings, Knowledge Graphs | Unverified | 0 |
| DiffusionCom: Structure-Aware Multimodal Diffusion Model for Multimodal Knowledge Graph Completion | Apr 9, 2025 | Graph Attention, Knowledge Graph Completion | Unverified | 0 |
| Have we unified image generation and understanding yet? An empirical study of GPT-4o's image generation ability | Apr 9, 2025 | Image Generation, multimodal generation | Unverified | 0 |
| Large Language Models Enhanced Hyperbolic Space Recommender Systems | Apr 8, 2025 | Contrastive Learning, Recommendation Systems | Unverified | 0 |
| Memory-Modular Classification: Learning to Generalize with Memory Replacement | Apr 8, 2025 | Classification, image-classification | Code Available | 0 |
| User Feedback Alignment for LLM-powered Exploration in Large-scale Recommendation Systems | Apr 7, 2025 | Diversity, Recommendation Systems | Unverified | 0 |
| RS-RAG: Bridging Remote Sensing Imagery and Comprehensive Knowledge with a Multi-Modal Dataset and Retrieval-Augmented Generation Model | Apr 7, 2025 | Image Captioning, image-classification | Unverified | 0 |
| Adaptive Elicitation of Latent Information Using Natural Language | Apr 5, 2025 | Uncertainty Quantification, World Knowledge | Unverified | 0 |
| Knowledge Graph Completion with Mixed Geometry Tensor Factorization | Apr 3, 2025 | Knowledge Graph Completion, Knowledge Graphs | Code Available | 0 |
| GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation | Apr 3, 2025 | Image Generation, World Knowledge | Code Available | 3 |
| F-ViTA: Foundation Model Guided Visible to Thermal Translation | Apr 3, 2025 | Scene Understanding, Style Transfer | Code Available | 1 |
| OnRL-RAG: Real-Time Personalized Mental Health Dialogue System | Apr 2, 2025 | RAG, Retrieval | Unverified | 0 |
| A Diffusion-Based Framework for Occluded Object Movement | Apr 2, 2025 | Object, World Knowledge | Unverified | 0 |
| Generative Retrieval and Alignment Model: A New Paradigm for E-commerce Retrieval | Apr 2, 2025 | General Knowledge, Retrieval | Unverified | 0 |
| Synthetic-to-Real Self-supervised Robust Depth Estimation via Learning with Motion and Structure Priors | Mar 26, 2025 | Depth Estimation, World Knowledge | Code Available | 1 |
| LLM-based Agent Simulation for Maternal Health Interventions: Uncertainty Estimation and Decision-focused Evaluation | Mar 25, 2025 | counterfactual, Decision Making | Code Available | 0 |
| Test-Time Reasoning Through Visual Human Preferences with VLMs and Soft Rewards | Mar 25, 2025 | World Knowledge | Unverified | 0 |
| Human-Object Interaction with Vision-Language Model Guided Relative Movement Dynamics | Mar 24, 2025 | Human-Object Interaction Detection, Language Modeling | Unverified | 0 |
| Instructing the Architecture Search for Spatial-temporal Sequence Forecasting with LLM | Mar 23, 2025 | Neural Architecture Search, Prompt Engineering | Unverified | 0 |
| A Study into Investigating Temporal Robustness of LLMs | Mar 21, 2025 | Question Answering, World Knowledge | Unverified | 0 |
| Advancing Problem-Based Learning in Biomedical Engineering in the Era of Generative AI | Mar 20, 2025 | World Knowledge | Unverified | 0 |
| World Knowledge from AI Image Generation for Robot Control | Mar 20, 2025 | Image Generation, World Knowledge | Unverified | 0 |
| JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse | Mar 20, 2025 | Decision Making, Imitation Learning | Unverified | 0 |
| Exploiting Diffusion Prior for Real-World Image Dehazing with Unpaired Training | Mar 19, 2025 | Image Dehazing, World Knowledge | Code Available | 1 |
| FusDreamer: Label-efficient Remote Sensing World Model for Multimodal Data Classification | Mar 18, 2025 | Combinatorial Optimization, Contrastive Learning | Code Available | 1 |
| Impossible Videos | Mar 18, 2025 | counterfactual, Video Generation | Unverified | 0 |
| A Multi-Stage Framework with Taxonomy-Guided Reasoning for Occupation Classification Using Large Language Models | Mar 17, 2025 | Classification, In-Context Learning | Unverified | 0 |
| Free-form language-based robotic reasoning and grasping | Mar 17, 2025 | Form, Robotic Grasping | Code Available | 2 |
| A Framework for a Capability-driven Evaluation of Scenario Understanding for Multimodal Large Language Models in Autonomous Driving | Mar 14, 2025 | Autonomous Driving, Decision Making | Unverified | 0 |
| Who Relies More on World Knowledge and Bias for Syntactic Ambiguity Resolution: Humans or LLMs? | Mar 13, 2025 | Navigate, World Knowledge | Code Available | 0 |
| SySLLM: Generating Synthesized Policy Summaries for Reinforcement Learning Agents Using Large Language Models | Mar 13, 2025 | Reinforcement Learning (RL), World Knowledge | Unverified | 0 |
| LREF: A Novel LLM-based Relevance Framework for E-commerce | Mar 12, 2025 | World Knowledge | Unverified | 0 |
| WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation | Mar 10, 2025 | Common Sense Reasoning, Image Generation | Code Available | 4 |
| PointVLA: Injecting the 3D World into Vision-Language-Action Models | Mar 10, 2025 | Imitation Learning, Spatial Reasoning | Code Available | 4 |
| The Society of HiveMind: Multi-Agent Optimization of Foundation Model Swarms to Unlock the Potential of Collective Intelligence | Mar 7, 2025 | Logical Reasoning, World Knowledge | Unverified | 0 |
| Effective LLM Knowledge Learning via Model Generalization | Mar 5, 2025 | Data Augmentation, model | Unverified | 0 |
| From Language to Cognition: How LLMs Outgrow the Human Language Network | Mar 3, 2025 | World Knowledge | Unverified | 0 |
| Can Large Language Models Help Experimental Design for Causal Discovery? | Mar 3, 2025 | Causal Discovery, Experimental Design | Unverified | 0 |
| 3D-AffordanceLLM: Harnessing Large Language Models for Open-Vocabulary Affordance Detection in 3D Worlds | Feb 27, 2025 | Affordance Detection, Human-Object Interaction Detection | Unverified | 0 |
| FaithUn: Toward Faithful Forgetting in Language Models by Investigating the Interconnectedness of Knowledge | Feb 26, 2025 | World Knowledge | Unverified | 0 |
| Data-Efficient Multi-Agent Spatial Planning with LLMs | Feb 26, 2025 | Decision Making, World Knowledge | Unverified | 0 |
| BottleHumor: Self-Informed Humor Explanation using the Information Bottleneck Principle | Feb 22, 2025 | World Knowledge | Code Available | 0 |
| LLM4Tag: Automatic Tagging System for Information Retrieval via Large Language Models | Feb 19, 2025 | Information Retrieval, Recommendation Systems | Unverified | 0 |