| VLABench: A Large-Scale Benchmark for Language-Conditioned Robotics Manipulation with Long-Horizon Reasoning Tasks | Dec 24, 2024 | Common Sense Reasoning, Transfer Learning | Unverified | 0 |
| An Automatic Graph Construction Framework based on Large Language Models for Recommendation | Dec 24, 2024 | graph construction, Quantization | Code Available | 1 |
| Knowledge Editing through Chain-of-Thought | Dec 23, 2024 | knowledge editing, World Knowledge | Code Available | 1 |
| Interweaving Memories of a Siamese Large Language Model | Dec 23, 2024 | Language Modeling, Language Modelling | Code Available | 0 |
| Beyond Partisan Leaning: A Comparative Analysis of Political Bias in Large Language Models | Dec 21, 2024 | World Knowledge | Unverified | 0 |
| Logical Consistency of Large Language Models in Fact-checking | Dec 20, 2024 | Fact Checking, Hallucination | Unverified | 0 |
| Fietje: An open, efficient LLM for Dutch | Dec 19, 2024 | Linguistic Acceptability, Sentiment Analysis | Code Available | 2 |
| GraphEQA: Using 3D Semantic Scene Graphs for Real-time Embodied Question Answering | Dec 19, 2024 | Efficient Exploration, Embodied Question Answering | Unverified | 0 |
| MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark | Dec 19, 2024 | MMLU, Multiple-choice | Code Available | 2 |
| Bridging the User-side Knowledge Gap in Knowledge-aware Recommendations with Large Language Models | Dec 18, 2024 | Contrastive Learning, Knowledge Graphs | Code Available | 1 |
| AntiLeak-Bench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge | Dec 18, 2024 | Benchmarking, World Knowledge | Code Available | 0 |
| MetaMorph: Multimodal Understanding and Generation via Instruction Tuning | Dec 18, 2024 | Instruction Following, MORPH | Unverified | 0 |
| HandsOnVLM: Vision-Language Models for Hand-Object Interaction Prediction | Dec 17, 2024 | Prediction, Trajectory Prediction | Unverified | 0 |
| QUENCH: Measuring the gap between Indic and Non-Indic Contextual General Reasoning in LLMs | Dec 16, 2024 | Benchmarking, Common Sense Reasoning | Code Available | 0 |
| GaGA: Towards Interactive Global Geolocation Assistant | Dec 12, 2024 | World Knowledge | Unverified | 0 |
| AltFS: Agency-light Feature Selection with Large Language Models in Deep Recommender Systems | Dec 11, 2024 | Feature Importance, feature selection | Unverified | 0 |
| Adapting to Non-Stationary Environments: Multi-Armed Bandit Enhanced Retrieval-Augmented Generation on Knowledge Graphs | Dec 10, 2024 | Knowledge Graphs, RAG | Code Available | 1 |
| Balancing Efficiency and Effectiveness: An LLM-Infused Approach for Optimized CTR Prediction | Dec 9, 2024 | Click-Through Rate Prediction, World Knowledge | Unverified | 0 |
| Exploring Critical Testing Scenarios for Decision-Making Policies: An LLM Approach | Dec 9, 2024 | Autonomous Driving, Decision Making | Unverified | 0 |
| World knowledge-enhanced Reasoning Using Instruction-guided Interactor in Autonomous Driving | Dec 9, 2024 | Autonomous Driving, World Knowledge | Unverified | 0 |
| I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token | Dec 9, 2024 | World Knowledge | Code Available | 1 |
| Retrieval-Augmented Machine Translation with Unstructured Knowledge | Dec 5, 2024 | Knowledge Graphs, Machine Translation | Code Available | 1 |
| A surprisal oracle for when every layer counts | Dec 4, 2024 | Common Sense Reasoning, Language Modeling | Code Available | 0 |
| SeqAfford: Sequential 3D Affordance Reasoning via Multimodal Large Language Model | Dec 2, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Realistic Corner Case Generation for Autonomous Vehicles with Multimodal Large Language Model | Nov 29, 2024 | Autonomous Vehicles, Language Modeling | Unverified | 0 |