| Logical Consistency of Large Language Models in Fact-checking | Dec 20, 2024 | Fact CheckingHallucination | —Unverified | 0 |
| Fietje: An open, efficient LLM for Dutch | Dec 19, 2024 | Linguistic AcceptabilitySentiment Analysis | CodeCode Available | 2 |
| MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark | Dec 19, 2024 | MMLUMultiple-choice | CodeCode Available | 2 |
| GraphEQA: Using 3D Semantic Scene Graphs for Real-time Embodied Question Answering | Dec 19, 2024 | Efficient ExplorationEmbodied Question Answering | —Unverified | 0 |
| MetaMorph: Multimodal Understanding and Generation via Instruction Tuning | Dec 18, 2024 | Instruction FollowingMORPH | —Unverified | 0 |
| Bridging the User-side Knowledge Gap in Knowledge-aware Recommendations with Large Language Models | Dec 18, 2024 | Contrastive LearningKnowledge Graphs | CodeCode Available | 1 |
| AntiLeak-Bench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge | Dec 18, 2024 | BenchmarkingWorld Knowledge | CodeCode Available | 0 |
| HandsOnVLM: Vision-Language Models for Hand-Object Interaction Prediction | Dec 17, 2024 | PredictionTrajectory Prediction | —Unverified | 0 |
| QUENCH: Measuring the gap between Indic and Non-Indic Contextual General Reasoning in LLMs | Dec 16, 2024 | BenchmarkingCommon Sense Reasoning | CodeCode Available | 0 |
| GaGA: Towards Interactive Global Geolocation Assistant | Dec 12, 2024 | World Knowledge | —Unverified | 0 |