| How Does Code Pretraining Affect Language Model Task Performance? | Sep 6, 2024 | Language Modelling | Unverified | 0 |
| Physical Rule-Guided Convolutional Neural Network | Sep 3, 2024 | World Knowledge | Unverified | 0 |
| CV-Probes: Studying the interplay of lexical and world knowledge in visually grounded verb understanding | Sep 2, 2024 | World Knowledge | Unverified | 0 |
| Novel-WD: Exploring acquisition of Novel World Knowledge in LLMs Using Prefix-Tuning | Aug 30, 2024 | Causal Language Modeling, Continual Learning | Unverified | 0 |
| Zero-Shot Visual Reasoning by Vision-Language Models: Benchmarking and Analysis | Aug 27, 2024 | Benchmarking, Large Language Model | Unverified | 0 |
| Text2SQL is Not Enough: Unifying AI and Databases with TAG | Aug 27, 2024 | RAG, Retrieval-augmented Generation | Code Available | 4 |
| Exploring the Potential of Large Language Models for Heterophilic Graphs | Aug 26, 2024 | Node Classification, World Knowledge | Unverified | 0 |
| AgentMove: Predicting Human Mobility Anywhere Using Large Language Model based Agentic Framework | Aug 26, 2024 | Language Modelling | Code Available | 1 |
| To Code, or Not To Code? Exploring Impact of Code in Pre-training | Aug 20, 2024 | Code Generation, World Knowledge | Unverified | 0 |
| Efficient and Deployable Knowledge Infusion for Open-World Recommendations via Large Language Models | Aug 20, 2024 | Music Recommendation, Recommendation Systems | Unverified | 0 |
| CoRA: Collaborative Information Perception by Large Language Model's Weights for Recommendation | Aug 20, 2024 | Collaborative Filtering, General Knowledge | Unverified | 0 |
| CoDi: Conversational Distillation for Grounded Question Answering | Aug 20, 2024 | Question Answering, World Knowledge | Unverified | 0 |
| BLADE: Benchmarking Language Model Agents for Data-Driven Science | Aug 19, 2024 | Benchmarking, Decision Making | Code Available | 1 |
| On the Necessity of World Knowledge for Mitigating Missing Labels in Extreme Classification | Aug 18, 2024 | Imputation, Missing Labels | Code Available | 0 |
| A Mechanistic Interpretation of Syllogistic Reasoning in Auto-Regressive Language Models | Aug 16, 2024 | Logical Reasoning, valid | Unverified | 0 |
| MAQA: Evaluating Uncertainty Quantification in LLMs Regarding Data Uncertainty | Aug 13, 2024 | Mathematical Reasoning, Question Answering | Code Available | 0 |
| Prompt Tuning as User Inherent Profile Inference Machine | Aug 13, 2024 | Quantization, Recommendation Systems | Unverified | 0 |
| LLaVA-VSD: Large Language-and-Vision Assistant for Visual Spatial Description | Aug 9, 2024 | Diversity, Instruction Following | Code Available | 0 |
| Better Alignment with Instruction Back-and-Forth Translation | Aug 8, 2024 | Diversity, Translation | Unverified | 0 |
| Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks | Aug 7, 2024 | Attribute, In-Context Learning | Code Available | 2 |
| Lifelong Personalized Low-Rank Adaptation of Large Language Models for Recommendation | Aug 7, 2024 | Logical Reasoning, Recommendation Systems | Unverified | 0 |
| CLR-Fact: Evaluating the Complex Logical Reasoning Capability of Large Language Models over Factual Knowledge | Jul 30, 2024 | In-Context Learning, Knowledge Graphs | Unverified | 0 |
| SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages | Jul 29, 2024 | Diversity, Instruction Following | Code Available | 2 |
| Visual Riddles: a Commonsense and World Knowledge Challenge for Large Vision and Language Models | Jul 28, 2024 | World Knowledge | Unverified | 0 |
| DYNAMICQA: Tracing Internal Knowledge Conflicts in Language Models | Jul 24, 2024 | Retrieval-augmented Generation, World Knowledge | Code Available | 0 |
| Knowledge Acquisition Disentanglement for Knowledge-based Visual Question Answering with Large Language Models | Jul 22, 2024 | Disentanglement, Question Answering | Code Available | 0 |
| Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data | Jul 20, 2024 | Language Modelling, Machine Translation | Unverified | 0 |
| LoFTI: Localization and Factuality Transfer to Indian Locales | Jul 16, 2024 | World Knowledge | Code Available | 0 |
| VISA: Reasoning Video Object Segmentation via Large Language Models | Jul 16, 2024 | Decoder, Object | Code Available | 3 |
| Flooding Spread of Manipulated Knowledge in LLM-Based Multi-Agent Communities | Jul 10, 2024 | counterfactual, Fact Checking | Code Available | 1 |
| VQA-Diff: Exploiting VQA and Diffusion for Zero-Shot Image-to-3D Vehicle Asset Generation in Autonomous Driving | Jul 9, 2024 | Autonomous Driving, Image to 3D | Unverified | 0 |
| Language Representations Can be What Recommenders Need: Findings and Potentials | Jul 7, 2024 | Collaborative Filtering, Contrastive Learning | Code Available | 2 |
| BAPO: Base-Anchored Preference Optimization for Overcoming Forgetting in Large Language Models Personalization | Jun 30, 2024 | Continual Learning, General Knowledge | Unverified | 0 |
| LLaRA: Supercharging Robot Learning Data for Vision-Language Policy | Jun 28, 2024 | Vision-Language-Action, World Knowledge | Code Available | 3 |
| Scaling Synthetic Data Creation with 1,000,000,000 Personas | Jun 28, 2024 | Language Modelling | Code Available | 11 |
| Mental Modeling of Reinforcement Learning Agents by Language Models | Jun 26, 2024 | Decision Making, Reinforcement Learning | Unverified | 0 |
| LABOR-LLM: Language-Based Occupational Representations with Large Language Models | Jun 25, 2024 | In-Context Learning, Job Prediction | Unverified | 0 |
| Mitigating Hallucination in Fictional Character Role-Play | Jun 25, 2024 | Hallucination, World Knowledge | Code Available | 0 |
| Exploring Factual Entailment with NLI: A News Media Study | Jun 24, 2024 | Articles, Few-Shot Learning | Unverified | 0 |
| Evaluating the Ability of Large Language Models to Reason about Cardinal Directions | Jun 24, 2024 | World Knowledge | Unverified | 0 |
| On the Role of Long-tail Knowledge in Retrieval Augmented Large Language Models | Jun 24, 2024 | RAG, Retrieval | Unverified | 0 |
| LangSuitE: Planning, Controlling and Interacting with Large Language Models in Embodied Text Environments | Jun 24, 2024 | World Knowledge | Code Available | 2 |
| OCALM: Object-Centric Assessment with Language Models | Jun 24, 2024 | Object, Reinforcement Learning (RL) | Unverified | 0 |
| What Teaches Robots to Walk, Teaches Them to Trade too -- Regime Adaptive Execution using Informed Data and LLMs | Jun 20, 2024 | Reinforcement Learning | Unverified | 0 |
| Locating and Extracting Relational Concepts in Large Language Models | Jun 19, 2024 | World Knowledge | Code Available | 0 |
| WikiContradict: A Benchmark for Evaluating LLMs on Real-World Knowledge Conflicts from Wikipedia | Jun 19, 2024 | Language Modelling, RAG | Unverified | 0 |
| Benchmarking Multi-Image Understanding in Vision and Language Models: Perception, Knowledge, Reasoning, and Multi-Hop Reasoning | Jun 18, 2024 | Benchmarking, World Knowledge | Code Available | 0 |
| Are Large Language Models True Healthcare Jacks-of-All-Trades? Benchmarking Across Health Professions Beyond Physician Exams | Jun 17, 2024 | All, Benchmarking | Code Available | 0 |
| A Systematic Analysis of Large Language Models as Soft Reasoners: The Case of Syllogistic Inferences | Jun 17, 2024 | In-Context Learning, valid | Code Available | 0 |
| RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models | Jun 16, 2024 | Adversarial Attack, Benchmarking | Code Available | 2 |