RETQA: A Large-Scale Open-Domain Tabular Question Answering Dataset for Real Estate Sector Dec 13, 2024 In-Context Learning Question Answering
Unifying AI Tutor Evaluation: An Evaluation Taxonomy for Pedagogical Ability Assessment of LLM-Powered AI Tutors Dec 12, 2024 Question Answering
IMPACT: A Large-scale Integrated Multimodal Patent Analysis and Creation Dataset for Design Patents Dec 10, 2024 Cross-Modal Retrieval Image Classification
LLaVA-SpaceSGG: Visual Instruct Tuning for Open-vocabulary Scene Graph Generation with Enhanced Spatial Relations Dec 9, 2024 Language Modeling Language Modelling
RSUniVLM: A Unified Vision Language Model for Remote Sensing via Granularity-oriented Mixture of Experts Dec 7, 2024 Change Detection Image Comprehension
KG-Retriever: Efficient Knowledge Indexing for Retrieval-Augmented Large Language Models Dec 7, 2024 Multi-hop Question Answering Navigate
CharacterBox: Evaluating the Role-Playing Capabilities of LLMs in Text-Based Virtual Worlds Dec 7, 2024 Question Answering
Gracefully Filtering Backdoor Samples for Generative Large Language Models without Retraining Dec 3, 2024 backdoor defense Computational Efficiency
GraphOTTER: Evolving LLM-based Graph Reasoning for Complex Table Question Answering Dec 2, 2024 Question Answering
PhysGame: Uncovering Physical Commonsense Violations in Gameplay Videos Dec 2, 2024 Question Answering Video Understanding
PerLA: Perceptive 3D Language Assistant Nov 29, 2024 Dense Captioning Graph Neural Network
Cross-modal Information Flow in Multimodal Large Language Models Nov 27, 2024 Question Answering Visual Question Answering
VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interaction Format Nov 27, 2024 Dense Video Captioning Grounded Video Question Answering
g3D-LF: Generalizable 3D-Language Feature Fields for Embodied Tasks Nov 26, 2024 Contrastive Learning Question Answering
AtomR: Atomic Operator-Empowered Large Language Models for Heterogeneous Knowledge Reasoning Nov 25, 2024 Hallucination Question Answering
Context Awareness Gate For Retrieval Augmented Generation Nov 25, 2024 Open-Domain Question Answering Question Answering
Seed-Free Synthetic Data Generation Framework for Instruction-Tuning LLMs: A Case Study in Thai Nov 23, 2024 Diversity Question Answering
Teaching VLMs to Localize Specific Objects from In-context Examples Nov 20, 2024 Object Object Tracking
A Survey of Medical Vision-and-Language Applications and Their Techniques Nov 19, 2024 Decision Making Diagnostic
BackdoorMBTI: A Backdoor Learning Multimodal Benchmark Tool Kit for Backdoor Defense Evaluation Nov 17, 2024 Action Recognition backdoor defense
Initial Nugget Evaluation Results for the TREC 2024 RAG Track with the AutoNuggetizer Framework Nov 14, 2024 Question Answering RAG
Controllable Context Sensitivity and the Knob Behind It Nov 11, 2024 Question Answering Retrieval-augmented Generation
DELIFT: Data Efficient Language model Instruction Fine Tuning Nov 7, 2024 Language Modeling Language Modelling
MEG: Medical Knowledge-Augmented Large Language Models for Question Answering Nov 6, 2024 Knowledge Graph Embeddings Multiple-choice
MILU: A Multi-task Indic Language Understanding Benchmark Nov 4, 2024 Multiple-choice Question Answering
Rationale-Guided Retrieval Augmented Generation for Medical Question Answering Nov 1, 2024 Medical Question Answering Question Answering
Birdie: Advancing State Space Models with Reward-Driven Objectives and Curricula Nov 1, 2024 Computational Efficiency Question Answering
Show Me What and Where has Changed? Question Answering and Grounding for Remote Sensing Change Detection Oct 31, 2024 Change Detection Question Answering
Nearest Neighbor Normalization Improves Multimodal Retrieval Oct 31, 2024 Cross-Modal Retrieval Image Captioning
Distinguishing Ignorance from Error in LLM Hallucinations Oct 29, 2024 Hallucination Question Answering
Not All Heads Matter: A Head-Level KV Cache Compression Method with Integrated Retrieval and Reasoning Oct 25, 2024 All Computational Efficiency
DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations Oct 24, 2024 Instruction Following Question Answering
Large Language Models Reflect the Ideology of their Creators Oct 24, 2024 Question Answering Text Summarization
Graphusion: A RAG Framework for Knowledge Graph Construction with a Global Perspective Oct 23, 2024 graph construction Knowledge Graphs
VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning Oct 23, 2024 Question Answering Speech Recognition
ADEM-VL: Adaptive and Embedded Fusion for Efficient Vision-Language Tuning Oct 23, 2024 Image Captioning Instruction Following
Progressive Compositionality In Text-to-Image Generative Models Oct 22, 2024 Attribute Contrastive Learning
BRIEF: Bridging Retrieval and Inference for Multi-hop Reasoning via Compression Oct 20, 2024 In-Context Learning Long-Context Understanding
Paths-over-Graph: Knowledge Graph Empowered Large Language Model Reasoning Oct 18, 2024 Hallucination Knowledge Base Question Answering
MultiChartQA: Benchmarking Vision-Language Models on Multi-Chart Problems Oct 18, 2024 Benchmarking Question Answering
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines Oct 16, 2024 Question Answering Visual Question Answering
VividMed: Vision Language Model with Versatile Visual Grounding for Medicine Oct 16, 2024 Language Modeling Language Modelling
RuleRAG: Rule-guided retrieval-augmented generation with language models for question answering Oct 15, 2024 In-Context Learning Instruction Following
TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models Oct 14, 2024 2k Benchmarking
Towards Foundation Models for 3D Vision: How Close Are We? Oct 14, 2024 Question Answering Visual Question Answering
Multi-granularity Contrastive Cross-modal Collaborative Generation for End-to-End Long-term Video Question Answering Oct 12, 2024 Answer Generation Blocking
Skipping Computations in Multimodal LLMs Oct 12, 2024 Question Answering Visual Question Answering
Dynamic Multimodal Evaluation with Flexible Complexity by Vision-Language Bootstrapping Oct 11, 2024 MME Question Answering
SPORTU: A Comprehensive Sports Understanding Benchmark for Multimodal Large Language Models Oct 11, 2024 Few-Shot Learning Multiple-choice
StablePrompt: Automatic Prompt Tuning using Reinforcement Learning for Large Language Models Oct 10, 2024 Question Answering Reinforcement Learning (RL)