AlignFormer: Modality Matching Can Achieve Better Zero-shot Instruction-Following Speech-LLM | Dec 2, 2024 | Instruction Following, Question Answering | Unverified
Eyes on the Road: State-of-the-Art Video Question Answering Models Assessment for Traffic Monitoring Tasks | Dec 2, 2024 | Multi-Object Tracking, Object Tracking | Code Available
SEAL: Semantic Attention Learning for Long Video Representation | Dec 2, 2024 | Diversity, Question Answering | Unverified
Unlocking Video-LLM via Agent-of-Thoughts Distillation | Dec 2, 2024 | Language Modeling, Language Modelling | Unverified
Medchain: Bridging the Gap Between LLM Agents and Clinical Practice through Interactive Sequential Benchmarking | Dec 2, 2024 | Benchmarking, Decision Making | Unverified
Understanding the World's Museums through Vision-Language Reasoning | Dec 2, 2024 | Benchmarking, Question Answering | Code Available
Mastering Board Games by External and Internal Planning with Language Models | Dec 2, 2024 | Board Games, Language Modeling | Unverified
Uhura: A Benchmark for Evaluating Scientific Question Answering and Truthfulness in Low-Resource African Languages | Dec 1, 2024 | ARC, Multiple-choice | Unverified
KnowledgePrompts: Exploring the Abilities of Large Language Models to Solve Proportional Analogies via Knowledge-Enhanced Prompting | Dec 1, 2024 | Multiple-choice, Multiple Choice Question Answering (MCQA) | Code Available
Learn to Unlearn: Meta-Learning-Based Knowledge Graph Embedding Unlearning | Dec 1, 2024 | Graph Embedding, Knowledge Graph Embedding | Unverified
Generative Language Models Potential for Requirement Engineering Applications: Insights into Current Strengths and Limitations | Dec 1, 2024 | NER, Prompt Engineering | Unverified
Improving Vietnamese Legal Document Retrieval using Synthetic Data | Dec 1, 2024 | Information Retrieval, Question Answering | Unverified
DynRank: Improving Passage Retrieval with Dynamic Zero-Shot Prompting Based on Question Classification | Nov 30, 2024 | Open-Domain Question Answering, Passage Retrieval | Unverified
Perception Test 2024: Challenge Summary and a Novel Hour-Long VideoQA Benchmark | Nov 29, 2024 | Benchmarking, Grounded Video Question Answering | Unverified
Actions and Objects Pathways for Domain Adaptation in Video Question Answering | Nov 29, 2024 | Domain Adaptation, Domain Generalization | Unverified
SURE-VQA: Systematic Understanding of Robustness Evaluation in Medical VQA Tasks | Nov 29, 2024 | Question Answering, Visual Question Answering | Code Available
STEP: Enhancing Video-LLMs' Compositional Reasoning by Spatio-Temporal Graph-guided Self-Training | Nov 29, 2024 | Question Answering, Video Understanding | Unverified
Unimib Assistant: designing a student-friendly RAG-based chatbot for all their needs | Nov 29, 2024 | All, Chatbot | Unverified
DLaVA: Document Language and Vision Assistant for Answer Localization with Enhanced Interpretability and Trustworthiness | Nov 29, 2024 | Optical Character Recognition (OCR), Question Answering | Code Available
TQA-Bench: Evaluating LLMs for Multi-Table Question Answering with Scalable Context and Symbolic Extension | Nov 29, 2024 | 8k, Question Answering | Code Available
COLD: Causal reasOning in cLosed Daily activities | Nov 29, 2024 | Causal Inference, Commonsense Causal Reasoning | Code Available
Beyond Logit Lens: Contextual Embeddings for Robust Hallucination Detection & Grounding in VLMs | Nov 28, 2024 | Attribute, Hallucination | Unverified
DIESEL -- Dynamic Inference-Guidance via Evasion of Semantic Embeddings in LLMs | Nov 28, 2024 | Question Answering, Reranking | Unverified
Sparse Attention Vectors: Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers | Nov 28, 2024 | Image Captioning, image-classification | Unverified
Can LLMs assist with Ambiguity? A Quantitative Evaluation of various Large Language Models on Word Sense Disambiguation | Nov 27, 2024 | Information Retrieval, Part-Of-Speech Tagging | Unverified
Can bidirectional encoder become the ultimate winner for downstream applications of foundation models? | Nov 27, 2024 | Language Modeling, Language Modelling | Unverified
Active Data Curation Effectively Distills Large-Scale Multimodal Models | Nov 27, 2024 | Decoder, Image Captioning | Unverified
ElectroVizQA: How well do Multi-modal LLMs perform in Electronics Visual Question Answering? | Nov 27, 2024 | Question Answering, Visual Question Answering | Unverified
DRS: Deep Question Reformulation With Structured Output | Nov 27, 2024 | Question Answering | Code Available
3D Scene Graph Guided Vision-Language Pre-training | Nov 27, 2024 | 3D dense captioning, 3D visual grounding | Unverified
GeneQuery: A General QA-based Framework for Spatial Gene Expression Predictions from Histology Images | Nov 27, 2024 | Question Answering, whole slide images | Code Available
Overview of TREC 2024 Biomedical Generative Retrieval (BioGen) Track | Nov 27, 2024 | Medical Question Answering, Question Answering | Unverified
HyperGLM: HyperGraph for Video Scene Graph Generation and Anticipation | Nov 27, 2024 | Graph Generation, Question Answering | Unverified
SALMONN-omni: A Codec-free LLM for Full-duplex Speech Understanding and Generation | Nov 27, 2024 | Question Answering, Speech Enhancement | Unverified
Task Progressive Curriculum Learning for Robust Visual Question Answering | Nov 26, 2024 | Data Augmentation, Ensemble Learning | Unverified
Natural Language Understanding and Inference with MLLM in Visual Question Answering: A Survey | Nov 26, 2024 | Natural Language Understanding, Question Answering | Unverified
Efficient Multi-modal Large Language Models via Visual Token Grouping | Nov 26, 2024 | Image Captioning, Question Answering | Unverified
GEMeX: A Large-Scale, Groundable, and Explainable Medical VQA Benchmark for Chest X-ray Diagnosis | Nov 25, 2024 | Medical Visual Question Answering, Multiple-choice | Unverified
VideoOrion: Tokenizing Object Dynamics in Videos | Nov 25, 2024 | Language Modeling, Language Modelling | Unverified
Text-Guided Coarse-to-Fine Fusion Network for Robust Remote Sensing Visual Question Answering | Nov 24, 2024 | Question Answering, Relational Reasoning | Unverified
freePruner: A Training-free Approach for Large Multimodal Model Acceleration | Nov 23, 2024 | Quantization, Question Answering | Unverified
AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset | Nov 23, 2024 | Language Modeling, Language Modelling | Unverified
Document Haystacks: Vision-Language Reasoning Over Piles of 1000+ Documents | Nov 23, 2024 | Question Answering, RAG | Code Available
ReWind: Understanding Long Videos with Instructed Learnable Memory | Nov 23, 2024 | Large Language Model, Question Answering | Unverified
FINECAPTION: Compositional Image Captioning Focusing on Wherever You Want at Any Granularity | Nov 23, 2024 | Attribute, Cross-Modal Retrieval | Unverified
KBAlign: Efficient Self Adaptation on Specific Knowledge Bases | Nov 22, 2024 | Question Answering, RAG | Code Available
Knowledge Graphs, Large Language Models, and Hallucinations: An NLP Perspective | Nov 21, 2024 | Knowledge Graphs, Question Answering | Unverified
Visual Contexts Clarify Ambiguous Expressions: A Benchmark Dataset | Nov 21, 2024 | Question Answering, Visual Grounding | Code Available
FastRAG: Retrieval Augmented Generation for Semi-structured Data | Nov 21, 2024 | Management, Question Answering | Unverified
Learning to Reason Iteratively and Parallelly for Complex Visual Reasoning Scenarios | Nov 20, 2024 | Question Answering, Visual Question Answering (VQA) | Unverified