Beyond Text: Implementing Multimodal Large Language Model-Powered Multi-Agent Systems Using a No-Code Platform Jan 1, 2025 Code Generation Image Generation
— Unverified 0
LLM-MedQA: Enhancing Medical Question Answering through Case Studies in Large Language Models Dec 31, 2024 Medical Question Answering MedQA
— Unverified 0
A review of faithfulness metrics for hallucination assessment in Large Language Models Dec 31, 2024 Benchmarking Hallucination
— Unverified 0
Probing Visual Language Priors in VLMs Dec 31, 2024 Question Answering Visual Question Answering
— Unverified 0
EQUATOR: A Deterministic Framework for Evaluating LLM Reasoning with Open-Ended Questions. # v1.0.0-beta Dec 31, 2024 Multiple-choice Question Answering
— Unverified 0
OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning Dec 31, 2024 Benchmarking Logical Reasoning
Code Code Available 4
Dual Diffusion for Unified Image Generation and Understanding Dec 31, 2024 Image Generation Language Modeling
Code Code Available 2
Online Video Understanding: OVBench and VideoChat-Online Dec 31, 2024 Autonomous Driving Question Answering
Code Code Available 2
An Empirical Evaluation of Large Language Models on Consumer Health Questions Dec 31, 2024 Medical Question Answering Question Answering
— Unverified 0
MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models Dec 31, 2024 Multiple-choice Question Answering
Code Code Available 0
FrameFusion: Combining Similarity and Importance for Video Token Reduction on Large Visual Language Models Dec 30, 2024 Question Answering Token Reduction
Code Code Available 2
Are LLMs Really Not Knowledgable? Mining the Submerged Knowledge in LLMs' Memory Dec 30, 2024 Question Answering
— Unverified 0
UniRS: Unifying Multi-temporal Remote Sensing Tasks through Vision Language Models Dec 30, 2024 Question Answering Scene Classification
Code Code Available 0
Plug-and-Play Training Framework for Preference Optimization Dec 30, 2024 Mathematical Reasoning Question Answering
— Unverified 0
MapQaTor: An Extensible Framework for Efficient Annotation of Map-Based QA Datasets Dec 30, 2024 Question Answering
Code Code Available 0
KARPA: A Training-free Method of Adapting Knowledge Graph as References for Large Language Model's Reasoning Path Aggregation Dec 30, 2024 Decision Making Graph Question Answering
— Unverified 0
Enhancing Table Recognition with Vision LLMs: A Benchmark and Neighbor-Guided Toolchain Reasoner Dec 30, 2024 Question Answering Table Recognition
Code Code Available 1
Hierarchical Banzhaf Interaction for General Video-Language Representation Learning Dec 30, 2024 Contrastive Learning Question Answering
— Unverified 0
WalkVLM:Aid Visually Impaired People Walking by Vision Language Model Dec 30, 2024 Language Modeling Language Modelling
— Unverified 0
Enhanced Multimodal RAG-LLM for Accurate Visual Question Answering Dec 30, 2024 Image Captioning Object Recognition
— Unverified 0
Audiopedia: Audio QA with Knowledge Dec 29, 2024 Audio Question Answering Entity Linking
Code Code Available 0
HALLUCINOGEN: A Benchmark for Evaluating Object Hallucination in Large Visual-Language Models Dec 29, 2024 Hallucination Object
Code Code Available 0
Efficient Multi-Agent Collaboration with Tool Use for Online Planning in Complex Table Question Answering Dec 28, 2024 Question Answering
— Unverified 0
Building a Rich Dataset to Empower the Persian Question Answering Systems Dec 28, 2024 Question Answering
— Unverified 0
Long Context vs. RAG for LLMs: An Evaluation and Revisits Dec 27, 2024 Question Answering RAG
Code Code Available 1
Pre-training, Fine-tuning and Re-ranking: A Three-Stage Framework for Legal Question Answering Dec 27, 2024 Question Answering Representation Learning
— Unverified 0
ErgoChat: a Visual Query System for the Ergonomic Risk Assessment of Construction Workers Dec 27, 2024 Image Captioning Question Answering
— Unverified 0
Text2Insight: Transform natural language text into insights seamlessly using multi-model architecture Dec 27, 2024 named-entity-recognition Named Entity Recognition
— Unverified 0
TARGA: Targeted Synthetic Data Generation for Practical Reasoning over Structured Data Dec 27, 2024 In-Context Learning Knowledge Base Question Answering
— Unverified 0
Interacted Object Grounding in Spatio-Temporal Human-Object Interactions Dec 27, 2024 Human-Object Interaction Detection Object
Code Code Available 1
Perceive, Query & Reason: Enhancing Video QA with Question-Guided Temporal Queries Dec 26, 2024 Question Answering Video Question Answering
— Unverified 0
Improving Generated and Retrieved Knowledge Combination Through Zero-shot Generation Dec 25, 2024 Open-Domain Question Answering Question Answering
— Unverified 0
Harnessing Large Language Models for Knowledge Graph Question Answering via Adaptive Multi-Aspect Retrieval-Augmentation Dec 24, 2024 Graph Question Answering Hallucination
Code Code Available 1
TextMatch: Enhancing Image-Text Consistency Through Multimodal Optimization Dec 24, 2024 In-Context Learning Question Answering
— Unverified 0
Exploring Embedding Priors in Prompt-Tuning for Improved Interpretability and Control Dec 24, 2024 Question Answering
— Unverified 0
Unlocking the Potential of Multiple BERT Models for Bangla Question Answering in NCTB Textbooks Dec 24, 2024 Question Answering Reading Comprehension
— Unverified 0
HAUR: Human Annotation Understanding and Recognition Through Text-Heavy Images Dec 24, 2024 Optical Character Recognition (OCR) Question Answering
— Unverified 0
Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models Dec 24, 2024 Question Answering Video Question Answering
Code Code Available 2
Property Enhanced Instruction Tuning for Multi-task Molecule Generation with Large Language Models Dec 24, 2024 Machine Translation Molecular Property Prediction
Code Code Available 1
LININ: Logic Integrated Neural Inference Network for Explanatory Visual Question Answering Dec 24, 2024 Explanatory Visual Question Answering Multimodal Reasoning
Code Code Available 0
CypherBench: Towards Precise Retrieval over Full-scale Modern Knowledge Graphs in the LLM Era Dec 24, 2024 Knowledge Base Question Answering Knowledge Graphs
Code Code Available 1
Multi-Agents Based on Large Language Models for Knowledge-based Visual Question Answering Dec 24, 2024 Question Answering Visual Question Answering
— Unverified 0
LongDocURL: a Comprehensive Multimodal Long Document Benchmark Integrating Understanding, Reasoning, and Locating Dec 24, 2024 document understanding Question Answering
Code Code Available 1
GeAR: Graph-enhanced Agent for Retrieval-augmented Generation Dec 24, 2024 Multi-hop Question Answering Question Answering
— Unverified 0
Factuality or Fiction? Benchmarking Modern LLMs on Ambiguous QA with Citations Dec 23, 2024 Benchmarking Question Answering
— Unverified 0
From Models to Microtheories: Distilling a Model's Topical Knowledge for Grounded Question Answering Dec 23, 2024 Question Answering
Code Code Available 0
RAGONITE: Iterative Retrieval on Induced Databases and Verbalized RDF for Conversational QA over KGs with RAG Dec 23, 2024 Conversational Question Answering Knowledge Graphs
— Unverified 0
Survey of Large Multimodal Model Datasets, Application Categories and Taxonomy Dec 23, 2024 Image Captioning Question Answering
— Unverified 0
Multimodal Preference Data Synthetic Alignment with Reward Model Dec 23, 2024 2k Caption Generation
Code Code Available 0
VidCtx: Context-aware Video Question Answering with Image Models Dec 23, 2024 Large Language Model Question Answering
Code Code Available 0