MedExQA: Medical Question Answering Benchmark with Multiple Explanations (Jun 10, 2024). Tags: Medical Question Answering, Question Answering. Status: Code available.
Zero-Shot End-To-End Spoken Question Answering In Medical Domain (Jun 9, 2024). Tags: Answer Selection, Question Answering. Status: Unverified.
Do LLMs Exhibit Human-Like Reasoning? Evaluating Theory of Mind in LLMs for Open-Ended Responses (Jun 9, 2024). Tags: Question Answering, Semantic Similarity. Status: Unverified.
RE-RAG: Improving Open-Domain QA Performance and Interpretability with Relevance Estimator in Retrieval-Augmented Generation (Jun 9, 2024). Tags: Document Ranking, Natural Questions. Status: Code available.
MrRank: Improving Question Answering Retrieval System through Multi-Result Ranking Model (Jun 9, 2024). Tags: Information Retrieval, Learning-To-Rank. Status: Unverified.
MedREQAL: Examining Medical Knowledge Recall of Large Language Models via Question Answering (Jun 9, 2024). Tags: Question Answering. Status: Unverified.
CERET: Cost-Effective Extrinsic Refinement for Text Generation (Jun 8, 2024). Tags: Abstractive Text Summarization, Question Answering. Status: Code available.
Towards a Benchmark for Causal Business Process Reasoning with LLMs (Jun 8, 2024). Tags: Question Answering. Status: Code available.
CaLM: Contrasting Large and Small Language Models to Verify Grounded Generation (Jun 8, 2024). Tags: Open-Domain Question Answering, Question Answering. Status: Unverified.
Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation (Jun 8, 2024). Tags: Abstractive Text Summarization, Dialogue Generation. Status: Unverified.
Do LLMs Recognize me, When I is not me: Assessment of LLMs Understanding of Turkish Indexical Pronouns in Indexical Shift Contexts (Jun 8, 2024). Tags: Machine Translation, Multiple-choice. Status: Unverified.
Venn Diagram Prompting : Accelerating Comprehension with Scaffolding Effect (Jun 8, 2024). Tags: Question Answering. Status: Unverified.
MATTER: Memory-Augmented Transformer Using Heterogeneous Knowledge Sources (Jun 7, 2024). Tags: Language Modeling, Language Modelling. Status: Unverified.
TCMD: A Traditional Chinese Medicine QA Dataset for Evaluating Large Language Models (Jun 7, 2024). Tags: Medical Question Answering, Question Answering. Status: Unverified.
CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models (Jun 7, 2024). Tags: Multiple-choice, Philosophy. Status: Code available.
On Subjective Uncertainty Quantification and Calibration in Natural Language Generation (Jun 7, 2024). Tags: In-Context Learning, Machine Translation. Status: Code available.
Composition Vision-Language Understanding via Segment and Depth Anything Model (Jun 7, 2024). Tags: Question Answering, Visual Question Answering (VQA). Status: Code available.
Understanding Information Storage and Transfer in Multi-modal Large Language Models (Jun 6, 2024). Tags: Factual Visual Question Answering, Model Editing. Status: Unverified.
FairytaleQA Translated: Enabling Educational Question and Answer Generation in Less-Resourced Languages (Jun 6, 2024). Tags: Answer Generation, Question Answering. Status: Code available.
Efficient Knowledge Infusion via KG-LLM Alignment (Jun 6, 2024). Tags: Knowledge Graphs, Question Answering. Status: Unverified.
Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive? (Jun 6, 2024). Tags: Multiple-choice, Question Answering. Status: Unverified.
M-QALM: A Benchmark to Assess Clinical Reading Comprehension and Knowledge Recall in Large Language Models via Question Answering (Jun 6, 2024). Tags: abstractive question answering, Clinical Knowledge. Status: Code available.
Synthesizing Conversations from Unlabeled Documents using Automatic Response Segmentation (Jun 6, 2024). Tags: Conversational Question Answering, Question Answering. Status: Unverified.
IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models (Jun 5, 2024). Tags: Mathematical Reasoning, Natural Language Inference. Status: Unverified.
Measuring Retrieval Complexity in Question Answering Systems (Jun 5, 2024). Tags: Question Answering, Retrieval. Status: Unverified.
Discovering Bias in Latent Space: An Unsupervised Debiasing Approach (Jun 5, 2024). Tags: Question Answering. Status: Unverified.
Balancing Performance and Efficiency in Zero-shot Robotic Navigation (Jun 5, 2024). Tags: Computational Efficiency, Question Answering. Status: Unverified.
CSS: Contrastive Semantic Similarity for Uncertainty Quantification of LLMs (Jun 5, 2024). Tags: Clustering, Natural Language Inference. Status: Code available.
UniOQA: A Unified Framework for Knowledge Graph Question Answering with Large Language Models (Jun 4, 2024). Tags: Graph Question Answering, Question Answering. Status: Unverified.
I've got the "Answer"! Interpretation of LLMs Hidden States in Question Answering (Jun 4, 2024). Tags: Question Answering. Status: Unverified.
Chain of Agents: Large Language Models Collaborating on Long-Context Tasks (Jun 4, 2024). Tags: Code Completion, Long-Form Narrative Summarization. Status: Unverified.
ACCORD: Closing the Commonsense Measurability Gap (Jun 4, 2024). Tags: Benchmarking, Common Sense Reasoning. Status: Code available.
Enhancing Retrieval-Augmented LMs with a Two-stage Consistency Learning Compressor (Jun 4, 2024). Tags: Question Answering, RAG. Status: Unverified.
Multimodal Reasoning with Multimodal Knowledge Graph (Jun 4, 2024). Tags: cross-modal alignment, Graph Attention. Status: Unverified.
On the Intrinsic Self-Correction Capability of LLMs: Uncertainty and Latent Concept (Jun 4, 2024). Tags: Question Answering, Safety Alignment. Status: Unverified.
Translation Deserves Better: Analyzing Translation Artifacts in Cross-lingual Visual Question Answering (Jun 4, 2024). Tags: Data Augmentation, Machine Translation. Status: Unverified.
Diffusion-Refined VQA Annotations for Semi-Supervised Gaze Following (Jun 4, 2024). Tags: Question Answering, Visual Question Answering. Status: Code available.
Analyzing Social Biases in Japanese Large Language Models (Jun 4, 2024). Tags: Question Answering. Status: Code available.
Story Generation from Visual Inputs: Techniques, Related Tasks, and Challenges (Jun 4, 2024). Tags: Question Answering, Story Generation. Status: Unverified.
Ask-EDA: A Design Assistant Empowered by LLM, Hybrid RAG and Abbreviation De-hallucination (Jun 3, 2024). Tags: Hallucination, Question Answering. Status: Unverified.
Utilizing Large Language Models for Automating Technical Customer Support (Jun 3, 2024). Tags: Question Answering. Status: Unverified.
Graph Neural Network Enhanced Retrieval for Question Answering of LLMs (Jun 3, 2024). Tags: Graph Neural Network, Language Modelling. Status: Unverified.
Explore then Determine: A GNN-LLM Synergy Framework for Reasoning over Knowledge Graph (Jun 3, 2024). Tags: Knowledge Graphs, Multiple-choice. Status: Unverified.
Mixture of Rationale: Multi-Modal Reasoning Mixture for Visual Question Answering (Jun 3, 2024). Tags: Diversity, Question Answering. Status: Unverified.
Selectively Answering Visual Questions (Jun 3, 2024). Tags: Avg, In-Context Learning. Status: Unverified.
EffiQA: Efficient Question-Answering with Strategic Multi-Model Collaboration on Knowledge Graphs (Jun 3, 2024). Tags: Knowledge Graphs, Question Answering. Status: Unverified.
MedFuzz: Exploring the Robustness of Large Language Models in Medical Question Answering (Jun 3, 2024). Tags: Medical Question Answering, MedQA. Status: Unverified.
Contextualized Sequence Likelihood: Enhanced Confidence Scores for Natural Language Generation (Jun 3, 2024). Tags: Question Answering, Text Generation. Status: Code available.
Superhuman performance in urology board questions by an explainable large language model enabled for context integration of the European Association of Urology guidelines: the UroBot study (Jun 3, 2024). Tags: Chatbot, Language Modeling. Status: Unverified.
CMDBench: A Benchmark for Coarse-to-fine Multimodal Data Discovery in Compound AI Systems (Jun 2, 2024). Tags: Question Answering. Status: Code available.