SOTAVerified

Open-Ended Question Answering

Open-ended questions pose a question without constraining the format of the response. This distinguishes them from questions with a predetermined answer format, such as multiple-choice or yes/no questions.

Papers

Showing 1-50 of 796 papers

Title | Status | Hype
VersaVid-R1: A Versatile Video Understanding and Reasoning Model from Question Answering to Captioning Tasks | | 0
WIP: Large Language Model-Enhanced Smart Tutor for Undergraduate Circuit Analysis | | 0
anyECG-chat: A Generalist ECG-MLLM for Flexible ECG Input and Multi-Task Understanding | | 0
CulFiT: A Fine-grained Cultural-aware LLM Training Paradigm via Multilingual Critique Data Synthesis | Code | 0
O^2-Searcher: A Searching-based Agent Model for Open-Domain Open-Ended Question Answering | Code | 1
TinyRS-R1: Compact Multimodal Language Model for Remote Sensing | | 0
Ranked Voting based Self-Consistency of Large Language Models | Code | 1
VLM Q-Learning: Aligning Vision-Language Models for Interactive Decision-Making | | 0
Accommodate Knowledge Conflicts in Retrieval-augmented LLMs: Towards Reliable Response Generation in the Wild | | 0
AutoDrive-QA- Automated Generation of Multiple-Choice Questions for Autonomous Driving Datasets Using Large Vision-Language Models | | 0
Time-MQA: Time Series Multi-Task Question Answering with Context Enhancement | | 0
FSPO: Few-Shot Preference Optimization of Synthetic Preference Data in LLMs Elicits Effective Personalization to Real Users | Code | 1
PRIV-QA: Privacy-Preserving Question Answering for Cloud Large Language Models | Code | 0
Neptune: The Long Orbit to Benchmarking Long Video Understanding | Code | 2
Can Knowledge Graphs Make Large Language Models More Trustworthy? An Empirical Study over Open-ended Question Answering | | 0
TVBench: Redesigning Video-Language Evaluation | | 0
Utilize the Flow before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning | Code | 0
Video Instruction Tuning With Synthetic Data | | 0
CamelEval: Advancing Culturally Aligned Arabic Language Models and Benchmarks | | 0
Ranking Generated Answers: On the Agreement of Retrieval Models with Humans on Consumer Health Questions | Code | 0
Reference-Guided Verdict: LLMs-as-Judges in Automatic Evaluation of Free-Form Text | | 0
TelecomGPT: A Framework to Build Telecom-Specfic Large Language Models | | 0
LLaSA: A Multimodal LLM for Human Activity Analysis Through Wearable and Smartphone Sensors | Code | 1
Extrinsic Evaluation of Cultural Competence in Large Language Models | Code | 0
SCAR: Efficient Instruction-Tuning for Large Language Models via Style Consistency-Aware Response Ranking | Code | 1
Long Story Short: Story-level Video Understanding from 20K Short Films | | 0
Perception of Knowledge Boundary for Large Language Models through Semi-open-ended Question Answering | | 0
Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation | Code | 2
SciQAG: A Framework for Auto-Generated Science Question Answering Dataset with Fine-grained Evaluation | Code | 1
Evaluating the Elementary Multilingual Capabilities of Large Language Models with MultiQ | Code | 0
API Is Enough: Conformal Prediction for Large Language Models Without Logit-Access | | 0
Chain-of-Discussion: A Multi-Model Framework for Complex Evidence-Based Question Answering | Code | 4
BiMediX: Bilingual Medical Mixture of Experts LLM | Code | 1
Enhancing Large Language Models with Pseudo- and Multisource- Knowledge Graphs for Open-ended Question Answering | | 0
Shai: A large language model for asset management | | 0
On Early Detection of Hallucinations in Factual Question Answering | Code | 1
Universal Self-Consistency for Large Language Model Generation | | 0
Downstream Trade-offs of a Family of Text Watermarks | Code | 0
Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca | Code | 0
Prompting Large Language Models with Speech Recognition Abilities | | 0
PRD: Peer Rank and Discussion Improve Large Language Model based Evaluations | Code | 1
On the Model-Misspecification in Reinforcement Learning | | 0
2D-Shapley: A Framework for Fragmented Data Valuation | Code | 0
Adversaries with Limited Information in the Friedkin--Johnsen Model | Code | 0
POP: Prompt Of Prompts for Continual Learning | | 0
Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models | Code | 2
Provable Accelerated Convergence of Nesterov's Momentum for Deep ReLU Neural Networks | | 0
Non-autoregressive Conditional Diffusion Models for Time Series Prediction | | 0
Benchmarking Foundation Models with Language-Model-as-an-Examiner | | 0
Differences in boundary behavior in the 3D vertex and Voronoi models | | 0
Page 1 of 16

No leaderboard results yet.