Open-Ended Question Answering

Open-ended questions pose a question without constraining the format of the response. This distinguishes them from questions with a predetermined answer format, such as multiple-choice or yes/no questions.

Papers

Showing 1–25 of 796 papers

| Title | Status | Hype |
| --- | --- | --- |
| VersaVid-R1: A Versatile Video Understanding and Reasoning Model from Question Answering to Captioning Tasks | | 0 |
| WIP: Large Language Model-Enhanced Smart Tutor for Undergraduate Circuit Analysis | | 0 |
| anyECG-chat: A Generalist ECG-MLLM for Flexible ECG Input and Multi-Task Understanding | | 0 |
| CulFiT: A Fine-grained Cultural-aware LLM Training Paradigm via Multilingual Critique Data Synthesis | Code | 0 |
| O^2-Searcher: A Searching-based Agent Model for Open-Domain Open-Ended Question Answering | Code | 1 |
| TinyRS-R1: Compact Multimodal Language Model for Remote Sensing | | 0 |
| Ranked Voting based Self-Consistency of Large Language Models | Code | 1 |
| VLM Q-Learning: Aligning Vision-Language Models for Interactive Decision-Making | | 0 |
| Accommodate Knowledge Conflicts in Retrieval-augmented LLMs: Towards Reliable Response Generation in the Wild | | 0 |
| AutoDrive-QA- Automated Generation of Multiple-Choice Questions for Autonomous Driving Datasets Using Large Vision-Language Models | | 0 |
| Time-MQA: Time Series Multi-Task Question Answering with Context Enhancement | | 0 |
| FSPO: Few-Shot Preference Optimization of Synthetic Preference Data in LLMs Elicits Effective Personalization to Real Users | Code | 1 |
| PRIV-QA: Privacy-Preserving Question Answering for Cloud Large Language Models | Code | 0 |
| Neptune: The Long Orbit to Benchmarking Long Video Understanding | Code | 2 |
| Can Knowledge Graphs Make Large Language Models More Trustworthy? An Empirical Study over Open-ended Question Answering | | 0 |
| TVBench: Redesigning Video-Language Evaluation | | 0 |
| Utilize the Flow before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning | Code | 0 |
| Video Instruction Tuning With Synthetic Data | | 0 |
| CamelEval: Advancing Culturally Aligned Arabic Language Models and Benchmarks | | 0 |
| Ranking Generated Answers: On the Agreement of Retrieval Models with Humans on Consumer Health Questions | Code | 0 |
| Reference-Guided Verdict: LLMs-as-Judges in Automatic Evaluation of Free-Form Text | | 0 |
| TelecomGPT: A Framework to Build Telecom-Specfic Large Language Models | | 0 |
| LLaSA: A Multimodal LLM for Human Activity Analysis Through Wearable and Smartphone Sensors | Code | 1 |
| Extrinsic Evaluation of Cultural Competence in Large Language Models | Code | 0 |
| SCAR: Efficient Instruction-Tuning for Large Language Models via Style Consistency-Aware Response Ranking | Code | 1 |

No leaderboard results yet.