SOTAVerified

Large Language Model

Papers

Showing 6026-6050 of 6097 papers

Title | Status | Hype
Towards Harnessing Large Language Models for Comprehension of Conversational Grounding | Code | 0
When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards | Code | 0
Detecting Referring Expressions in Visually Grounded Dialogue with Autoregressive Language Models | Code | 0
Detecting Manipulated Contents Using Knowledge-Grounded Inference | Code | 0
Detecting Errors through Ensembling Prompts (DEEP): An End-to-End LLM Framework for Detecting Factual Errors | Code | 0
Detecting AI-Generated Texts in Cross-Domains | Code | 0
Reshaping Free-Text Radiology Notes Into Structured Reports With Generative Transformers | Code | 0
How Far Are LLMs from Believable AI? A Benchmark for Evaluating the Believability of Human Behavior Simulation | Code | 0
Chaining thoughts and LLMs to learn DNA structural biophysics | Code | 0
StructEval: Deepen and Broaden Large Language Model Assessment via Structured Evaluation | Code | 0
CellTypeAgent: Trustworthy cell type annotation with Large Language Models | Code | 0
Resolving References in Visually-Grounded Dialogue via Text Generation | Code | 0
DetectBench: Can Large Language Model Detect and Piece Together Implicit Evidence? | Code | 0
Towards Interpretable Hate Speech Detection using Large Language Model-extracted Rationales | Code | 0
How Benchmark Prediction from Fewer Data Misses the Mark | Code | 0
Summarisation of German Judgments in conjunction with a Class-based Evaluation | Code | 0
SumRec: A Framework for Recommendation using Open-Domain Dialogue | Code | 0
HORAE: A Domain-Agnostic Language for Automated Service Regulation | Code | 0
Design Principle Transfer in Neural Architecture Search via Large Language Models | Code | 0
CE-CoLLM: Efficient and Adaptive Large Language Models Through Cloud-Edge Collaboration | Code | 0
HLAT: High-quality Large Language Model Pre-trained on AWS Trainium | Code | 0
Historical Ink: 19th Century Latin American Spanish Newspaper Corpus with LLM OCR Correction | Code | 0
Help Me Identify: Is an LLM+VQA System All We Need to Identify Visual Concepts? | Code | 0
Vision-Language and Large Language Model Performance in Gastroenterology: GPT, Claude, Llama, Phi, Mistral, Gemma, and Quantized Models | Code | 0
When is Off-Policy Evaluation (Reward Modeling) Useful in Contextual Bandits? A Data-Centric Perspective | Code | 0
Page 242 of 244

No leaderboard results yet.