SOTAVerified

Large Language Model

Papers

Showing 1676–1700 of 6097 papers

Title | Status | Hype
LLM-RankFusion: Mitigating Intrinsic Inconsistency in LLM-based Ranking | Code | 0
DrugImproverGPT: A Large Language Model for Drug Optimization with Fine-Tuning via Structured Policy Optimization | Code | 0
LLM-enhanced Self-training for Cross-domain Constituency Parsing | Code | 0
LLM-e Guess: Can LLMs Capabilities Advance Without Hardware Progress? | Code | 0
DSGram: Dynamic Weighting Sub-Metrics for Grammatical Error Correction in the Era of Large Language Models | Code | 0
Self-Bootstrapped Visual-Language Model for Knowledge Selection and Question Answering | Code | 0
LLM-GEm: Large Language Model-Guided Prediction of People’s Empathy Levels towards Newspaper Article | Code | 0
Benchmarking Large Language Model Uncertainty for Prompt Optimization | Code | 0
Both Matter: Enhancing the Emotional Intelligence of Large Language Models without Compromising the General Intelligence | Code | 0
Detecting the Clinical Features of Difficult-to-Treat Depression using Synthetic Data from Large Language Models | Code | 0
Detecting Referring Expressions in Visually Grounded Dialogue with Autoregressive Language Models | Code | 0
LLM as OS, Agents as Apps: Envisioning AIOS, Agents and the AIOS-Agent Ecosystem | Code | 0
Detecting Manipulated Contents Using Knowledge-Grounded Inference | Code | 0
NeuralNexus at BEA 2025 Shared Task: Retrieval-Augmented Prompting for Mistake Identification in AI Tutors | Code | 0
LLM as Dataset Analyst: Subpopulation Structure Discovery with Large Language Model | Code | 0
LLM-Assisted Multi-Teacher Continual Learning for Visual Question Answering in Robotic Surgery | Code | 0
Detecting Errors through Ensembling Prompts (DEEP): An End-to-End LLM Framework for Detecting Factual Errors | Code | 0
Detecting AI-Generated Texts in Cross-Domains | Code | 0
AIOS: LLM Agent Operating System | Code | 0
LLM-as-a-Fuzzy-Judge: Fine-Tuning Large Language Models as a Clinical Evaluation Judge with Fuzzy Logic | Code | 0
LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from Machine Feedback | Code | 0
DetectBench: Can Large Language Model Detect and Piece Together Implicit Evidence? | Code | 0
LLaVA-VSD: Large Language-and-Vision Assistant for Visual Spatial Description | Code | 0
PoliTune: Analyzing the Impact of Data Selection and Fine-Tuning on Economic and Political Biases in Large Language Models | Code | 0
Benchmarking Multi-dimensional AIGC Video Quality Assessment: A Dataset and Unified Model | Code | 0
Page 68 of 244

No leaderboard results yet.