SOTAVerified

Large Language Model

Papers

Showing 476-500 of 6097 papers

Title | Status | Hype
Continually Self-Improving Language Models for Bariatric Surgery Question--Answering | - | 0
A Comprehensive Evaluation of Contemporary ML-Based Solvers for Combinatorial Optimization | Code | 1
PMPO: Probabilistic Metric Prompt Optimization for Small and Large Language Models | - | 0
CTRAP: Embedding Collapse Trap to Safeguard Large Language Models from Harmful Fine-Tuning | - | 0
Generator-Mediated Bandits: Thompson Sampling for GenAI-Powered Adaptive Interventions | - | 0
Beyond Correlation: Towards Causal Large Language Model Agents in Biomedicine | - | 0
Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks | - | 0
Human-centered Interactive Learning via MLLMs for Text-to-Image Person Re-identification | - | 0
Reward Is Enough: LLMs Are In-Context Reinforcement Learners | - | 0
Any Large Language Model Can Be a Reliable Judge: Debiasing with a Reasoning-based Bias Detector | - | 0
Forging Time Series with Language: A Large Language Model Approach to Synthetic Data Generation | - | 0
NEXT-EVAL: Next Evaluation of Traditional and LLM Web Data Record Extraction | - | 0
CRAKEN: Cybersecurity LLM Agent with Knowledge-Based Execution | Code | 1
AutoData: A Multi-Agent System for Open Web Data Collection | Code | 0
How Memory Management Impacts LLM Agents: An Empirical Study of Experience-Following Behavior | Code | 1
Aligning Dialogue Agents with Global Feedback via Large Language Model Reward Decomposition | - | 0
Highlighting What Matters: Promptable Embeddings for Attribute-Focused Image Retrieval | - | 0
MIKU-PAL: An Automated and Standardized Multi-Modal Method for Speech Paralinguistic and Affect Labeling | - | 0
LyapLock: Bounded Knowledge Preservation in Sequential Large Language Model Editing | Code | 0
Beyond Empathy: Integrating Diagnostic and Therapeutic Reasoning with Large Language Models for Mental Health Counseling | - | 0
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective | - | 0
Keep Security! Benchmarking Security Policy Preservation in Large Language Model Contexts Against Indirect Attacks in Question Answering | Code | 0
X-WebAgentBench: A Multilingual Interactive Web Benchmark for Evaluating Global Agentic System | Code | 0
CP-LLM: Context and Pixel Aware Large Language Model for Video Quality Assessment | - | 0
Web-Shepherd: Advancing PRMs for Reinforcing Web Agents | Code | 2
Page 20 of 244

No leaderboard results yet.