
Large Language Model

Papers


Showing 161–170 of 6097 papers

Title	Date	Tasks	Status	Hype
The Behavior Gap: Evaluating Zero-shot LLM Agents in Complex Task-Oriented Dialogs	Jun 13, 2025	Large Language Model	Unverified	0
SEC-bench: Automated Benchmarking of LLM Agents on Real-World Software Security Tasks	Jun 13, 2025	Benchmarking, Large Language Model	Code Available	2
Investigating the Potential of Large Language Model-Based Router Multi-Agent Architectures for Foundation Design Automation: A Task Classification and Expert Selection Study	Jun 13, 2025	Language Modeling, Language Modelling	Unverified	0
Intelligent Automation for FDI Facilitation: Optimizing Tariff Exemption Processes with OCR And Large Language Models	Jun 12, 2025	Large Language Model, Optical Character Recognition	Unverified	0
LLM-as-a-Fuzzy-Judge: Fine-Tuning Large Language Models as a Clinical Evaluation Judge with Fuzzy Logic	Jun 12, 2025	Large Language Model, Prompt Engineering	Code Available	0
Nowcasting the euro area with social media data	Jun 12, 2025	Language Modeling, Language Modelling	Unverified	0
MNN-LLM: A Generic Inference Engine for Fast Large Language Model Deployment on Mobile Devices	Jun 12, 2025	CPU, GPU	Unverified	0
Grounded Vision-Language Navigation for UAVs with Open-Vocabulary Goal Understanding	Jun 12, 2025	Language Modeling, Language Modelling	Unverified	0
Mirage-1: Augmenting and Updating GUI Agent with Hierarchical Multimodal Skills	Jun 12, 2025	Large Language Model, Task Planning	Unverified	0
Unsourced Adversarial CAPTCHA: A Bi-Phase Adversarial CAPTCHA Framework	Jun 12, 2025	Adversarial Attack, Diversity	Unverified	0


