Large Language Model

Papers

Showing 281–290 of 6097 papers

Title	Date	Tasks	Status	Hype
CyberGym: Evaluating AI Agents' Cybersecurity Capabilities with Real-World Vulnerabilities at Scale	Jun 3, 2025	Large Language Model	Code Available	2
TestAgent: An Adaptive and Intelligent Expert for Human Assessment	Jun 3, 2025	Large Language Model, Question Selection	Unverified	0
TalkingMachines: Real-Time Audio-Driven FaceTime-Style Video via Autoregressive Diffusion Models	Jun 3, 2025	Decoder, Knowledge Distillation	Unverified	0
LAM SIMULATOR: Advancing Data Generation for Large Action Model Training via Online Exploration and Trajectory Feedback	Jun 2, 2025	Large Language Model	Unverified	0
PGPO: Enhancing Agent Reasoning via Pseudocode-style Planning Guided Preference Optimization	Jun 2, 2025	Language Modeling, Language Modelling	Unverified	0
Hybrid AI for Responsive Multi-Turn Online Conversations with Novel Dynamic Routing and Feedback Adaptation	Jun 2, 2025	Language Modeling, Language Modelling	Unverified	0
Why Gradients Rapidly Increase Near the End of Training	Jun 2, 2025	Language Modeling, Language Modelling	Unverified	0
WebChoreArena: Evaluating Web Browsing Agents on Realistic Tedious Web Tasks	Jun 2, 2025	Large Language Model, Mathematical Reasoning	Unverified	0
PointT2I: LLM-based text-to-image generation via keypoints	Jun 2, 2025	Image Generation, Large Language Model	Unverified	0
ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding	Jun 2, 2025	3D Generation, Large Language Model	Code Available	4

No leaderboard results yet.