2k

Papers

Showing 11–20 of 288 papers

Title	Date	Tasks	Status	Hype
Twin-2K-500: A dataset for building digital twins of over 2,000 people based on their answers to over 500 questions	May 23, 2025	2k, Benchmarking	Code Available	1
PIIvot: A Lightweight NLP Anonymization Framework for Question-Anchored Tutoring Dialogues	May 22, 2025	2k	Unverified	0
Unlocking the Potential of Difficulty Prior in RL-based Multimodal Reasoning	May 19, 2025	2k, Mathematical Reasoning	Unverified	0
UIShift: Enhancing VLM-based GUI Agents through Self-supervised Reinforcement Learning	May 18, 2025	2k, Reinforcement Learning (RL)	Unverified	0
ViMRHP: A Vietnamese Benchmark Dataset for Multimodal Review Helpfulness Prediction via Human-AI Collaborative Annotation	May 12, 2025	2k, Recommendation Systems	Code Available	0
Calibrating Translation Decoding with Quality Estimation on LLMs	Apr 26, 2025	2k, Machine Translation	Code Available	0
aiXamine: Simplified LLM Safety and Security	Apr 21, 2025	2k, Adversarial Robustness	Unverified	0
Turbo2K: Towards Ultra-Efficient and High-Quality 2K Video Synthesis	Apr 20, 2025	2k, Knowledge Distillation	Unverified	0
Rethinking the Generation of High-Quality CoT Data from the Perspective of LLM-Adaptive Question Difficulty Grading	Apr 16, 2025	2k, Code Generation	Unverified	0
On Linear Representations and Pretraining Data Frequency in Language Models	Apr 16, 2025	2k, In-Context Learning	Unverified	0

No leaderboard results yet.