SOTAVerified|Agents Browse Leaderboard About Blog

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 2821–2830 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
1st Place Winner of the 2024 Pixel-level Video Understanding in the Wild (CVPR'24 PVUW) Challenge in Video Panoptic Segmentation and Best Long Video Consistency of Video Semantic Segmentation	Jun 8, 2024	BenchmarkingInstance Segmentation	—Unverified	0	0
Better Bill GPT: Comparing Large Language Models against Legal Invoice Reviewers	Apr 2, 2025	BenchmarkingManagement	—Unverified	0	0
BestServe: Serving Strategies with Optimal Goodput in Collocation and Disaggregation Architectures	Jun 6, 2025	BenchmarkingCPU	—Unverified	0	0
The CLC-UKET Dataset: Benchmarking Case Outcome Prediction for the UK Employment Tribunal	Sep 12, 2024	BenchmarkingLanguage Modeling	—Unverified	0	0
Best Practices in Pool-based Active Learning for Image Classification	Sep 29, 2021	Active LearningBenchmarking	—Unverified	0	0
Abasy Atlas v2.2: The most comprehensive and up-to-date inventory of meta-curated, historical, bacterial regulatory networks, their completeness and system-level characterization	May 5, 2020	Benchmarking	—Unverified	0	0
The Convergent Ethics of AI? Analyzing Moral Foundation Priorities in Large Language Models with a Multi-Framework Approach	Apr 27, 2025	BenchmarkingDecision Making	—Unverified	0	0
BERT-GT: Cross-sentence n-ary relation extraction with BERT and Graph Transformer	Jan 11, 2021	BenchmarkingBinary Relation Extraction	—Unverified	0	0
Greening AI-enabled Systems with Software Engineering: A Research Agenda for Environmentally Sustainable AI Practices	Jun 2, 2025	Benchmarking	—Unverified	0	0
Grid Search Hyperparameter Benchmarking of BERT, ALBERT, and LongFormer on DuoRC	Jan 15, 2021	BenchmarkingLanguage Modeling	—Unverified	0	0

Show:10 25 50

← PrevPage 283 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified