SOTAVerified

Benchmarking

Papers

Showing 1791–1800 of 5548 papers

Title | Status | Hype
IN-Sight: Interactive Navigation through Sight | | 0
High-Quality, ROS Compatible Video Encoding and Decoding for High-Definition Datasets | Code | 0
Benchmarking Multi-dimensional AIGC Video Quality Assessment: A Dataset and Unified Model | Code | 0
KemenkeuGPT: Leveraging a Large Language Model on Indonesia's Government Financial Data and Regulations to Enhance Decision Making | | 0
Efficient Channel Estimation for Millimeter Wave and Terahertz Systems Enabled by Integrated Super-resolution Sensing and Communication | | 0
TaskEval: Assessing Difficulty of Code Generation Tasks for Large Language Models | | 0
GNUMAP: A Parameter-Free Approach to Unsupervised Dimensionality Reduction via Graph Neural Networks | | 0
Benchmarking Histopathology Foundation Models for Ovarian Cancer Bevacizumab Treatment Response Prediction from Whole Slide Images | | 0
Beyond Metrics: A Critical Analysis of the Variability in Large Language Model Evaluation Frameworks | | 0
Anomalous State Sequence Modeling to Enhance Safety in Reinforcement Learning | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified