SOTAVerified

Benchmarking

Papers

Showing 2171–2180 of 5548 papers

Title | Status | Hype
Full-stack evaluation of Machine Learning inference workloads for RISC-V systems | | 0
Benchmarking the Performance of Pre-trained LLMs across Urdu NLP Tasks | | 0
MCDFN: Supply Chain Demand Forecasting via an Explainable Multi-Channel Data Fusion Network Model | | 0
Analog or Digital In-memory Computing? Benchmarking through Quantitative Modeling | Code | 1
S-Eval: Towards Automated and Comprehensive Safety Evaluation for Large Language Models | Code | 2
An Empirical Study of Training State-of-the-Art LiDAR Segmentation Models | | 0
AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents | Code | 4
GCondenser: Benchmarking Graph Condensation | Code | 1
A Gap in Time: The Challenge of Processing Heterogeneous IoT Data in Digitalized Buildings | | 0
CrossCheckGPT: Universal Hallucination Ranking for Multimodal Foundation Models | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified