SOTAVerified

Benchmarking

Papers

Showing 3431-3440 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Movie Description | | 0 |
| MoviePuzzle: Visual Narrative Reasoning through Multimodal Order Learning | | 0 |
| Moving Beyond Downstream Task Accuracy for Information Retrieval Benchmarking | | 0 |
| MozzaVID: Mozzarella Volumetric Image Dataset | | 0 |
| MPCLeague: Robust MPC Platform for Privacy-Preserving Machine Learning | | 0 |
| MRAnnotator: multi-Anatomy and many-Sequence MRI segmentation of 44 structures | | 0 |
| MSAMSum: Towards Benchmarking Multi-lingual Dialogue Summarization | | 0 |
| MSC-Bench: Benchmarking and Analyzing Multi-Sensor Corruption for Driving Perception | | 0 |
| MS MARCO: Benchmarking Ranking Models in the Large-Data Regime | | 0 |
| MSQA: Benchmarking LLMs on Graduate-Level Materials Science Reasoning and Knowledge | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |