SOTAVerified

Benchmarking

Papers

Showing 3676-3700 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Benchmarking for Metaheuristic Black-Box Optimization: Perspectives and Open Challenges | | 0 |
| GuideBench: Benchmarking Domain-Oriented Guideline Following for LLM Agents | | 0 |
| Towards Personalized Federated Learning | | 0 |
| MolMiner: Towards Controllable, 3D-Aware, Fragment-Based Molecular Design | | 0 |
| Towards Private Learning on Decentralized Graphs with Local Differential Privacy | | 0 |
| MOLTR: Multiple Object Localisation, Tracking, and Reconstruction from Monocular RGB Videos | | 0 |
| Benchmarking for Bayesian Reinforcement Learning | | 0 |
| Towards Productionizing Subjective Search Systems | | 0 |
| Momentum Contrastive Pre-training for Question Answering | | 0 |
| Benchmarking Floworks against OpenAI & Anthropic: A Novel Framework for Enhanced LLM Function Calling | | 0 |
| Benchmarking fixed-length Fingerprint Representations across different Embedding Sizes and Sensor Types | | 0 |
| MorisienMT: A Dataset for Mauritian Creole Machine Translation | | 0 |
| Morphing Attack Detection -- Database, Evaluation Platform and Benchmarking | | 0 |
| MORSE: Semantic-ally Drive-n MORpheme SEgment-er | | 0 |
| MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models | | 0 |
| Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level | | 0 |
| A Dataset for Benchmarking Image-Based Localization | | 0 |
| Movie Description | | 0 |
| MoviePuzzle: Visual Narrative Reasoning through Multimodal Order Learning | | 0 |
| Moving Beyond Downstream Task Accuracy for Information Retrieval Benchmarking | | 0 |
| MozzaVID: Mozzarella Volumetric Image Dataset | | 0 |
| MPCLeague: Robust MPC Platform for Privacy-Preserving Machine Learning | | 0 |
| MRAnnotator: multi-Anatomy and many-Sequence MRI segmentation of 44 structures | | 0 |
| MSAMSum: Towards Benchmarking Multi-lingual Dialogue Summarization | | 0 |
| Towards responsible AI for education: Hybrid human-AI to confront the Elephant in the room | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |