SOTAVerified

Benchmarking

Papers

Showing 2526-2550 of 5548 papers

Title | Status | Hype
Benchmarking Robustness and Generalization in Multi-Agent Systems: A Case Study on Neural MMO |  | 0
Benchmarking Robot Manipulation with the Rubik's Cube |  | 0
A Comprehensive Multi-Illuminant Dataset for Benchmarking of the Intrinsic Image Algorithms |  | 0
Benchmarking Retrieval-Augmented Large Language Models in Biomedical NLP: Application, Robustness, and Self-Awareness |  | 0
A Systematic Analysis of Hybrid Linear Attention |  | 0
Benchmarking Retrieval-Augmented Generation for Chemistry |  | 0
Self-Aligning Depth-regularized Radiance Fields for Asynchronous RGB-D Sequences |  | 0
Airport Capacity and Performance in Europe -- A study of transport economics, service quality and sustainability |  | 0
Benchmarking Resource Usage for Efficient Distributed Deep Learning |  | 0
Goal-Driven Sequential Data Abstraction |  | 0
A Survey on Vision Autoregressive Model |  | 0
A Survey on Temporal Sentence Grounding in Videos |  | 0
Benchmarking Reinforcement Learning Methods for Dexterous Robotic Manipulation with a Three-Fingered Gripper |  | 0
4Seasons: Benchmarking Visual SLAM and Long-Term Localization for Autonomous Driving in Challenging Conditions |  | 0
Domain Adaptation with Joint Learning for Generic, Optical Car Part Recognition and Detection Systems (Go-CaRD) |  | 0
GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models |  | 0
Graph Alignment for Benchmarking Graph Neural Networks and Learning Positional Encodings |  | 0
Greening AI-enabled Systems with Software Engineering: A Research Agenda for Environmentally Sustainable AI Practices |  | 0
Helsinki Deblur Challenge 2021: description of photographic data |  | 0
A Survey on Semi-Supervised Learning for Delayed Partially Labelled Data Streams |  | 0
A Survey on Preserving Fairness Guarantees in Changing Environments |  | 0
Benchmarking Reasoning Robustness in Large Language Models |  | 0
Benchmarking real-time monitoring strategies for ethanol production from lignocellulosic biomass |  | 0
Global Wheat Head Dataset 2021: more diversity to improve the benchmarking of wheat head localization methods |  | 0
Feasibility of BERT Embeddings For Domain-Specific Knowledge Mining |  | 0
Page 102 of 222

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 |  | Unverified