SOTAVerified

Benchmarking

Papers

Showing 3851-3875 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| RISEdb: a Novel Indoor Localization Dataset | | 0 |
| Risk Aware Benchmarking of Large Language Models | | 0 |
| Risk-Neutral Generative Networks | | 0 |
| RL2Grid: Benchmarking Reinforcement Learning in Power Grid Operations | | 0 |
| RL-Based Method for Benchmarking the Adversarial Resilience and Robustness of Deep Reinforcement Learning Policies | | 0 |
| RNAmountAlign: efficient software for local, global, semiglobal pairwise and multiple RNA sequence/structure alignment | | 0 |
| A Comprehensive Guide to CAN IDS Data & Introduction of the ROAD Dataset | | 0 |
| ROBBIE: Robust Bias Evaluation of Large Generative Language Models | | 0 |
| OOD-CV: A Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images | | 0 |
| Robust 2D/3D Vehicle Parsing in CVIS | | 0 |
| Robust measurement of innovation performances in Europe with a hierarchy of interacting composite indicators | | 0 |
| Robust Medical Instrument Segmentation Challenge 2019 | | 0 |
| RobustMQ: Benchmarking Robustness of Quantized Models | | 0 |
| Robustness of Reinforcement Learning-Based Traffic Signal Control under Incidents: A Comparative Study | | 0 |
| Robust Salient Object Detection on Compressed Images Using Convolutional Neural Networks | | 0 |
| RobustSpring: Benchmarking Robustness to Image Corruptions for Optical Flow, Scene Flow and Stereo | | 0 |
| Robust Vision Challenge 2020 -- 1st Place Report for Panoptic Segmentation | | 0 |
| RP1M: A Large-Scale Motion Dataset for Piano Playing with Bi-Manual Dexterous Robot Hands | | 0 |
| RRSIS: Referring Remote Sensing Image Segmentation | | 0 |
| RT-Pose: A 4D Radar Tensor-based 3D Human Pose Estimation and Localization Benchmark | | 0 |
| Rule-based Data Selection for Large Language Models | | 0 |
| RxRx3-core: Benchmarking drug-target interactions in High-Content Microscopy | | 0 |
| Sadeed: Advancing Arabic Diacritization Through Small Language Model | | 0 |
| Safe Load Balancing in Software-Defined-Networking | | 0 |
| SAIBench: A Structural Interpretation of AI for Science Through Benchmarks | | 0 |
Page 155 of 222

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |