SOTAVerified

Benchmarking

Papers

Showing 3526–3550 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| N-Shot Benchmarking of Whisper on Diverse Arabic Speech Recognition | | 0 |
| NTP : A Neural Network Topology Profiler | | 0 |
| Numerical Investigation of Sequence Modeling Theory using Controllable Memory Functions | | 0 |
| Human Behavioral Benchmarking: Numeric Magnitude Comparison Effects in Large Language Models | | 0 |
| NUMOSIM: A Synthetic Mobility Dataset with Anomaly Detection Benchmarks | | 0 |
| NuwaTS: a Foundation Model Mending Every Incomplete Time Series | | 0 |
| Object Detection based on LIDAR Temporal Pulses using Spiking Neural Networks | | 0 |
| OctoPath: An OcTree Based Self-Supervised Learning Approach to Local Trajectory Planning for Mobile Robots | | 0 |
| OCTrack: Benchmarking the Open-Corpus Multi-Object Tracking | | 0 |
| Official-NV: An LLM-Generated News Video Dataset for Multimodal Fake News Detection | | 0 |
| Off-policy Evaluation for Payments at Adyen | | 0 |
| OIBench: Benchmarking Strong Reasoning Models with Olympiad in Informatics | | 0 |
| Omni6DPose: A Benchmark and Model for Universal 6D Object Pose Estimation and Tracking | | 0 |
| Omnibenchmark (alpha) for continuous and open benchmarking in bioinformatics | | 0 |
| OmniEvalKit: A Modular, Lightweight Toolbox for Evaluating Large Language Model and its Omni-Extensions | | 0 |
| OmniPose6D: Towards Short-Term Object Pose Tracking in Dynamic Scenes from Monocular RGB | | 0 |
| On Benchmarking Code LLMs for Android Malware Analysis | | 0 |
| On Benchmarking Iris Recognition within a Head-mounted Display for AR/VR Application | | 0 |
| On Continual Model Refinement in Out-of-Distribution Data Streams | | 0 |
| On-Device Self-Supervised Learning of Low-Latency Monocular Depth from Only Events | | 0 |
| On Distribution Grid Optimal Power Flow Development and Integration | | 0 |
| ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities | | 0 |
| One Label, One Billion Faces: Usage and Consistency of Racial Categories in Computer Vision | | 0 |
| One of these (Few) Things is Not Like the Others | | 0 |
| One-Shot Federated Learning with Classifier-Free Diffusion Models | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |