SOTAVerified

Benchmarking

Papers

Showing 3051–3075 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Evaluating Music Recommender Systems for Groups | | 0 |
| Evaluating Nuanced Bias in Large Language Model Free Response Answers | | 0 |
| Evaluating Robustness of LLMs on Crisis-Related Microblogs across Events, Information Types, and Linguistic Features | | 0 |
| Evaluating Robustness of Visual Representations for Object Assembly Task Requiring Spatio-Geometrical Reasoning | | 0 |
| Evaluating Text-to-Image Synthesis with a Conditional Fréchet Distance | | 0 |
| Evaluating the Generation of Spatial Relations in Text and Image Generative Models | | 0 |
| Evaluating the Performance of Large Language Models via Debates | | 0 |
| Evaluating Visual Conversational Agents via Cooperative Human-AI Games | | 0 |
| Evaluation and Ensembling of Methods for Reverse Engineering of Brain Connectivity from Imaging Data | | 0 |
| Evaluation Methodology for Attacks Against Confidence Thresholding Models | | 0 |
| Evaluation Methods and Measures for Causal Learning Algorithms | | 0 |
| Evaluation of Algorithms for Multi-Modality Whole Heart Segmentation: An Open-Access Grand Challenge | | 0 |
| Evaluation of Architectural Synthesis Using Generative AI | | 0 |
| Evaluation of Human-AI Teams for Learned and Rule-Based Agents in Hanabi | | 0 |
| Evaluation of Popular XAI Applied to Clinical Prediction Models: Can They be Trusted? | | 0 |
| Evaluation of simulation methods for tumor subclonal reconstruction | | 0 |
| Evaluation of Three Welsh Language POS Taggers | | 0 |
| EvAnimate: Event-conditioned Image-to-Video Generation for Human Animation | | 0 |
| EventAid: Benchmarking Event-aided Image/Video Enhancement Algorithms with Real-captured Hybrid Dataset | | 0 |
| Event-based Continuous Color Video Decompression from Single Frames | | 0 |
| Event-based Feature Extraction Using Adaptive Selection Thresholds | | 0 |
| Event Camera Simulator Design for Modeling Attention-based Inference Architectures | | 0 |
| Eventprop training for efficient neuromorphic applications | | 0 |
| EvEntS ReaLM: Event Reasoning of Entity States via Language Models | | 0 |
| Evetac: An Event-based Optical Tactile Sensor for Robotic Manipulation | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |