SOTAVerified

Benchmarking

Papers

Showing 3826-3850 of 5548 papers

Title | Status | Hype
Revisiting Hate Speech Benchmarks: From Data Curation to System Deployment | Code | 0
Speech Self-Supervised Representation Benchmarking: Are We Doing it Right? | Code | 0
HySpecNet-11k: A Large-Scale Hyperspectral Dataset for Benchmarking Learning-Based Hyperspectral Image Compression Methods | | 0
The ObjectFolder Benchmark: Multisensory Learning with Neural and Real Objects | | 0
Dynamic Neighborhood Construction for Structured Large Discrete Action Spaces | Code | 0
ScoNe: Benchmarking Negation Reasoning in Language Models With Fine-Tuning and In-Context Learning | Code | 0
ShuffleMix: Improving Representations via Channel-Wise Shuffle of Interpolated Hidden States | Code | 0
Design and implementation of intelligent packet filtering in IoT microcontroller-based devices | Code | 0
Large-scale Ridesharing DARP Instances Based on Real Travel Demand | Code | 0
Human Body Shape Classification Based on a Single Image | | 0
InDL: A New Dataset and Benchmark for In-Diagram Logic Interpretation based on Visual Illusion | Code | 0
Exploring the Practicality of Generative Retrieval on Dynamic Corpora | | 0
BASED: Benchmarking, Analysis, and Structural Estimation of Deblurring | Code | 0
Benchmarking Diverse-Modal Entity Linking with Generative Models | | 0
Learning from Integral Losses in Physics Informed Neural Networks | Code | 0
Benchmarking state-of-the-art gradient boosting algorithms for classification | | 0
CSS: A Large-scale Cross-schema Chinese Text-to-SQL Medical Dataset | Code | 0
Investigation of UAV Detection in Images with Complex Backgrounds and Rainy Artifacts | Code | 0
Analysis of modular CMA-ES on strict box-constrained problems in the SBOX-COST benchmarking suite | | 0
GPT4Graph: Can Large Language Models Understand Graph Structured Data ? An Empirical Evaluation and Benchmarking | Code | 0
BUFFET: Benchmarking Large Language Models for Few-shot Cross-lingual Transfer | | 0
LAraBench: Benchmarking Arabic AI with Large Language Models | | 0
Barkour: Benchmarking Animal-level Agility with Quadruped Robots | | 0
R2H: Building Multimodal Navigation Helpers that Respond to Help Requests | | 0
When the Music Stops: Tip-of-the-Tongue Retrieval for Music | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified