SOTAVerified

Benchmarking

Papers

Showing 2776–2800 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| A Hong Kong Sign Language Corpus Collected from Sign-interpreted TV News | | 0 |
| GiCCS: A German in-Context Conversational Similarity Benchmark | | 0 |
| GIMMICK -- Globally Inclusive Multimodal Multitask Cultural Knowledge Benchmarking | | 0 |
| GIQ: Benchmarking 3D Geometric Reasoning of Vision Foundation Models with Simulated and Real Polyhedra | | 0 |
| Beyond Single-Model Views for Deep Learning: Optimization versus Generalizability of Stochastic Optimization Algorithms | | 0 |
| Beyond Self-Talk: A Communication-Centric Survey of LLM-Based Multi-Agent Systems | | 0 |
| The Benchmark Lottery | | 0 |
| Global Rice Multi-Class Segmentation Dataset (RiceSEG): A Comprehensive and Diverse High-Resolution RGB-Annotated Images for the Development and Benchmarking of Rice Segmentation Algorithms | | 0 |
| Global Wheat Head Dataset 2021: more diversity to improve the benchmarking of wheat head localization methods | | 0 |
| Beyond Monocular Deraining: Stereo Image Deraining via Semantic Understanding | | 0 |
| GLOVER++: Unleashing the Potential of Affordance Learning from Human Behaviors for Robotic Manipulation | | 0 |
| GNNBENCH: Fair and Productive Benchmarking for Single-GPU GNN System | | 0 |
| A Benchmark for Multi-speaker Anonymization | | 0 |
| Beyond Monocular Deraining: Parallel Stereo Deraining Network Via Semantic Prior | | 0 |
| Beyond Metrics: A Critical Analysis of the Variability in Large Language Model Evaluation Frameworks | | 0 |
| GNUMAP: A Parameter-Free Approach to Unsupervised Dimensionality Reduction via Graph Neural Networks | | 0 |
| Goal-Driven Sequential Data Abstraction | | 0 |
| A Holistic Framework Towards Vision-based Traffic Signal Control with Microscopic Simulation | | 0 |
| Domain Adaptation with Joint Learning for Generic, Optical Car Part Recognition and Detection Systems (Go-CaRD) | | 0 |
| Beyond Emotion: A Multi-Modal Dataset for Human Desire Understanding | | 0 |
| The Brain Tumor Segmentation (BraTS-METS) Challenge 2023: Brain Metastasis Segmentation on Pre-treatment MRI | | 0 |
| GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models | | 0 |
| GreenPCO: An Unsupervised Lightweight Point Cloud Odometry Method | | 0 |
| Ahead-of-Time P-Tuning | | 0 |
| Beyond Emotion: A Multi-Modal Dataset for Human Desire Understanding | | 0 |
Page 112 of 222

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |