SOTAVerified

Benchmarking

Papers

Showing 1326–1350 of 5548 papers

Title | Status | Hype
WelQrate: Defining the Gold Standard in Small Molecule Drug Discovery Benchmarking | | 0
A survey of probabilistic generative frameworks for molecular simulations | Code | 0
Caravan MultiMet: Extending Caravan with Multiple Weather Nowcasts and Forecasts | Code | 3
BEARD: Benchmarking the Adversarial Robustness for Dataset Distillation | Code | 0
Anomaly Detection in Large-Scale Cloud Systems: An Industry Case and Dataset | Code | 0
A Survey on Vision Autoregressive Model | | 0
HyperFace: Generating Synthetic Face Recognition Datasets by Exploring Face Embedding Hypersphere | | 0
FM-TS: Flow Matching for Time Series Generation | Code | 1
Evaluating the Generation of Spatial Relations in Text and Image Generative Models | | 0
Retrieval or Global Context Understanding? On Many-Shot In-Context Learning for Long-Context Evaluation | Code | 0
BuckTales : A multi-UAV dataset for multi-object tracking and re-identification of wild antelopes | | 0
General Geospatial Inference with a Population Dynamics Foundation Model | Code | 3
Benchmarking LLMs' Judgments with No Gold Standard | Code | 0
Arctique: An artificial histopathological dataset unifying realism and controllability for uncertainty quantification | Code | 1
MolMiner: Towards Controllable, 3D-Aware, Fragment-Based Molecular Design | | 0
Low Dynamic Range for RIS-aided Bistatic Integrated Sensing and Communication | | 0
Benchmarking 3D multi-coil NC-PDNet MRI reconstruction | | 0
FactLens: Benchmarking Fine-Grained Fact Verification | | 0
Open-set object detection: towards unified problem formulation and benchmarking | | 0
Benchmarking Distributional Alignment of Large Language Models | Code | 0
A Retrospective on the Robot Air Hockey Challenge: Benchmarking Robust, Reliable, and Safe Learning Techniques for Real-world Robotics | | 0
ProverbEval: Exploring LLM Evaluation Challenges for Low-resource Language Understanding | | 0
Performance-Guided LLM Knowledge Distillation for Efficient Text Classification at Scale | | 0
Deep Learning Models for UAV-Assisted Bridge Inspection: A YOLO Benchmark Analysis | | 0
HandCraft: Anatomically Correct Restoration of Malformed Hands in Diffusion Generated Images | | 0
Page 54 of 222

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified