SOTAVerified

Benchmarking

Papers

Showing 3176–3200 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| The Role of Model Architecture and Scale in Predicting Molecular Properties: Insights from Fine-Tuning RoBERTa, BART, and LLaMA | Code | 0 |
| Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting | | 0 |
| Evaluating Deep Clustering Algorithms on Non-Categorical 3D CAD Models | | 0 |
| On the Impact of Data Heterogeneity in Federated Learning Environments with Application to Healthcare Networks | | 0 |
| MileBench: Benchmarking MLLMs in Long Context | | 0 |
| Detecting critical treatment effect bias in small subgroups | Code | 0 |
| Leak Proof CMap; a framework for training and evaluation of cell line agnostic L1000 similarity methods | Code | 0 |
| Efficient Exploration of Image Classifier Failures with Bayesian Optimization and Text-to-Image Models | | 0 |
| Stochastic Spiking Neural Networks with First-to-Spike Coding | | 0 |
| CriSp: Leveraging Tread Depth Maps for Enhanced Crime-Scene Shoeprint Matching | Code | 0 |
| Benchmarking Mobile Device Control Agents across Diverse Configurations | | 0 |
| DPO: A Differential and Pointwise Control Approach to Reinforcement Learning | | 0 |
| ApisTox: a new benchmark dataset for the classification of small molecules toxicity on honey bees | Code | 0 |
| Empirical Analysis of the Dynamic Binary Value Problem with IOHprofiler | | 0 |
| Importance of Disjoint Sampling in Conventional and Transformer Models for Hyperspectral Image Classification | Code | 0 |
| The Adversarial AI-Art: Understanding, Generation, Detection, and Benchmarking | | 0 |
| Benchmarking Advanced Text Anonymisation Methods: A Comparative Study on Novel and Traditional Approaches | | 0 |
| Open Datasets for Satellite Radio Resource Control | | 0 |
| TeamTrack: A Dataset for Multi-Sport Multi-Object Tracking in Full-pitch Videos | | 0 |
| EnzChemRED, a rich enzyme chemistry relation extraction dataset | | 0 |
| In-situ process monitoring and adaptive quality enhancement in laser additive manufacturing: a critical review | | 0 |
| Authentic Emotion Mapping: Benchmarking Facial Expressions in Real News | Code | 0 |
| Bridging the Gap Between Theory and Practice: Benchmarking Transfer Evolutionary Optimization | | 0 |
| Look Before You Decide: Prompting Active Deduction of MLLMs for Assumptive Reasoning | | 0 |
| Integrated Sensing and Communication enabled Multiple Base Stations Cooperative UAV Detection | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |