SOTAVerified

Benchmarking

Papers

Showing 4001-4050 of 5548 papers

Title | Status | Hype
OpenFly: A Comprehensive Platform for Aerial Vision-Language Navigation | | 0
Open foundation models for Azerbaijani language | | 0
Benchmarking and Evaluation of AI Models in Biology: Outcomes and Recommendations from the CZI Virtual Cells Workshop | | 0
Benchmarking and Error Diagnosis in Multi-Instance Pose Estimation | | 0
Open Ko-LLM Leaderboard2: Bridging Foundational and Practical Evaluation for Korean LLMs | | 0
Benchmarking and Enhancing Surgical Phase Recognition Models for Robotic-Assisted Esophagectomy | | 0
Open Llama2 Model for the Lithuanian Language | | 0
OpenMixup: Open Mixup Toolbox and Benchmark for Visual Representation Learning | | 0
Treatment Learning Causal Transformer for Noisy Image Classification | | 0
Benchmarking and Enhancing Disentanglement in Concept-Residual Models | | 0
Benchmarking and Comparing Multi-exposure Image Fusion Algorithms | | 0
Tree Instance Segmentation With Temporal Contour Graph | | 0
Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT | | 0
Benchmarking and Boosting Radiology Report Generation for 3D High-Resolution Medical Images | | 0
Benchmarking and Analyzing In-context Learning, Fine-tuning and Supervised Learning for Biomedical Knowledge Curation: a focused study on chemical entities of biological interest | | 0
Open-set object detection: towards unified problem formulation and benchmarking | | 0
OpenSiteRec: An Open Dataset for Site Recommendation | | 0
Open-Source Manually Annotated Vocal Tract Database for Automatic Segmentation from 3D MRI Using Deep Learning: Benchmarking 2D and 3D Convolutional and Transformer Networks | | 0
Benchmarking and Analyzing Generative Data for Visual Recognition | | 0
Open the box of digital neuromorphic processor: Towards effective algorithm-hardware co-design | | 0
Benchmarking a (μ+λ) Genetic Algorithm with Configurable Crossover Probability | | 0
Benchmarking AlphaFold3's protein-protein complex accuracy and machine learning prediction reliability for binding free energy changes upon mutation | | 0
Benchmarking Algorithms from Machine Learning for Low-Budget Black-Box Optimization | | 0
Benchmarking Algorithms for Automatic License Plate Recognition | | 0
Scale MLPerf-0.6 models on Google TPU-v3 Pods | | 0
Benchmarking Algorithmic Bias in Face Recognition: An Experimental Approach Using Synthetic Faces and Human Evaluation | | 0
Opposition based Ensemble Micro Differential Evolution | | 0
Trial-Based Dominance Enables Non-Parametric Tests to Compare both the Speed and Accuracy of Stochastic Optimizers | | 0
Optimal Eco-driving Control of Autonomous and Electric Trucks in Adaptation to Highway Topography: Energy Minimization and Battery Life Extension | | 0
Optimally-Weighted Maximum Mean Discrepancy Framework for Continual Learning | | 0
Optimal PMU Placement for Kalman Filtering of DAE Power System Models | | 0
Optimal Scheduling of Anticipated COVID-19 Vaccination: A Case Study of New York State | | 0
Optimization of Genomic Classifiers for Clinical Deployment: Evaluation of Bayesian Optimization to Select Predictive Models of Acute Infection and In-Hospital Mortality | | 0
Optimization Techniques for a Physical Model of Human Vocalisation | | 0
Optimizing open-domain question answering with graph-based retrieval augmented generation | | 0
Benchmarking air-conditioning energy performance of residential rooms based on regression and clustering techniques | | 0
Optimizing Recommendations using Fine-Tuned LLMs | | 0
OPTION: OPTImization Algorithm Benchmarking ONtology | | 0
OPTION: OPTImization Algorithm Benchmarking ONtology | | 0
Benchmarking AI Models in Software Engineering: A Review, Search Tool, and Enhancement Protocol | | 0
Benchmarking Agility and Reconfigurability in Satellite Systems for Tropical Cyclone Monitoring | | 0
Trident: Efficient 4PC Framework for Privacy Preserving Machine Learning | | 0
When Reasoning Meets Compression: Benchmarking Compressed Large Reasoning Models on Complex Reasoning Tasks | | 0
TriSAM: Tri-Plane SAM for zero-shot cortical blood vessel segmentation in VEM images | | 0
OReole-FM: successes and challenges toward billion-parameter foundation models for high-resolution satellite imagery | | 0
Organ-aware Multi-scale Medical Image Segmentation Using Text Prompt Engineering | | 0
Benchmarking Aggression Identification in Social Media | | 0
Orthogonal Deep Features Decomposition for Age-Invariant Face Recognition | | 0
A critical look at the current train/test split in machine learning | | 0
Benchmarking a foundation LLM on its ability to re-label structure names in accordance with the AAPM TG-263 report | | 0
Page 81 of 111

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified