SOTAVerified

Benchmarking

Papers

Showing 5026–5050 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Benchmarking multi-component signal processing methods in the time-frequency plane | Code | 0 |
| Efficiently solving the thief orienteering problem with a max-min ant colony optimization approach | Code | 0 |
| A Comparative Analysis of Word-Level Metric Differential Privacy: Benchmarking The Privacy-Utility Trade-off | Code | 0 |
| Benchmarking MOEAs for solving continuous multi-objective RL problems | Code | 0 |
| NoiseBench: Benchmarking the Impact of Real Label Noise on Named Entity Recognition | Code | 0 |
| Benchmarking Model-Based Reinforcement Learning | Code | 0 |
| Benchmarking Misuse Mitigation Against Covert Adversaries | Code | 0 |
| To Find Waldo You Need Contextual Cues: Debiasing Who's Waldo | Code | 0 |
| Noisy Ostracods: A Fine-Grained, Imbalanced Real-World Dataset for Benchmarking Robust Machine Learning and Label Correction Methods | Code | 0 |
| No Metric to Rule Them All: Toward Principled Evaluations of Graph-Learning Datasets | Code | 0 |
| To Find Waldo You Need Contextual Cues: Debiasing Who’s Waldo | Code | 0 |
| AstroVision: Towards Autonomous Feature Detection and Description for Missions to Small Bodies Using Deep Learning | Code | 0 |
| AKFruitYield: Modular benchmarking and video analysis software for Azure Kinect cameras for fruit size and fruit yield estimation in apple orchards | Code | 0 |
| ShuffleMix: Improving Representations via Channel-Wise Shuffle of Interpolated Hidden States | Code | 0 |
| NorEval: A Norwegian Language Understanding and Generation Evaluation Benchmark | Code | 0 |
| A Stepwise, Label-based Approach for Improving the Adversarial Training in Unsupervised Video Summarization | Code | 0 |
| Assigning Species Information to Corresponding Genes by a Sequence Labeling Framework | Code | 0 |
| ASR Benchmarking: Need for a More Representative Conversational Dataset | Code | 0 |
| Benchmarking missing-values approaches for predictive models on health databases | Code | 0 |
| SignalGP-Lite: Event Driven Genetic Programming Library for Large-Scale Artificial Life Applications | Code | 0 |
| Benchmarking Minimax Linkage | Code | 0 |
| Efficient and Effective Model Extraction | Code | 0 |
| Signing Outside the Studio: Benchmarking Background Robustness for Continuous Sign Language Recognition | Code | 0 |
| signSGD with Majority Vote is Communication Efficient And Fault Tolerant | Code | 0 |
| To Model or to Intervene: A Comparison of Counterfactual and Online Learning to Rank from User Interactions | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |