SOTAVerified

Benchmarking

Papers

Showing 1151–1200 of 5548 papers

Title | Status | Hype
Deep Learning-Based Synchronization for Uplink NB-IoT | Code | 1
Oracle-MNIST: a Realistic Image Dataset for Benchmarking Machine Learning Algorithms | Code | 1
The VoicePrivacy 2020 Challenge Evaluation Plan | Code | 1
Federated Learning Under Intermittent Client Availability and Time-Varying Communication Constraints | Code | 1
Clinical Prompt Learning with Frozen Language Models | Code | 1
Structured, flexible, and robust: benchmarking and improving large language models towards more human-like behavior in out-of-distribution reasoning tasks | Code | 1
BiCo-Net: Regress Globally, Match Locally for Robust 6D Pose Estimation | Code | 1
GenISP: Neural ISP for Low-Light Machine Cognition | Code | 1
Benchmarking Econometric and Machine Learning Methodologies in Nowcasting | Code | 1
Creating a Forensic Database of Shoeprints from Online Shoe Tread Photos | Code | 1
Continual Learning with Foundation Models: An Empirical Study of Latent Replay | Code | 1
A global analysis of metrics used for measuring performance in natural language processing | Code | 1
NICO++: Towards Better Benchmarking for Domain Generalization | Code | 1
Stress-Testing Point Cloud Registration on Automotive LiDAR | Code | 1
Deep learning model solves change point detection for multiple change types | Code | 1
Do You Really Mean That? Content Driven Audio-Visual Deepfake Dataset and Multimodal Method for Temporal Forgery Localization | Code | 1
Data Splits and Metrics for Method Benchmarking on Surgical Action Triplet Datasets | Code | 1
BioRED: A Rich Biomedical Relation Extraction Dataset | Code | 1
The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems | Code | 1
Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks | Code | 1
Coarse-to-Fine Q-attention with Learned Path Ranking | Code | 1
Parameter-efficient Model Adaptation for Vision Transformers | Code | 1
Earnings-22: A Practical Benchmark for Accents in the Wild | Code | 1
Visual Abductive Reasoning | Code | 1
Fantastic Questions and Where to Find Them: FairytaleQA -- An Authentic Dataset for Narrative Comprehension | Code | 1
Benchmarking Visual Localization for Autonomous Navigation | Code | 1
minicons: Enabling Flexible Behavioral and Representational Analyses of Transformer Language Models | Code | 1
Sionna: An Open-Source Library for Next-Generation Physical Layer Research | Code | 1
SHEL5K: An Extended Dataset and Benchmarking for Safety Helmet Detection | Code | 1
ROOD-MRI: Benchmarking the robustness of deep learning segmentation models to out-of-distribution and corrupted data in MRI | Code | 1
ClearPose: Large-scale Transparent Object Dataset and Benchmark | Code | 1
Quasi-Balanced Self-Training on Noise-Aware Synthesis of Object Point Clouds for Closing Domain Gap | Code | 1
ImageNet-Patch: A Dataset for Benchmarking Machine Learning Robustness against Adversarial Patches | Code | 1
SurvSet: An open-source time-to-event dataset repository | Code | 1
The importance of being constrained: dealing with infeasible solutions in Differential Evolution and beyond | Code | 1
A Large-scale Comprehensive Dataset and Copy-overlap Aware Evaluation Protocol for Segment-level Video Copy Detection | Code | 1
Just Rank: Rethinking Evaluation with Word and Sentence Similarities | Code | 1
HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction | Code | 1
Mukayese: Turkish NLP Strikes Back | Code | 1
3D Common Corruptions and Data Augmentation | Code | 1
GraphWorld: Fake Graphs Bring Real Insights for GNNs | Code | 1
PMC-Patients: A Large-scale Dataset of Patient Summaries and Relations for Benchmarking Retrieval-based Clinical Decision Support Systems | Code | 1
MultiRes-NetVLAD: Augmenting Place Recognition Training with Low-Resolution Imagery | Code | 1
Benchmarking of DL Libraries and Models on Mobile Devices | Code | 1
MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts | Code | 1
What are the best systems? New perspectives on NLP Benchmarking | Code | 1
ECRECer: Enzyme Commission Number Recommendation and Benchmarking based on Multiagent Dual-core Learning | Code | 1
Benchmarking Deep Models for Salient Object Detection | Code | 1
Benchmarking and Analyzing Point Cloud Classification under Corruptions | Code | 1
RECOVER: sequential model optimization platform for combination drug repurposing identifies novel synergistic compounds in vitro | Code | 1
Page 24 of 111

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified