SOTAVerified

Benchmarking

Papers

Showing 4051–4075 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Benchmarking Adversarial Robustness of Image Shadow Removal with Shadow-adaptive Attacks | | 0 |
| OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents | | 0 |
| oTTC: Object Time-to-Contact for Motion Estimation in Autonomous Driving | | 0 |
| Benchmarking Adversarial Robustness of Compressed Deep Learning Models | | 0 |
| Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms | | 0 |
| Out of Distribution Performance of State of Art Vision Model | | 0 |
| Benchmarking Adversarial Robustness | | 0 |
| Overconfident Oracles: Limitations of In Silico Sequence Design Benchmarking | | 0 |
| Overview and practical recommendations on using Shapley Values for identifying predictive biomarkers via CATE modeling | | 0 |
| Overview of Todai Robot Project and Evaluation Framework of its NLP-based Problem Solving | | 0 |
| Benchmarking Adversarially Robust Quantum Machine Learning at Scale | | 0 |
| OVQA: A Clinically Generated Visual Question Answering Dataset | | 0 |
| Paddy Doctor: A Visual Image Dataset for Automated Paddy Disease Classification and Benchmarking | | 0 |
| Benchmarking adversarial attacks and defenses for time-series data | | 0 |
| PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms | | 0 |
| Benchmarking Advanced Text Anonymisation Methods: A Comparative Study on Novel and Traditional Approaches | | 0 |
| Benchmarking Adaptive Intelligence and Computer Vision on Human-Robot Collaboration | | 0 |
| Benchmarking Adaptative Variational Quantum Algorithms on QUBO Instances | | 0 |
| Paradigm Shift in Sustainability Disclosure Analysis: Empowering Stakeholders with CHATREPORT, a Language Model-Based Tool | | 0 |
| Para-Lane: Multi-Lane Dataset Registering Parallel Scans for Benchmarking Novel View Synthesis | | 0 |
| Benchmarking Active Learning Strategies for Materials Optimization and Discovery | | 0 |
| A critical analysis of metrics used for measuring progress in artificial intelligence | | 0 |
| True Online TD-Replan(lambda) Achieving Planning through Replaying | | 0 |
| Benchmarking Active Learning for NILM | | 0 |
| Benchmarking Abstractive Summarisation: A Dataset of Human-authored Summaries of Norwegian News Articles | | 0 |
Page 163 of 222

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |