SOTAVerified

Benchmarking

Papers

Showing 2851–2860 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| No Dataset Needed for Downstream Knowledge Benchmarking: Response Dispersion Inversely Correlates with Accuracy on Domain-specific QA | | 0 |
| Data Augmentation for Continual RL via Adversarial Gradient Episodic Memory | | 0 |
| Open Llama2 Model for the Lithuanian Language | | 0 |
| Top Score on the Wrong Exam: On Benchmarking in Machine Learning for Vulnerability Detection | | 0 |
| S3Simulator: A benchmarking Side Scan Sonar Simulator dataset for Underwater Image Analysis | Code | 0 |
| Dynamic PDB: A New Dataset and a SE(3) Model Extension by Integrating Dynamic Behaviors and Physical Properties in Protein Structures | | 0 |
| Benchmarking Counterfactual Interpretability in Deep Learning Models for Time Series Classification | | 0 |
| WCEbleedGen: A wireless capsule endoscopy dataset and its benchmarking for automatic bleeding classification, detection, and segmentation | Code | 0 |
| MultiMed: Massively Multimodal and Multitask Medical Understanding | | 0 |
| Extraction of Research Objectives, Machine Learning Model Names, and Dataset Names from Academic Papers and Analysis of Their Interrelationships Using LLM and Network Analysis | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |