SOTAVerified

Benchmarking

Papers

Showing 211-220 of 5548 papers

Title | Status | Hype
Quanda: An Interpretability Toolkit for Training Data Attribution Evaluation and Beyond | Code | 2
FedGraph: A Research Library and Benchmark for Federated Graph Learning | Code | 2
MIBench: A Comprehensive Framework for Benchmarking Model Inversion Attack and Defense | Code | 2
dattri: A Library for Efficient Data Attribution | Code | 2
AutoPenBench: Benchmarking Generative Agents for Penetration Testing | Code | 2
Beyond Prompts: Dynamic Conversational Benchmarking of Large Language Models | Code | 2
A Survey on Graph Neural Networks for Remaining Useful Life Prediction: Methodologies, Evaluation and Future Trends | Code | 2
Small Language Models: Survey, Measurements, and Insights | Code | 2
GSplatLoc: Grounding Keypoint Descriptors into 3D Gaussian Splatting for Improved Visual Localization | Code | 2
A Survey on Multimodal Benchmarks: In the Era of Large AI Models | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified