SOTAVerified

Benchmarking

Papers

Showing 3001-3025 of 5548 papers

Title | Status | Hype
Are SNNs Truly Energy-efficient? - A Hardware Perspective | | 0
AGIBench: A Multi-granularity, Multimodal, Human-referenced, Auto-scoring Benchmark for Large Language Models | | 0
A skeletonization algorithm for gradient-based optimization | Code | 1
A survey on efficient vision transformers: algorithms, techniques, and performance benchmarking | | 0
Transfer Learning between Motor Imagery Datasets using Deep Learning -- Validation of Framework and Comparison of Datasets | Code | 0
Benchmarking Large Language Models in Retrieval-Augmented Generation | Code | 2
Hybrid data driven/thermal simulation model for comfort assessment | | 0
Benchmarking Autoregressive Conditional Diffusion Models for Turbulent Flow Simulation | Code | 1
Orientation-Independent Chinese Text Recognition in Scene Images | Code | 2
FOR-instance: a UAV laser scanning benchmark dataset for semantic and instance segmentation of individual trees | | 0
Holistic Dynamic Frequency Transformer for Image Fusion and Exposure Correction | | 0
NeMig -- A Bilingual News Collection and Knowledge Graph about Migration | Code | 0
FederatedScope-LLM: A Comprehensive Package for Fine-tuning Large Language Models in Federated Learning | | 0
Can humans help BERT gain "confidence"? | | 0
Developing a Scalable Benchmark for Assessing Large Language Models in Knowledge Graph Engineering | Code | 1
Benchmarking Robustness and Generalization in Multi-Agent Systems: A Case Study on Neural MMO | | 0
Benchmarking Multilabel Topic Classification in the Kyrgyz Language | Code | 0
Benchmarking the Generation of Fact Checking Explanations | Code | 1
Towards quantitative precision for ECG analysis: Leveraging state space models, self-supervision and patient metadata | Code | 1
Matbench Discovery -- A framework to evaluate machine learning crystal stability predictions | Code | 3
Speech Self-Supervised Representations Benchmarking: a Case for Larger Probing Heads | | 0
MLLM-DataEngine: An Iterative Refinement Approach for MLLM | Code | 1
Benchmarking Data Efficiency and Computational Efficiency of Temporal Action Localization Models | | 0
Beyond Document Page Classification: Design, Datasets, and Challenges | Code | 0
Topical-Chat: Towards Knowledge-Grounded Open-Domain Conversations | Code | 2
Page 121 of 222

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified