SOTAVerified

Benchmarking

Papers

Showing 5276–5300 of 5548 papers

Title | Status | Hype
C-TLSAN: Content-Enhanced Time-Aware Long- and Short-Term Attention Network for Personalized Recommendation | Code | 0
Performance Evaluation of Real-Time Object Detection for Electric Scooters | Code | 0
Benchmarking Framework for Performance-Evaluation of Causal Inference Analysis | Code | 0
A General Benchmarking Framework for Text Generation | Code | 0
Performance Modeling of Data Storage Systems using Generative Models | Code | 0
Zero-Shot Hyperspectral Pansharpening Using Hysteresis-Based Tuning for Spectral Quality Control | Code | 0
Vector-Based Data Improves Left-Right Eye-Tracking Classifier Performance After a Covariate Distributional Shift | Code | 0
AntiLeak-Bench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge | Code | 0
Periodic Extrapolative Generalisation in Neural Networks | Code | 0
Standardizing Structural Causal Models | Code | 0
Standard Vs Uniform Binary Search and Their Variants in Learned Static Indexing: The Case of the Searching on Sorted Data Benchmarking Software Platform | Code | 0
StarBASE-GP: Biologically-Guided Automated Machine Learning for Genotype-to-Phenotype Association Analysis | Code | 0
Benchmarking framework for machine learning classification from fNIRS data | Code | 0
PersoBench: Benchmarking Personalized Response Generation in Large Language Models | Code | 0
STA: Self-controlled Text Augmentation for Improving Text Classifications | Code | 0
Architecture Analysis and Benchmarking of 3D U-shaped Deep Learning Models for Thoracic Anatomical Segmentation | Code | 0
XCompress: LLM assisted Python-based text compression toolkit | Code | 0
A Framework for Generating Informative Benchmark Instances | Code | 0
What's Different between Visual Question Answering for Machine "Understanding" Versus for Accessibility? | Code | 0
Towards Robust Metrics for Concept Representation Evaluation | Code | 0
Statistical Multicriteria Evaluation of LLM-Generated Text | Code | 0
ANTHROPOS-V: benchmarking the novel task of Crowd Volume Estimation | Code | 0
Answer Consolidation: Formulation and Benchmarking | Code | 0
A Benchmark on Extremely Weakly Supervised Text Classification: Reconcile Seed Matching and Prompting Approaches | Code | 0
A novel evaluation methodology for supervised Feature Ranking algorithms | Code | 0
Page 212 of 222

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified