SOTAVerified

Benchmarking

Papers

Showing 1226-1250 of 5548 papers

Title | Status | Hype
FragXsiteDTI: Revealing Responsible Segments in Drug-Target Interaction with Transformer-Driven Interpretation | Code | 1
fseval: A Benchmarking Framework for Feature Selection and Feature Ranking Algorithms | Code | 1
FTNet: Feature Transverse Network for Thermal Image Semantic Segmentation | Code | 1
BARS-CTR: Open Benchmarking for Click-Through Rate Prediction | Code | 1
G4SATBench: Benchmarking and Advancing SAT Solving with Graph Neural Networks | Code | 1
Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency | Code | 1
Benchmarking Recommendation, Classification, and Tracing Based on Hugging Face Knowledge Graph | Code | 1
Comprehensive benchmarking of large language models for RNA secondary structure prediction | Code | 1
Benchmarking Language Models for Code Syntax Understanding | Code | 1
Benchmarking Test-Time Adaptation against Distribution Shifts in Image Classification | Code | 1
Benchmarking: Past, Present and Future | Code | 1
TextEE: Benchmark, Reevaluation, Reflections, and Future Challenges in Event Extraction | Code | 1
Generalizable deep learning for photoplethysmography-based blood pressure estimation -- A Benchmarking Study | Code | 1
AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of Text-to-Video Generation with LMM | Code | 1
Generative Evaluation of Complex Reasoning in Large Language Models | Code | 1
A Comprehensive Benchmark for COVID-19 Predictive Modeling Using Electronic Health Records in Intensive Care | Code | 1
GENEVA: Benchmarking Generalizability for Event Argument Extraction with Hundreds of Event Types and Argument Roles | Code | 1
CommonPower: A Framework for Safe Data-Driven Smart Grid Control | Code | 1
Benchmarking Language Model Creativity: A Case Study on Code Generation | Code | 1
CompanyKG: A Large-Scale Heterogeneous Graph for Company Similarity Quantification | Code | 1
CombiBench: Benchmarking LLM Capability for Combinatorial Mathematics | Code | 1
A Comprehensive Benchmark for RNA 3D Structure-Function Modeling | Code | 1
GEOM-Drugs Revisited: Toward More Chemically Accurate Benchmarks for 3D Molecule Generation | Code | 1
Collective Knowledge: organizing research projects as a database of reusable components and portable workflows with common APIs | Code | 1
Combinatorial Optimization with Policy Adaptation using Latent Space Search | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified