SOTAVerified

Benchmarking

Papers

Showing 601–625 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Benchmarking Implicit Neural Representation and Geometric Rendering in Real-Time RGB-D SLAM | Code | 1 |
| A Multifaceted Benchmarking of Synthetic Electronic Health Record Generation Models | Code | 1 |
| M4-SAR: A Multi-Resolution, Multi-Polarization, Multi-Scene, Multi-Source Dataset and Benchmark for Optical-SAR Fusion Object Detection | Code | 1 |
| Benchmarking Language Models for Code Syntax Understanding | Code | 1 |
| DataRec: A Python Library for Standardized and Reproducible Data Management in Recommender Systems | Code | 1 |
| Decentralized Arena: Towards Democratic and Scalable Automatic Evaluation of Language Models | Code | 1 |
| DACBench: A Benchmark Library for Dynamic Algorithm Configuration | Code | 1 |
| CySecBench: Generative AI-based CyberSecurity-focused Prompt Dataset for Benchmarking Large Language Models | Code | 1 |
| Benchmarking human visual search computational models in natural scenes: models comparison and reference datasets | Code | 1 |
| D2S: Document-to-Slide Generation Via Query-Based Text Summarization | Code | 1 |
| Data-Driven Denoising of Stationary Accelerometer Signals | Code | 1 |
| Towards Reliable Detection of LLM-Generated Texts: A Comprehensive Evaluation Framework with CUDRT | Code | 1 |
| Active-Passive SimStereo -- Benchmarking the Cross-Generalization Capabilities of Deep Learning-based Stereo Methods | Code | 1 |
| Curious Hierarchical Actor-Critic Reinforcement Learning | Code | 1 |
| MatTools: Benchmarking Large Language Models for Materials Science Tools | Code | 1 |
| CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer | Code | 1 |
| Data Generating Process to Evaluate Causal Discovery Techniques for Time Series Data | Code | 1 |
| dEchorate: a Calibrated Room Impulse Response Database for Echo-aware Signal Processing | Code | 1 |
| Benchmarking Graph Neural Networks for FMRI analysis | Code | 1 |
| Amharic LLaMA and LLaVA: Multimodal LLMs for Low Resource Languages | Code | 1 |
| Benchmarking Graph Neural Networks on Dynamic Link Prediction | Code | 1 |
| Cross-Modal Bidirectional Interaction Model for Referring Remote Sensing Image Segmentation | Code | 1 |
| Benchmarking Geospatial Question Answering Engines using the Dataset GeoQuestions1089 | Code | 1 |
| ALTO: A Large-Scale Dataset for UAV Visual Place Recognition and Localization | Code | 1 |
| COVID-19 event extraction from Twitter via extractive question answering with continuous prompts | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |