SOTAVerified

Benchmarking

Papers

Showing 3601–3650 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| A Survey on Preserving Fairness Guarantees in Changing Environments | | 0 |
| Self-Aligning Depth-regularized Radiance Fields for Asynchronous RGB-D Sequences | | 0 |
| A Benchmark for Out of Distribution Detection in Point Cloud 3D Semantic Segmentation | | 0 |
| A Benchmarking Dataset with 2440 Organic Molecules for Volume Distribution at Steady State | Code | 0 |
| EvEntS ReaLM: Event Reasoning of Entity States via Language Models | | 0 |
| Hyperparameter optimization in deep multi-target prediction | Code | 1 |
| Tell Your Story: Task-Oriented Dialogs for Interactive Content Creation | | 0 |
| Okapi: Generalising Better by Making Statistical Matches Match | Code | 0 |
| Common Pets in 3D: Dynamic New-View Synthesis of Real-Life Deformable Categories | | 0 |
| Improved Target-specific Stance Detection on Social Media Platforms by Delving into Conversation Threads | Code | 0 |
| The Legal Argument Reasoning Task in Civil Procedure | Code | 0 |
| EventEA: Benchmarking Entity Alignment for Event-centric Knowledge Graphs | Code | 1 |
| An approach for benchmarking the numerical solutions of stochastic compartmental models | | 0 |
| Benchmarking Quality-Diversity Algorithms on Neuroevolution for Reinforcement Learning | | 0 |
| Quantum Similarity Testing with Convolutional Neural Networks | | 0 |
| Title2Event: Benchmarking Open Event Extraction with a Large-scale Chinese Title Dataset | | 0 |
| Signing Outside the Studio: Benchmarking Background Robustness for Continuous Sign Language Recognition | Code | 0 |
| SOLAR: A Highly Optimized Data Loading Framework for Distributed Training of CNN-based Scientific Surrogates | | 0 |
| Classical ensemble of Quantum-classical ML algorithms for Phishing detection in Ethereum transaction networks | Code | 0 |
| Benchmarking Adversarial Patch Against Aerial Detection | Code | 1 |
| Benchmarking performance of object detection under image distortions in an uncontrolled environment | Code | 0 |
| Benchmarking Language Models for Code Syntax Understanding | Code | 1 |
| What's Different between Visual Question Answering for Machine "Understanding" Versus for Accessibility? | Code | 0 |
| pmuBAGE: The Benchmarking Assortment of Generated PMU Data for Power System Events | Code | 0 |
| CrisisLTLSum: A Benchmark for Local Crisis Event Timeline Extraction and Summarization | Code | 0 |
| A Comparative Attention Framework for Better Few-Shot Object Detection on Aerial Images | Code | 1 |
| Deep Crowd Anomaly Detection: State-of-the-Art, Challenges, and Future Research Directions | | 0 |
| What cleaves? Is proteasomal cleavage prediction reaching a ceiling? | | 0 |
| ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition | Code | 1 |
| SpikeSim: An end-to-end Compute-in-Memory Hardware Evaluation Tool for Benchmarking Spiking Neural Networks | Code | 1 |
| Benchmarking GPU and TPU Performance with Graph Neural Networks | | 0 |
| Multi-scale data reconstruction of turbulent rotating flows with Gappy POD, Extended POD and Generative Adversarial Networks | | 0 |
| A Survey on Graph Counterfactual Explanations: Definitions, Methods, Evaluation, and Research Challenges | Code | 1 |
| gSuite: A Flexible and Framework Independent Benchmark Suite for Graph Neural Network Inference on GPUs | | 0 |
| RMBench: Benchmarking Deep Reinforcement Learning for Robotic Manipulator Control | Code | 1 |
| LaMAR: Benchmarking Localization and Mapping for Augmented Reality | Code | 2 |
| Graphs, Constraints, and Search for the Abstraction and Reasoning Corpus | Code | 1 |
| iDNA-ABF: multi-scale deep biological language learning model for the interpretable prediction of DNA methylations | Code | 1 |
| FIMP: Foundation Model-Informed Message Passing for Graph Neural Networks | | 0 |
| Conditional Neural Processes for Molecules | | 0 |
| Sub-8-bit quantization for on-device speech recognition: a regularization-free approach | | 0 |
| KPI-EDGAR: A Novel Dataset and Accompanying Metric for Relation Extraction from Financial Documents | Code | 1 |
| An Open-source Benchmark of Deep Learning Models for Audio-visual Apparent and Self-reported Personality Recognition | Code | 1 |
| DyFEn: Agent-Based Fee Setting in Payment Channel Networks | | 0 |
| WILD-SCAV: Benchmarking FPS Gaming AI on Unity3D-based Environments | Code | 1 |
| A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking | Code | 1 |
| A Survey of Parameters Associated with the Quality of Benchmarks in NLP | | 0 |
| TweetNERD -- End to End Entity Linking Benchmark for Tweets | Code | 0 |
| CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling | Code | 1 |
| CORL: Research-oriented Deep Offline Reinforcement Learning Library | Code | 3 |
Page 73 of 111

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |