SOTAVerified

Benchmarking

Papers

Showing 4051–4100 of 5548 papers

Title | Status | Hype
Syn3DWound: A Synthetic Dataset for 3D Wound Bed Analysis | | 0
SynBench: Task-Agnostic Benchmarking of Pretrained Representations using Synthetic Data | | 0
Synplex: A synthetic simulator of highly multiplexed histological images | | 0
Syntactically Aware Neural Architectures for Definition Extraction | | 0
Syntax Encoding with Application in Authorship Attribution | | 0
A Synthetic Benchmarking Pipeline to Compare Camera Calibration Algorithms | | 0
Synthetic Video Generation for Robust Hand Gesture Recognition in Augmented Reality Applications | | 0
Synthetic weather radar using hybrid quantum-classical machine learning | | 0
SynthRAD2025 Grand Challenge dataset: generating synthetic CTs for radiotherapy | | 0
SysML'19 demo: customizable and reusable Collective Knowledge pipelines to automate and reproduce machine learning experiments | | 0
SysNoise: Exploring and Benchmarking Training-Deployment System Inconsistency | | 0
Systematic Comparison of Path Planning Algorithms using PathBench | | 0
Systematic Review: Anomaly Detection in Connected and Autonomous Vehicles | | 0
SzCORE as a benchmark: report from the seizure detection challenge at the 2025 AI in Epilepsy and Neurological Disorders Conference | | 0
T2I-FactualBench: Benchmarking the Factuality of Text-to-Image Models with Knowledge-Intensive Concepts | | 0
T^2K^2: The Twitter Top-K Keywords Benchmark | | 0
TabKAN: Advancing Tabular Data Analysis using Kolmogorov-Arnold Network | | 0
TabTreeFormer: Tabular Data Generation Using Hybrid Tree-Transformer | | 0
TabularQGAN: A Quantum Generative Model for Tabular Data | | 0
Tackling the Story Ending Biases in The Story Cloze Test | | 0
Tackling Visual Control via Multi-View Exploration Maximization | | 0
TACO: Benchmarking Generalizable Bimanual Tool-ACtion-Object Understanding | | 0
Tactile MNIST: Benchmarking Active Tactile Perception | | 0
Talking Turns: Benchmarking Audio Foundation Models on Turn-Taking Dynamics | | 0
TARGET: Benchmarking Table Retrieval for Generative Tasks | | 0
Efficient Demand Response Location Targeting for Price Spike Mitigation by Exploiting Price-demand Relationship | | 0
TARGO: Benchmarking Target-driven Object Grasping under Occlusions | | 0
Task-oriented Over-the-air Computation for Edge-device Co-inference with Balanced Classification Accuracy | | 0
TBD: Benchmarking and Analyzing Deep Neural Network Training | | 0
TDDBench: A Benchmark for Training data detection | | 0
TDVE-Assessor: Benchmarking and Evaluating the Quality of Text-Driven Video Editing with LMMs | | 0
TeamTrack: A Dataset for Multi-Sport Multi-Object Tracking in Full-pitch Videos | | 0
Teaspoon: A comprehensive python package for topological signal processing | | 0
Technical report of a DMD-based Characterization Method for Vision Sensors | | 0
Technological Approaches to Detecting Online Disinformation and Manipulation | | 0
TelcoLM: collecting data, adapting, and benchmarking language models for the telecommunication domain | | 0
TELeR: A General Taxonomy of LLM Prompts for Benchmarking Complex Tasks | | 0
Tell Your Story: Task-Oriented Dialogs for Interactive Content Creation | | 0
Temporal cross-validation impacts multivariate time series subsequence anomaly detection evaluation | | 0
Temporal Graphs Anomaly Emergence Detection: Benchmarking For Social Media Interactions | | 0
Temporal Validity Change Prediction | | 0
TEP-GNN: Accurate Execution Time Prediction of Functional Tests using Graph Neural Networks | | 0
Terabyte-scale supervised 3D training and benchmarking dataset of the mouse kidney | | 0
Term-Class-Max-Support (TCMS): A Simple Text Document Categorization Approach Using Term-Class Relevance Measure | | 0
Test-driven Software Experimentation with LASSO: an LLM Prompt Benchmarking Example | | 0
Tetrad: Actively Secure 4PC for Secure Training and Inference | | 0
Text2World: Benchmarking Large Language Models for Symbolic World Model Generation | | 0
Text-To-Speech Synthesis In The Wild | | 0
Thalamic nuclei segmentation from T_1-weighted MRI: unifying and benchmarking state-of-the-art methods with young and old cohorts | | 0
The 6th Affective Behavior Analysis in-the-wild (ABAW) Competition | | 0
Page 82 of 111

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified