Benchmarking

Papers

Showing 4051–4100 of 5548 papers

Title	Date	Tasks	Status
Syn3DWound: A Synthetic Dataset for 3D Wound Bed Analysis	Nov 27, 2023	Benchmarking, Diagnostic	Unverified
SynBench: Task-Agnostic Benchmarking of Pretrained Representations using Synthetic Data	Oct 6, 2022	Benchmarking, Representation Learning	Unverified
Synplex: A synthetic simulator of highly multiplexed histological images	Mar 8, 2021	Benchmarking	Unverified
Syntactically Aware Neural Architectures for Definition Extraction	Jun 1, 2018	Benchmarking, Binary Classification	Unverified
Syntax Encoding with Application in Authorship Attribution	Oct 1, 2018	Authorship Attribution, Benchmarking	Unverified
A Synthetic Benchmarking Pipeline to Compare Camera Calibration Algorithms	Jul 3, 2023	Benchmarking, Camera Calibration	Unverified
Synthetic Video Generation for Robust Hand Gesture Recognition in Augmented Reality Applications	Nov 4, 2019	Benchmarking, Gesture Recognition	Unverified
Synthetic weather radar using hybrid quantum-classical machine learning	Nov 30, 2021	Benchmarking, BIG-bench Machine Learning	Unverified
SynthRAD2025 Grand Challenge dataset: generating synthetic CTs for radiotherapy	Feb 24, 2025	Benchmarking, Image Generation	Unverified
SysML'19 demo: customizable and reusable Collective Knowledge pipelines to automate and reproduce machine learning experiments	Mar 31, 2019	Benchmarking, BIG-bench Machine Learning	Unverified
SysNoise: Exploring and Benchmarking Training-Deployment System Inconsistency	Jul 1, 2023	Benchmarking, Data Augmentation	Unverified
Systematic Comparison of Path Planning Algorithms using PathBench	Mar 7, 2022	Benchmarking	Unverified
Systematic Review: Anomaly Detection in Connected and Autonomous Vehicles	May 4, 2024	Anomaly Detection, Articles	Unverified
SzCORE as a benchmark: report from the seizure detection challenge at the 2025 AI in Epilepsy and Neurological Disorders Conference	May 19, 2025	Benchmarking, EEG	Unverified
T2I-FactualBench: Benchmarking the Factuality of Text-to-Image Models with Knowledge-Intensive Concepts	Dec 5, 2024	Benchmarking, Image Generation	Unverified
T^2K^2: The Twitter Top-K Keywords Benchmark	Sep 14, 2017	Benchmarking, Information Retrieval	Unverified
TabKAN: Advancing Tabular Data Analysis using Kolmogorov-Arnold Network	Apr 9, 2025	Benchmarking, Deep Learning	Unverified
TabTreeFormer: Tabular Data Generation Using Hybrid Tree-Transformer	Jan 2, 2025	Benchmarking, Quantization	Unverified
TabularQGAN: A Quantum Generative Model for Tabular Data	May 28, 2025	Benchmarking, Generative Adversarial Network	Unverified
Tackling the Story Ending Biases in The Story Cloze Test	Jul 1, 2018	Benchmarking, Cloze Test	Unverified
Tackling Visual Control via Multi-View Exploration Maximization	Nov 28, 2022	Benchmarking, Reinforcement Learning (RL)	Unverified
TACO: Benchmarking Generalizable Bimanual Tool-ACtion-Object Understanding	Jan 16, 2024	Action Recognition, Benchmarking	Unverified
Tactile MNIST: Benchmarking Active Tactile Perception	Jun 3, 2025	Benchmarking, Scene Understanding	Unverified
Talking Turns: Benchmarking Audio Foundation Models on Turn-Taking Dynamics	Mar 3, 2025	Benchmarking, Spoken Dialogue Systems	Unverified
TARGET: Benchmarking Table Retrieval for Generative Tasks	May 14, 2025	Benchmarking, Representation Learning	Unverified
Efficient Demand Response Location Targeting for Price Spike Mitigation by Exploiting Price-demand Relationship	Nov 27, 2022	Benchmarking	Unverified
TARGO: Benchmarking Target-driven Object Grasping under Occlusions	Jul 8, 2024	Benchmarking, Object	Unverified
Task-oriented Over-the-air Computation for Edge-device Co-inference with Balanced Classification Accuracy	Jul 1, 2024	Benchmarking	Unverified
TBD: Benchmarking and Analyzing Deep Neural Network Training	Mar 16, 2018	Benchmarking, General Classification	Unverified
TDDBench: A Benchmark for Training data detection	Nov 5, 2024	Benchmarking, Computational Efficiency	Unverified
TDVE-Assessor: Benchmarking and Evaluating the Quality of Text-Driven Video Editing with LMMs	May 26, 2025	Benchmarking, Large Language Model	Unverified
TeamTrack: A Dataset for Multi-Sport Multi-Object Tracking in Full-pitch Videos	Apr 22, 2024	Benchmarking, Multi-Object Tracking	Unverified
Teaspoon: A comprehensive python package for topological signal processing	Oct 10, 2020	Benchmarking, Topological Data Analysis	Unverified
Technical report of a DMD-based Characterization Method for Vision Sensors	Mar 4, 2025	Benchmarking, Dataset Generation	Unverified
Technological Approaches to Detecting Online Disinformation and Manipulation	Aug 26, 2021	Benchmarking, Fact Checking	Unverified
TelcoLM: collecting data, adapting, and benchmarking language models for the telecommunication domain	Dec 20, 2024	Benchmarking	Unverified
TELeR: A General Taxonomy of LLM Prompts for Benchmarking Complex Tasks	May 19, 2023	Benchmarking	Unverified
Tell Your Story: Task-Oriented Dialogs for Interactive Content Creation	Nov 8, 2022	Benchmarking, Retrieval	Unverified
Temporal cross-validation impacts multivariate time series subsequence anomaly detection evaluation	Jun 13, 2025	Anomaly Detection, Benchmarking	Unverified
Temporal Graphs Anomaly Emergence Detection: Benchmarking For Social Media Interactions	Jul 11, 2023	Anomaly Detection, Benchmarking	Unverified
Temporal Validity Change Prediction	Jan 1, 2024	Benchmarking, Prediction	Unverified
TEP-GNN: Accurate Execution Time Prediction of Functional Tests using Graph Neural Networks	Aug 25, 2022	Benchmarking, Graph Neural Network	Unverified
Terabyte-scale supervised 3D training and benchmarking dataset of the mouse kidney	Aug 4, 2021	Benchmarking, BIG-bench Machine Learning	Unverified
Term-Class-Max-Support (TCMS): A Simple Text Document Categorization Approach Using Term-Class Relevance Measure	Oct 16, 2016	Benchmarking, Text Categorization	Unverified
Test-driven Software Experimentation with LASSO: an LLM Prompt Benchmarking Example	Oct 11, 2024	Benchmarking, Code Generation	Unverified
Tetrad: Actively Secure 4PC for Secure Training and Inference	Jun 5, 2021	Benchmarking, Fairness	Unverified
Text2World: Benchmarking Large Language Models for Symbolic World Model Generation	Feb 18, 2025	Benchmarking	Unverified
Text-To-Speech Synthesis In The Wild	Sep 13, 2024	Benchmarking, Speaker Recognition	Unverified
Thalamic nuclei segmentation from T_1-weighted MRI: unifying and benchmarking state-of-the-art methods with young and old cohorts	Sep 26, 2023	Benchmarking, Segmentation	Unverified
The 6th Affective Behavior Analysis in-the-wild (ABAW) Competition	Feb 29, 2024	Action Unit Detection, Arousal Estimation	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified