Benchmarking

Papers

Showing 4226–4250 of 5548 papers

Title	Date	Tasks	Status
Transforming Game Play: A Comparative Study of DCQN and DTQN Architectures in Reinforcement Learning	Oct 14, 2024	Atari Games, Benchmarking	Unverified
Translation Canvas: An Explainable Interface to Pinpoint and Analyze Translation Systems	Oct 7, 2024	Benchmarking, Machine Translation	Unverified
TransLaw: Benchmarking Large Language Models in Multi-Agent Simulation of the Collaborative Translation	Jul 1, 2025	Benchmarking, Machine Translation	Unverified
TransOpt: Transformer-based Representation Learning for Optimization Problem Classification	Nov 29, 2023	Benchmarking, Classification	Unverified
TransportationGames: Benchmarking Transportation Knowledge of (Multimodal) Large Language Models	Jan 9, 2024	Benchmarking	Unverified
Treatment Learning Causal Transformer for Noisy Image Classification	Mar 29, 2022	Benchmarking, Classification	Unverified
Tree Instance Segmentation With Temporal Contour Graph	Jan 1, 2023	Benchmarking, Instance Segmentation	Unverified
Trial-Based Dominance Enables Non-Parametric Tests to Compare both the Speed and Accuracy of Stochastic Optimizers	Dec 19, 2022	Benchmarking, Stochastic Optimization	Unverified
Trident: Efficient 4PC Framework for Privacy Preserving Machine Learning	Dec 5, 2019	Benchmarking, BIG-bench Machine Learning	Unverified
TriSAM: Tri-Plane SAM for zero-shot cortical blood vessel segmentation in VEM images	Jan 25, 2024	Benchmarking, Segmentation	Unverified
Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms	May 22, 2025	Adversarial Attack, Benchmarking	Unverified
True Online TD-Replan(lambda) Achieving Planning through Replaying	Jan 31, 2025	Benchmarking	Unverified
Trust but Verify: Programmatic VLM Evaluation in the Wild	Oct 17, 2024	Benchmarking, Language Modelling	Unverified
TTSlow: Slow Down Text-to-Speech with Efficiency Robustness Evaluations	Jul 2, 2024	Benchmarking, text-to-speech	Unverified
Turbulence in Focus: Benchmarking Scaling Behavior of 3D Volumetric Super-Resolution with BLASTNet 2.0 Data	Sep 23, 2023	Benchmarking, Super-Resolution	Unverified
U2-BENCH: Benchmarking Large Vision-Language Models on Ultrasound Understanding	May 23, 2025	Benchmarking, Spatial Reasoning	Unverified
UAV-Flow Colosseo: A Real-World Benchmark for Flying-on-a-Word UAV Imitation Learning	May 21, 2025	Benchmarking, Imitation Learning	Unverified
UAV Immersive Video Streaming: A Comprehensive Survey, Benchmarking, and Open Challenges	Oct 31, 2023	Benchmarking	Unverified
UCCIX: Irish-eXcellence Large Language Model	May 13, 2024	Benchmarking, Language Modeling	Unverified
UCLID-Net: Single View Reconstruction in Object Space	Jun 6, 2020	Benchmarking, Decoder	Unverified
UDTIRI: An Online Open-Source Intelligent Road Inspection Benchmark Suite	Apr 18, 2023	Benchmarking, Instance Segmentation	Unverified
UGSL: A Unified Framework for Benchmarking Graph Structure Learning	Aug 21, 2023	Benchmarking, Graph structure learning	Unverified
UKAN: Unbound Kolmogorov-Arnold Network Accompanied with Accelerated Library	Aug 20, 2024	Benchmarking, Computational Efficiency	Unverified
Unbounded Bayesian Optimization via Regularization	Aug 14, 2015	Bayesian Optimization, Benchmarking	Unverified
Uncertainty estimation for Cross-dataset performance in Trajectory prediction	May 15, 2022	Benchmarking, Prediction	Unverified


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified