Benchmarking

Papers

Showing 3876–3900 of 5548 papers

Title	Date	Tasks	Status
SAIBench: Benchmarking AI for Science	Jun 11, 2022	Benchmarking, Friction	Unverified
Saliency Benchmarking Made Easy: Separating Models, Maps and Metrics	Apr 27, 2017	All, Benchmarking	Unverified
Salient Object Detection: A Benchmark	Jan 5, 2015	Benchmarking, Object	Unverified
SAMA: Towards Multi-Turn Referential Grounded Video Chat with Large Language Models	May 24, 2025	Benchmarking, Video Grounding	Unverified
SAM-based instance segmentation models for the automation of structural damage detection	Jan 27, 2024	Benchmarking, Instance Segmentation	Unverified
Sarcasm in Sight and Sound: Benchmarking and Expansion to Improve Multimodal Sarcasm Detection	Sep 29, 2023	Benchmarking, Diversity	Unverified
SASSE: Scalable and Adaptable 6-DOF Pose Estimation	Feb 5, 2019	Benchmarking, Pose Estimation	Unverified
SATBench: Benchmarking LLMs' Logical Reasoning via Automated Puzzle Generation from SAT Formulas	May 20, 2025	Benchmarking, Logical Reasoning	Unverified
SAWNet: A Spatially Aware Deep Neural Network for 3D Point Cloud Processing	May 18, 2019	Benchmarking, Scene Segmentation	Unverified
Scaffold Splits Overestimate Virtual Screening Performance	Jun 2, 2024	Benchmarking, Clustering	Unverified
Scalable and Customizable Benchmark Problems for Many-Objective Optimization	Jan 26, 2020	Benchmarking, Position	Unverified
Scalable and Hybrid Ensemble-Based Causality Discovery	Dec 24, 2020	Benchmarking, Distributed Computing	Unverified
Scalable, Distributed AI Frameworks: Leveraging Cloud Computing for Enhanced Deep Learning Performance and Efficiency	Apr 26, 2023	Benchmarking, Cloud Computing	Unverified
Scalable Psychological Momentum Forecasting in Esports	Jan 30, 2020	Benchmarking	Unverified
Automated Coding of Communications in Collaborative Problem-solving Tasks Using ChatGPT	Nov 15, 2024	Benchmarking	Unverified
ScanNeRF: a Scalable Benchmark for Neural Radiance Fields	Nov 24, 2022	Benchmarking, NeRF	Unverified
SCBench: A Sports Commentary Benchmark for Video LLMs	Dec 23, 2024	Benchmarking	Unverified
Scenarios and Approaches for Situated Natural Language Explanations	Jun 7, 2024	Benchmarking, In-Context Learning	Unverified
ScholarSearch: Benchmarking Scholar Searching Ability of LLMs	Jun 11, 2025	Benchmarking, Information Retrieval	Unverified
SciDoc2Diagrammer-MAF: Towards Generation of Scientific Diagrams from Documents guided by Multi-Aspect Feedback Refinement	Sep 28, 2024	Benchmarking, Code Generation	Unverified
Science Across Languages: Assessing LLM Multilingual Translation of Scientific Papers	Feb 25, 2025	Articles, Benchmarking	Unverified
Scientific Machine Learning Benchmarks	Oct 25, 2021	Benchmarking, BIG-bench Machine Learning	Unverified
SciHorizon: Benchmarking AI-for-Science Readiness from Scientific Data to Large Language Models	Mar 12, 2025	Benchmarking, Fairness	Unverified
scMamba: A Scalable Foundation Model for Single-Cell Multi-Omics Integration Beyond Highly Variable Feature Selection	Jun 25, 2025	Benchmarking, Contrastive Learning	Unverified
Score-Based Generative Models for Molecule Generation	Mar 7, 2022	Benchmarking	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified