Benchmarking

Papers

Title	Date	Tasks	Status
SPIn-NeRF: Multiview Segmentation and Perceptual Inpainting with Neural Radiance Fields	Nov 22, 2022	3D Inpainting, 3D Instance Segmentation	Unverified
Spintronics for image recognition: performance benchmarking via ultrafast data-driven simulations	Aug 10, 2023	Benchmarking, Classification	Unverified
SpiralMLP: A Lightweight Vision MLP Architecture	Mar 31, 2024	Benchmarking	Unverified
SpokenNativQA: Multilingual Everyday Spoken Queries for LLMs	May 25, 2025	Benchmarking, Diversity	Unverified
Sports Intelligence: Assessing the Sports Understanding Capabilities of Language Models through Question Answering from Text to Video	Jun 21, 2024	Benchmarking, Few-Shot Learning	Unverified
SPot: A tool for identifying operating segments in financial tables	May 17, 2020	Benchmarking	Unverified
Spotting tell-tale visual artifacts in face swapping videos: strengths and pitfalls of CNN detectors	Jun 19, 2025	Benchmarking, Face Swapping	Unverified
SQLBarber: A System Leveraging Large Language Models to Generate Customized and Realistic SQL Workloads	Jul 8, 2025	Benchmarking	Unverified
Unifying Large Language Model and Deep Reinforcement Learning for Human-in-Loop Interactive Socially-aware Navigation	Mar 22, 2024	Benchmarking, Deep Reinforcement Learning	Unverified
SS3DM: Benchmarking Street-View Surface Reconstruction with a Synthetic 3D Mesh Dataset	Oct 29, 2024	3D Reconstruction, Autonomous Driving	Unverified
Stability Constrained OPF in Microgrids: A Chance Constrained Optimization Framework with Non-Gaussian Uncertainty	Feb 4, 2023	Benchmarking	Unverified
Stabilized Self-training with Negative Sampling on Few-labeled Graph Data	Sep 29, 2021	Benchmarking, Node Classification	Unverified
Stable Virtual Camera: Generative View Synthesis with Diffusion Models	Mar 18, 2025	Benchmarking	Unverified
Staining normalization in histopathology: Method benchmarking using multicenter dataset	Jun 23, 2025	Benchmarking	Unverified
Standardisation of Convex Ultrasound Data Through Geometric Analysis and Augmentation	Feb 13, 2025	Benchmarking	Unverified
Standardised workflow for mass spectrometry-based single-cell proteomics data processing and analysis using the scp package	Oct 20, 2023	Benchmarking	Unverified
CrisisBench: Benchmarking Crisis-related Social Media Datasets for Humanitarian Information Processing	Apr 14, 2020	Benchmarking, General Classification	Unverified
State and Memory is All You Need for Robust and Reliable AI Agents	Jun 30, 2025	All, Benchmarking	Unverified
State-of-the-art AI-based Learning Approaches for Deepfake Generation and Detection, Analyzing Opportunities, Threading through Pros, Cons, and Future Prospects	Jan 2, 2025	Benchmarking, Face Swapping	Unverified
State-of-the-Art in Human Scanpath Prediction	Feb 24, 2021	Benchmarking, Prediction	Unverified
Statistical Multicriteria Benchmarking via the GSD-Front	Jun 6, 2024	Benchmarking	Unverified
Statistical Scenario Modelling and Lookalike Distributions for Multi-Variate AI Risk	Feb 20, 2025	Benchmarking	Unverified
StEduCov: An Explored and Benchmarked Dataset on Stance Detection in Tweets towards Online Education during COVID-19 Pandemic	Aug 22, 2022	Benchmarking, Stance Detection	Unverified
Steerable Pyramid Weighted Loss: Multi-Scale Adaptive Weighting for Semantic Segmentation	Mar 9, 2025	Autonomous Driving, Benchmarking	Unverified
STEER-ME: Assessing the Microeconomic Reasoning of Large Language Models	Feb 18, 2025	Benchmarking, Large Language Model	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified