SOTAVerified

Benchmarking

Papers

Showing 4801–4850 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| VidLBEval: Benchmarking and Mitigating Language Bias in Video-Involved LVLMs | | 0 |
| SPot: A tool for identifying operating segments in financial tables | | 0 |
| Spotting tell-tale visual artifacts in face swapping videos: strengths and pitfalls of CNN detectors | | 0 |
| Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark | | 0 |
| SQLBarber: A System Leveraging Large Language Models to Generate Customized and Realistic SQL Workloads | | 0 |
| Analysis and Benchmarking of Extending Blind Face Image Restoration to Videos | | 0 |
| Unifying Large Language Model and Deep Reinforcement Learning for Human-in-Loop Interactive Socially-aware Navigation | | 0 |
| SS3DM: Benchmarking Street-View Surface Reconstruction with a Synthetic 3D Mesh Dataset | | 0 |
| Analysing Features Learned Using Unsupervised Models on Program Embeddings | | 0 |
| Stability Constrained OPF in Microgrids: A Chance Constrained Optimization Framework with Non-Gaussian Uncertainty | | 0 |
| Stabilized Self-training with Negative Sampling on Few-labeled Graph Data | | 0 |
| Analysing Errors of Open Information Extraction Systems | | 0 |
| An AI based talent acquisition and benchmarking for job | | 0 |
| Stable Virtual Camera: Generative View Synthesis with Diffusion Models | | 0 |
| An Advanced Ensemble Deep Learning Framework for Stock Price Prediction Using VAE, Transformer, and LSTM Model | | 0 |
| Staining normalization in histopathology: Method benchmarking using multicenter dataset | | 0 |
| Standardisation of Convex Ultrasound Data Through Geometric Analysis and Augmentation | | 0 |
| Standardised workflow for mass spectrometry-based single-cell proteomics data processing and analysis using the scp package | | 0 |
| CrisisBench: Benchmarking Crisis-related Social Media Datasets for Humanitarian Information Processing | | 0 |
| Word Complexity Estimation for Japanese Lexical Simplification | | 0 |
| A Boosting Approach to Constructing an Ensemble Stack | | 0 |
| Views Are My Own, but Also Yours: Benchmarking Theory of Mind Using Common Ground | | 0 |
| An Accelerated Correlation Filter Tracker | | 0 |
| Village-Net Clustering: A Rapid approach to Non-linear Unsupervised Clustering of High-Dimensional Data | | 0 |
| State and Memory is All You Need for Robust and Reliable AI Agents | | 0 |
| State-of-the-art AI-based Learning Approaches for Deepfake Generation and Detection, Analyzing Opportunities, Threading through Pros, Cons, and Future Prospects | | 0 |
| CroCoDL: Cross-device Collaborative Dataset for Localization | | 0 |
| CrossCheckGPT: Universal Hallucination Ranking for Multimodal Foundation Models | | 0 |
| CrossCodeBench: Benchmarking Cross-Task Generalization of Source Code Models | | 0 |
| Cross-functional transferability in universal machine learning interatomic potentials | | 0 |
| State-of-the-Art in Human Scanpath Prediction | | 0 |
| Statistical Multicriteria Benchmarking via the GSD-Front | | 0 |
| Abnormality-Driven Representation Learning for Radiology Imaging | | 0 |
| crossMoDA Challenge: Evolution of Cross-Modality Domain Adaptation Techniques for Vestibular Schwannoma and Cochlea Segmentation from 2021 to 2023 | | 0 |
| CRF-based Single-stage Acoustic Modeling with CTC Topology | | 0 |
| Cross-Model Image Annotation Platform with Active Learning | | 0 |
| Cross-replication Reliability -- An Empirical Approach to Interpreting Inter-rater Reliability | | 0 |
| Cross-replication Reliability - An Empirical Approach to Interpreting Inter-rater Reliability | | 0 |
| Cross-subject Brain Functional Connectivity Analysis for Multi-task Cognitive State Evaluation | | 0 |
| Cross-Subject Deep Transfer Models for Evoked Potentials in Brain-Computer Interface | | 0 |
| Creating a Data Collection for Evaluating Rich Speech Retrieval | | 0 |
| CrowdDriven: A New Challenging Dataset for Outdoor Visual Localization | | 0 |
| CRS Arena: Crowdsourced Benchmarking of Conversational Recommender Systems | | 0 |
| Statistical Scenario Modelling and Lookalike Distributions for Multi-Variate AI Risk | | 0 |
| Covariance Matrix Adaptation Evolution Strategy Assisted by Principal Component Analysis | | 0 |
| Coupling volume-excluding compartment-based models of diffusion at different scales: Voronoi and pseudo-compartment approaches | | 0 |
| CSPO: Cross-Market Synergistic Stock Price Movement Forecasting with Pseudo-volatility Optimization | | 0 |
| CSR-Bench: Benchmarking LLM Agents in Deployment of Computer Science Research Repositories | | 0 |
| StEduCov: An Explored and Benchmarked Dataset on Stance Detection in Tweets towards Online Education during COVID-19 Pandemic | | 0 |
| Steerable Pyramid Weighted Loss: Multi-Scale Adaptive Weighting for Semantic Segmentation | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |