SOTAVerified

Benchmarking

Papers

Showing 5451–5500 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Fast, approximate kinetics of RNA folding | | 0 |
| Visual Fidelity Index for Generative Semantic Communications with Critical Information Embedding | | 0 |
| Technological Approaches to Detecting Online Disinformation and Manipulation | | 0 |
| FastDraft: How to Train Your Draft | | 0 |
| Fast Empirical Scenarios | | 0 |
| FastEnsemble: Benchmarking and Accelerating Ensemble-based Uncertainty Estimation for Image-to-Image Translation | | 0 |
| Can Foundation Models Really Segment Tumors? A Benchmarking Odyssey in Lung CT Imaging | | 0 |
| Fast Labeling and Transcription with the Speechalyzer Toolkit | | 0 |
| TelcoLM: collecting data, adapting, and benchmarking language models for the telecommunication domain | | 0 |
| Fast Training of Deep Networks with One-Class CNNs | | 0 |
| Can ChatGPT Defend its Belief in Truth? Evaluating LLM Reasoning via Debate | | 0 |
| FAVOR-Bench: A Comprehensive Benchmark for Fine-Grained Video Motion Understanding | | 0 |
| TELeR: A General Taxonomy of LLM Prompts for Benchmarking Complex Tasks | | 0 |
| F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration | | 0 |
| Feasibility of BERT Embeddings For Domain-Specific Knowledge Mining | | 0 |
| Cancer-Net PCa-Seg: Benchmarking Deep Learning Models for Prostate Cancer Segmentation Using Synthetic Correlated Diffusion Imaging | | 0 |
| Feature-based Evolutionary Diversity Optimization of Discriminating Instances for Chance-constrained Optimization Problems | | 0 |
| Tell Your Story: Task-Oriented Dialogs for Interactive Content Creation | | 0 |
| Feature Encodings for Gradient Boosting with Automunge | | 0 |
| AI-ready Snow Radar Echogram Dataset (SRED) for climate change monitoring | | 0 |
| Featuremetric benchmarking: Quantum computer benchmarks based on circuit features | | 0 |
| Feature Selection and Classification of Hyperspectral Images With Support Vector Machines | | 0 |
| Feature selection in linear SVMs via a hard cardinality constraint: a scalable SDP decomposition approach | | 0 |
| FeB4RAG: Evaluating Federated Search in the Context of Retrieval Augmented Generation | | 0 |
| FeDa4Fair: Client-Level Federated Datasets for Fairness Evaluation | | 0 |
| FedAD-Bench: A Unified Benchmark for Federated Unsupervised Anomaly Detection in Tabular Data | | 0 |
| Can Carbon-Aware Electric Load Shifting Reduce Emissions? An Equilibrium-Based Analysis | | 0 |
| Can AI Read Between The Lines? Benchmarking LLMs On Financial Nuance | | 0 |
| Federated Deconfounding and Debiasing Learning for Out-of-Distribution Generalization | | 0 |
| Can AI Master Construction Management (CM)? Benchmarking State-of-the-Art Large Language Models on CM Certification Exams | | 0 |
| FederatedScope-LLM: A Comprehensive Package for Fine-tuning Large Language Models in Federated Learning | | 0 |
| FedEval: A Holistic Evaluation Framework for Federated Learning | | 0 |
| Can AI Freelancers Compete? Benchmarking Earnings, Reliability, and Task Success at Scale | | 0 |
| FedHPO-B: A Benchmark Suite for Federated Hyperparameter Optimization | | 0 |
| AI-Powered Cow Detection in Complex Farm Environments | | 0 |
| CaMMT: Benchmarking Culturally Aware Multimodal Machine Translation | | 0 |
| Temporal cross-validation impacts multivariate time series subsequence anomaly detection evaluation | | 0 |
| Temporal Graphs Anomaly Emergence Detection: Benchmarking For Social Media Interactions | | 0 |
| FedNLP: Benchmarking Federated Learning Methods for Natural Language Processing Tasks | | 0 |
| CameraBench: Benchmarking Visual Reasoning in MLLMs via Photography | | 0 |
| FedSym: Unleashing the Power of Entropy for Benchmarking the Algorithms for Federated Learning | | 0 |
| FedVLMBench: Benchmarking Federated Fine-Tuning of Vision-Language Models | | 0 |
| A Benchmark for Spray from Nearby Cutting Vehicles | | 0 |
| CallNavi, A Challenge and Empirical Study on LLM Function Calling and Routing | | 0 |
| FERA 2017 - Addressing Head Pose in the Third Facial Expression Recognition and Analysis Challenge | | 0 |
| FER-C: Benchmarking Out-of-Distribution Soft Calibration for Facial Expression Recognition | | 0 |
| Temporal Validity Change Prediction | | 0 |
| Call for Action: towards the next generation of symbolic regression benchmark | | 0 |
| FETCH: A Memory-Efficient Replay Approach for Continual Learning in Image Classification | | 0 |
| Calibrating chemical multisensory devices for real world applications: An in-depth comparison of quantitative Machine Learning approaches | | 0 |
Page 110 of 111

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |