SOTAVerified

Benchmarking

Papers

Showing 2601–2650 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Benchmarking Robustness of Contrastive Learning Models for Medical Image-Report Retrieval | | 0 |
| High Fidelity RF Clutter Modeling and Simulation | | 0 |
| FineText: Text Classification via Attention-based Language Model Fine-tuning | | 0 |
| Feature-based Evolutionary Diversity Optimization of Discriminating Instances for Chance-constrained Optimization Problems | | 0 |
| Fine-tuning LLaMA 2 interference: a comparative study of language implementations for optimal efficiency | | 0 |
| FinGPT: Instruction Tuning Benchmark for Open-Source Large Language Models in Financial Datasets | | 0 |
| FinLoRA: Benchmarking LoRA Methods for Fine-Tuning LLMs on Financial Datasets | | 0 |
| Benchmarking real-time monitoring strategies for ethanol production from lignocellulosic biomass | | 0 |
| High-Level Synthesis Performance Prediction using GNNs: Benchmarking, Modeling, and Advancing | | 0 |
| FIORD: A Fisheye Indoor-Outdoor Dataset with LIDAR Ground Truth for 3D Scene Reconstruction and Benchmarking | | 0 |
| Feasibility of BERT Embeddings For Domain-Specific Knowledge Mining | | 0 |
| FISBe: A Real-World Benchmark Dataset for Instance Segmentation of Long-Range Thin Filamentous Structures | | 0 |
| Benchmarking real-time algorithms for in-phase auditory stimulation of low amplitude slow waves with wearable EEG devices during sleep | | 0 |
| FixCLR: Negative-Class Contrastive Learning for Semi-Supervised Domain Generalization | | 0 |
| F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration | | 0 |
| FLEdge: Benchmarking Federated Machine Learning Applications in Edge Computing Systems | | 0 |
| Benchmarking Randomized Optimization Algorithms on Binary, Permutation, and Combinatorial Problem Landscapes | | 0 |
| FLHetBench: Benchmarking Device and State Heterogeneity in Federated Learning | | 0 |
| FlowBench: Revisiting and Benchmarking Workflow-Guided Planning for LLM-based Agents | | 0 |
| Benchmarking Rotary Position Embeddings for Automatic Speech Recognition | | 0 |
| FlowerTune: A Cross-Domain Benchmark for Federated Fine-Tuning of Large Language Models | | 0 |
| FlowMind: Automatic Workflow Generation with LLMs | | 0 |
| Hierarchical Knowledge Graph Construction from Images for Scalable E-Commerce | | 0 |
| FAVOR-Bench: A Comprehensive Benchmark for Fine-Grained Video Motion Understanding | | 0 |
| Fast Training of Deep Networks with One-Class CNNs | | 0 |
| AI-ready Snow Radar Echogram Dataset (SRED) for climate change monitoring | | 0 |
| A Comprehensive Benchmarking Platform for Deep Generative Models in Molecular Design | | 0 |
| High Accuracy Tumor Diagnoses and Benchmarking of Hematoxylin and Eosin Stained Prostate Core Biopsy Images Generated by Explainable Deep Neural Networks | | 0 |
| Benchmarking Sample Selection Strategies for Batch Reinforcement Learning | | 0 |
| HIMO: A New Benchmark for Full-Body Human Interacting with Multiple Objects | | 0 |
| Fast Labeling and Transcription with the Speechalyzer Toolkit | | 0 |
| Benchmarking Quantum Hardware for Training of Fully Visible Boltzmann Machines | | 0 |
| FastEnsemble: Benchmarking and Accelerating Ensemble-based Uncertainty Estimation for Image-to-Image Translation | | 0 |
| Fast Empirical Scenarios | | 0 |
| Benchmarking Quantum Convolutional Neural Networks for Signal Classification in Simulated Gamma-Ray Burst Detection | | 0 |
| A Survey on Model Compression for Large Language Models | | 0 |
| FastDraft: How to Train Your Draft | | 0 |
| Forecasting NIFTY 50 benchmark Index using Seasonal ARIMA time series models | | 0 |
| AI-Powered Cow Detection in Complex Farm Environments | | 0 |
| Benchmarking quantized LLaMa-based models on the Brazilian Secondary School Exam | | 0 |
| Fast, approximate kinetics of RNA folding | | 0 |
| A Survey on Masked Facial Detection Methods and Datasets for Fighting Against COVID-19 | | 0 |
| Hide and Seek: on the Stealthiness of Attacks against Deep Learning Systems | | 0 |
| Formal Covariate Benchmarking to Bound Omitted Variable Bias | | 0 |
| Hiding in Plain Sight: Reframing Hardware Trojan Benchmarking as a Hide&Seek Modification | | 0 |
| Benchmarking Quality-Diversity Algorithms on Neuroevolution for Reinforcement Learning | | 0 |
| FormFactory: An Interactive Benchmarking Suite for Multimodal Form-Filling Agents | | 0 |
| Benchmarking Quality-Dependent and Cost-Sensitive Score-Level Multimodal Biometric Fusion Algorithms | | 0 |
| FarsBase-KBP: A Knowledge Base Population System for the Persian Knowledge Graph | | 0 |
| Fantastic Questions and Where to Find Them: FairytaleQA – An Authentic Dataset for Narrative Comprehension | | 0 |
Page 53 of 111

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |