SOTAVerified

Benchmarking

Papers

Showing 1501–1550 of 5548 papers

Title | Status | Hype
RUHSNet: 3D Object Detection Using Lidar Data in Real Time | Code | 0
Learning to Transfer for Traffic Forecasting via Multi-task Learning | Code | 0
Learning from Integral Losses in Physics Informed Neural Networks | Code | 0
Benchmarking Generative Latent Variable Models for Speech | Code | 0
Benchmarking Generative AI Models for Deep Learning Test Input Generation | Code | 0
Learning protein constitutive motifs from sequence data | Code | 0
Learning collective multi-cellular dynamics from temporal scRNA-seq via a transformer-enhanced Neural SDE | Code | 0
Using representation balancing to learn conditional-average dose responses from clustered data | Code | 0
Learning an Event Sequence Embedding for Dense Event-Based Deep Stereo | Code | 0
Learning Dynamic Selection and Pricing of Out-of-Home Deliveries | Code | 0
Learning Quantum Processes with Quantum Statistical Queries | Code | 0
Benchmarking Framework for Performance-Evaluation of Causal Inference Analysis | Code | 0
Learned Bayesian Cramér-Rao Bound for Unknown Measurement Models Using Score Neural Networks | Code | 0
Benchmarking framework for machine learning classification from fNIRS data | Code | 0
Learnability and Complexity of Quantum Samples | Code | 0
Learned Sorted Table Search and Static Indexes in Small Model Space | Code | 0
Benchmarking Foundation Models on Exceptional Cases: Dataset Creation and Validation | Code | 0
Leak Proof CMap; a framework for training and evaluation of cell line agnostic L1000 similarity methods | Code | 0
Learn How to Query from Unlabeled Data Streams in Federated Learning | Code | 0
A Position Paper on the Automatic Generation of Machine Learning Leaderboards | Code | 0
Large-scale Ridesharing DARP Instances Based on Real Travel Demand | Code | 0
ADVIO: An authentic dataset for visual-inertial odometry | Code | 0
ApisTox: a new benchmark dataset for the classification of small molecules toxicity on honey bees | Code | 0
Benchmarking Flexible Electric Loads Scheduling Algorithms under Market Price Uncertainty | Code | 0
Accel-NASBench: Sustainable Benchmarking for Accelerator-Aware NAS | Code | 0
Large Language Models for Outpatient Referral: Problem Definition, Benchmarking and Challenges | Code | 0
Laughing Heads: Can Transformers Detect What Makes a Sentence Funny? | Code | 0
Learning Adaptive Discriminative Correlation Filters via Temporal Consistency Preserving Spatial Feature Selection for Robust Visual Tracking | Code | 0
LANTERN: A Machine Learning Framework for Lipid Nanoparticle Transfection Efficiency Prediction | Code | 0
Benchmarking Federated Learning for Semantic Datasets: Federated Scene Graph Generation | Code | 0
Laparoscopic Image Desmoking Using the U-Net with New Loss Function and Integrated Differentiable Wiener Filter | Code | 0
Selecting the motion ground truth for loose-fitting wearables: benchmarking optical MoCap methods | Code | 0
Benchmarking Feature Upsampling Methods for Vision Foundation Models using Interactive Segmentation | Code | 0
LaCViT: A Label-aware Contrastive Fine-tuning Framework for Vision Transformers | Code | 0
Adversarial Metric Attack and Defense for Person Re-identification | Code | 0
Language-based Image Colorization: A Benchmark and Beyond | Code | 0
LaRA: Benchmarking Retrieval-Augmented Generation and Long-Context LLMs - No Silver Bullet for LC or RAG Routing | Code | 0
Benchmarking Feature-based Algorithm Selection Systems for Black-box Numerical Optimization | Code | 0
Benchmarking Failures in Tool-Augmented Language Models | Code | 0
LABCAT: Locally adaptive Bayesian optimization using principal-component-aligned trust regions | Code | 0
Knowledge Enhanced Conditional Imputation for Healthcare Time-series | Code | 0
SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios | Code | 0
Towards Enhancing Fault Tolerance in Neural Networks | Code | 0
KhabarChin: Automatic Detection of Important News in the Persian Language | Code | 0
AntiLeak-Bench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge | Code | 0
Ants can orienteer a thief in their robbery | Code | 0
Knowing-how & Knowing-that: A New Task for Machine Comprehension of User Manuals | Code | 0
Benchmarking Educational Program Repair | Code | 0
ANTHROPOS-V: benchmarking the novel task of Crowd Volume Estimation | Code | 0
Adversarial Environment Generation for Learning to Navigate the Web | Code | 0
Page 31 of 111

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified