SOTAVerified

Benchmarking

Papers

Showing 1526-1550 of 5548 papers

Title | Status | Hype
Applicability and Challenges of Deep Reinforcement Learning for Satellite Frequency Plan Design | | 0
Apples to Apples: Learning Semantics of Common Entities Through a Novel Comprehension Task | | 0
Benchmarking Foundation Models for Zero-Shot Biometric Tasks | | 0
Benchmarking foundation models as feature extractors for weakly-supervised computational pathology | | 0
Advocating Character Error Rate for Multilingual ASR Evaluation | | 0
Data Analysis in the Era of Generative AI | | 0
Benchmarking for Public Health Surveillance tasks on Social Media with a Domain-Specific Pretrained Language Model | | 0
Benchmarking for Metaheuristic Black-Box Optimization: Perspectives and Open Challenges | | 0
Adversarial Reinforcement Learning Framework for Benchmarking Collision Avoidance Mechanisms in Autonomous Vehicles | | 0
Benchmarking for Bayesian Reinforcement Learning | | 0
Benchmarking Floworks against OpenAI & Anthropic: A Novel Framework for Enhanced LLM Function Calling | | 0
A Platform for Event Extraction in Hindi | | 0
Ensuring Reliability of Curated EHR-Derived Data: The Validation of Accuracy for LLM/ML-Extracted Information and Data (VALID) Framework | | 0
Benchmarking fixed-length Fingerprint Representations across different Embedding Sizes and Sensor Types | | 0
Benchmarking five global optimization approaches for nano-optical shape optimization and parameter reconstruction | | 0
DailyQA: A Benchmark to Evaluate Web Retrieval Augmented LLMs Based on Capturing Real-World Changes | | 0
Danish Airs and Grounds: A Dataset for Aerial-to-Street-Level Place Recognition and Localization | | 0
Data and its (dis)contents: A survey of dataset development and use in machine learning research | | 0
Data-driven inventory management for new products: An adjusted Dyna-Q approach with transfer learning | | 0
Benchmarking federated strategies in Peer-to-Peer Federated learning for biomedical data | | 0
Benchmarking Federated Machine Unlearning methods for Tabular Data | | 0
A Pipeline for Post-Crisis Twitter Data Acquisition | | 0
Benchmarking FedAvg and FedCurv for Image Classification Tasks | | 0
A Perspective on Neural Capacity Estimation: Viability and Reliability | | 0
Accelerating the discovery of steady-states of planetary interior dynamics with machine learning | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified