SOTAVerified

Benchmarking

Papers

Showing 1326-1350 of 5548 papers

Title | Status | Hype
KO codes: Inventing Nonlinear Encoding and Decoding for Reliable Wireless Communication via Deep-learning | Code | 1
Comprehensive benchmarking of large language models for RNA secondary structure prediction | Code | 1
Benchmarking Simulation-Based Inference | Code | 1
LabelBench: A Comprehensive Framework for Benchmarking Adaptive Label-Efficient Learning | Code | 1
Labelling unlabelled videos from scratch with multi-modal self-supervision | Code | 1
Controlgym: Large-Scale Control Environments for Benchmarking Reinforcement Learning Algorithms | Code | 1
A Large-Scale Dataset for Benchmarking Elevator Button Segmentation and Character Recognition | Code | 1
CompanyKG: A Large-Scale Heterogeneous Graph for Company Similarity Quantification | Code | 1
Benchmarking Implicit Neural Representation and Geometric Rendering in Real-Time RGB-D SLAM | Code | 1
Benchmarking Spatial Relationships in Text-to-Image Generation | Code | 1
Quantum machine learning of large datasets using randomized measurements | Code | 1
Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks | Code | 1
AudioMarkBench: Benchmarking Robustness of Audio Watermarking | Code | 1
Benchmarking the CoW with the TopCoW Challenge: Topology-Aware Anatomical Segmentation of the Circle of Willis for CTA and MRA | Code | 1
CommonPower: A Framework for Safe Data-Driven Smart Grid Control | Code | 1
LEAF: A Benchmark for Federated Settings | Code | 1
Benchmarking structure-based three-dimensional molecular generative models using GenBench3D: ligand conformation quality matters | Code | 1
Benchmarking Image Retrieval for Visual Localization | Code | 1
BioMaze: Benchmarking and Enhancing Large Language Models for Biological Pathway Reasoning | Code | 1
LEMUR Neural Network Dataset: Towards Seamless AutoML | Code | 1
Less Is More: A Comparison of Active Learning Strategies for 3D Medical Image Segmentation | Code | 1
ArabicaQA: A Comprehensive Dataset for Arabic Question Answering | Code | 1
Combinatorial Optimization with Policy Adaptation using Latent Space Search | Code | 1
Collective Knowledge: organizing research projects as a database of reusable components and portable workflows with common APIs | Code | 1
Benchmarking human visual search computational models in natural scenes: models comparison and reference datasets | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified