SOTAVerified

Benchmarking

Papers

Showing 1051-1075 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| The CropAndWeed Dataset: A Multi-Modal Learning Approach for Efficient Crop and Weed Manipulation | Code | 1 |
| Trace Encoding in Process Mining: a survey and benchmarking | Code | 1 |
| Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation | Code | 1 |
| Benchmarking Robustness of 3D Object Detection to Common Corruptions | Code | 1 |
| SQAD: Automatic Smartphone Camera Quality Assessment and Benchmarking | Code | 1 |
| MIGPerf: A Comprehensive Benchmark for Deep Learning Training and Inference Workloads on Multi-Instance GPUs | Code | 1 |
| Benchmarking Spatial Relationships in Text-to-Image Generation | Code | 1 |
| A Comprehensive Study of the Robustness for LiDAR-based 3D Object Detectors against Adversarial Attacks | Code | 1 |
| Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift | Code | 1 |
| Benchmarking Large Language Models for Automated Verilog RTL Code Generation | Code | 1 |
| On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline | Code | 1 |
| Benchmarking Self-Supervised Learning on Diverse Pathology Datasets | Code | 1 |
| Ego-Body Pose Estimation via Ego-Head Pose Estimation | Code | 1 |
| CODEBench: A Neural Architecture and Hardware Accelerator Co-Design Framework | Code | 1 |
| Towards Scene Understanding for Autonomous Operations on Airport Aprons | Code | 1 |
| RLogist: Fast Observation Strategy on Whole-slide Images with Deep Reinforcement Learning | Code | 1 |
| Geoclidean: Few-Shot Generalization in Euclidean Geometry | Code | 1 |
| AdsorbML: A Leap in Efficiency for Adsorption Energy Calculations using Generalizable Machine Learning Potentials | Code | 1 |
| A Call to Reflect on Evaluation Practices for Failure Detection in Image Classification | Code | 1 |
| Multi-Mask Aggregators for Graph Neural Networks | Code | 1 |
| fseval: A Benchmarking Framework for Feature Selection and Feature Ranking Algorithms | Code | 1 |
| This is the way: designing and compiling LEPISZCZE, a comprehensive NLP benchmark for Polish | Code | 1 |
| PIC4rl-gym: a ROS2 modular framework for Robots Autonomous Navigation with Deep Reinforcement Learning | Code | 1 |
| CryptOpt: Verified Compilation with Randomized Program Search for Cryptographic Primitives (full version) | Code | 1 |
| Benchmarking Graph Neural Networks for FMRI analysis | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |