SOTAVerified

Benchmarking

Papers

Showing 3251-3275 of 5548 papers

Title | Status | Hype
The ObjectFolder Benchmark: Multisensory Learning with Neural and Real Objects | | 0
HySpecNet-11k: A Large-Scale Hyperspectral Dataset for Benchmarking Learning-Based Hyperspectral Image Compression Methods | | 0
Dynamic Neighborhood Construction for Structured Large Discrete Action Spaces | Code | 0
Accurate and Efficient Structural Ensemble Generation of Macrocyclic Peptides using Internal Coordinate Diffusion | Code | 1
SheetCopilot: Bringing Software Productivity to the Next Level through Large Language Models | Code | 1
ScoNe: Benchmarking Negation Reasoning in Language Models With Fine-Tuning and In-Context Learning | Code | 0
Design and implementation of intelligent packet filtering in IoT microcontroller-based devices | Code | 0
ShuffleMix: Improving Representations via Channel-Wise Shuffle of Interpolated Hidden States | Code | 0
Large-scale Ridesharing DARP Instances Based on Real Travel Demand | Code | 0
IDToolkit: A Toolkit for Benchmarking and Developing Inverse Design Algorithms in Nanophotonics | Code | 1
Human Body Shape Classification Based on a Single Image | | 0
Decoding the Underlying Meaning of Multimodal Hateful Memes | Code | 1
InDL: A New Dataset and Benchmark for In-Diagram Logic Interpretation based on Visual Illusion | Code | 0
Exploring the Practicality of Generative Retrieval on Dynamic Corpora | | 0
BASED: Benchmarking, Analysis, and Structural Estimation of Deblurring | Code | 0
Learning from Integral Losses in Physics Informed Neural Networks | Code | 0
Benchmarking Diverse-Modal Entity Linking with Generative Models | | 0
The Brain Tumor Segmentation (BraTS) Challenge 2023: Focus on Pediatrics (CBTN-CONNECT-DIPGR-ASNR-MICCAI BraTS-PEDs) | Code | 2
Zero is Not Hero Yet: Benchmarking Zero-Shot Performance of LLMs for Financial Tasks | Code | 1
Benchmarking state-of-the-art gradient boosting algorithms for classification | | 0
Investigation of UAV Detection in Images with Complex Backgrounds and Rainy Artifacts | Code | 0
CSS: A Large-scale Cross-schema Chinese Text-to-SQL Medical Dataset | Code | 0
KeyPosS: Plug-and-Play Facial Landmark Detection through GPS-Inspired True-Range Multilateration | Code | 1
Analysis of modular CMA-ES on strict box-constrained problems in the SBOX-COST benchmarking suite | | 0
Barkour: Benchmarking Animal-level Agility with Quadruped Robots | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified