SOTAVerified

Benchmarking

Papers

Showing 3251-3275 of 5548 papers

Title | Status | Hype
LAMBDA: Covering the Solution Set of Black-Box Inequality by Search Space Quantization | | 0
Landscape-Aware Automated Algorithm Configuration using Multi-output Mixed Regression and Classification | | 0
LanEvil: Benchmarking the Robustness of Lane Detection to Environmental Illusions | | 0
Time Sensitive Knowledge Editing through Efficient Finetuning | | 0
Language Complexity Measurement as a Noisy Zero-Shot Proxy for Evaluating LLM Performance | | 0
Language-Driven 6-DoF Grasp Detection Using Negative Prompt Guidance | | 0
Benchmarking of Transformer-Based Pre-Trained Models on Social Media Text Classification Datasets | | 0
Language Models for Automated Classification of Brain MRI Reports and Growth Chart Generation | | 0
Can LLMs Capture Human Preferences? | | 0
Adversarial Reinforcement Learning Framework for Benchmarking Collision Avoidance Mechanisms in Autonomous Vehicles | | 0
TIME: Temporal-sensitive Multi-dimensional Instruction Tuning and Benchmarking for Video-LLMs | | 0
Time to Embrace Natural Language Processing (NLP)-based Digital Pathology: Benchmarking NLP- and Convolutional Neural Network-based Deep Learning Pipelines | | 0
Large Language Model for Multi-Domain Translation: Benchmarking and Domain CoT Fine-tuning | | 0
Understanding Large Language Models in Your Pockets: Performance Study on COTS Mobile Devices | | 0
Benchmarking of LLM Detection: Comparing Two Competing Approaches | | 0
Large Language Models are Null-Shot Learners | | 0
Large Language Models are Few-Shot Clinical Information Extractors | | 0
Large Language Models as Automated Aligners for benchmarking Vision-Language Models | | 0
Benchmarking of Lightweight Deep Learning Architectures for Skin Cancer Classification using ISIC 2017 Dataset | | 0
Adversarially Training for Audio Classifiers | | 0
Large Language Models Have Intrinsic Meta-Cognition, but Need a Good Lens | | 0
Benchmarking of GPU-optimized Quantum-Inspired Evolutionary Optimization Algorithm using Functional Analysis | | 0
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level | | 0
Large Malaysian Language Model Based on Mistral for Enhanced Local Language Understanding | | 0
Large Physics Models: Towards a collaborative approach with Large Language Models and Foundation Models | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified