SOTAVerified

Benchmarking

Papers

Showing 3351-3400 of 5548 papers

Title | Status | Hype
Leveraging Pre-trained AudioLDM for Sound Generation: A Benchmark Study |  | 0
Leveraging Spatial and Semantic Feature Extraction for Skin Cancer Diagnosis with Capsule Networks and Graph Neural Networks |  | 0
Leveraging State Space Models in Long Range Genomics |  | 0
Break a Lag: Triple Exponential Moving Average for Enhanced Optimization |  | 0
LEXam: Benchmarking Legal Reasoning on 340 Law Exams |  | 0
Benchmarking network fabrics for data distributed training of deep neural networks |  | 0
Advancing Annotation of Stance in Social Media Posts: A Comparative Analysis of Large Language Models and Crowd Sourcing |  | 0
Benchmarking Named Entity Disambiguation approaches for Streaming Graphs |  | 0
Benchmarking Mutual Information-based Loss Functions in Federated Learning |  | 0
Benchmarking Music Generation Models and Metrics via Human Preference Studies |  | 0
Top-k Regularization for Supervised Feature Selection |  | 0
LIBRE: The Multiple 3D LiDAR Dataset |  | 0
LidarGait: Benchmarking 3D Gait Recognition with Point Clouds |  | 0
Lifelogging As An Extreme Form of Personal Information Management -- What Lessons To Learn |  | 0
Benchmarking Multivariate Time Series Classification Algorithms |  | 0
Light Field Image Quality Assessment With Auxiliary Learning Based on Depthwise and Anglewise Separable Convolutions |  | 0
Advances in Preference-based Reinforcement Learning: A Review |  | 0
Benchmarking Multi-Organ Segmentation Tools for Multi-Parametric T1-weighted Abdominal MRI |  | 0
Lightly Weighted Automatic Audio Parameter Extraction for the Quality Assessment of Consensus Auditory-Perceptual Evaluation of Voice |  | 0
Lightning UQ Box: A Comprehensive Framework for Uncertainty Quantification in Deep Learning |  | 0
Lightweight Jet Reconstruction and Identification as an Object Detection Task |  | 0
Solving excited states for long-range interacting trapped ions with neural networks |  | 0
Top Score on the Wrong Exam: On Benchmarking in Machine Learning for Vulnerability Detection |  | 0
Benchmarking Multi-National Value Alignment for Large Language Models |  | 0
LIM: Large Interpolator Model for Dynamic Reconstruction |  | 0
Advanced Manufacturing Configuration by Sample-efficient Batch Bayesian Optimization |  | 0
Line Goes Up? Inherent Limitations of Benchmarks for Evaluating Large Language Models |  | 0
Liquid State Genetic Programming |  | 0
Livestock Monitoring with Transformer |  | 0
Benchmarking Multimodal Sentiment Analysis |  | 0
LLaVA-Docent: Instruction Tuning with Multimodal Large Language Model to Support Art Appreciation Education |  | 0
LLAVIDAL: A Large LAnguage VIsion Model for Daily Activities of Living |  | 0
LLM4DV: Using Large Language Models for Hardware Test Stimuli Generation |  | 0
Benchmarking Multimodal Regex Synthesis with Complex Structures |  | 0
LLM-based Evaluation Policy Extraction for Ecological Modeling |  | 0
A War Beyond Deepfake: Benchmarking Facial Counterfeits and Countermeasures |  | 0
Benchmarking Multimodal Models for Ukrainian Language Understanding Across Academic and Cultural Domains |  | 0
A Distance Oriented Kalman Filter Particle Swarm Optimizer Applied to Multi-Modality Image Registration |  | 0
Benchmarking Multimodal Models for Fine-Grained Image Analysis: A Comparative Study Across Diverse Visual Features |  | 0
LLM Evaluators Recognize and Favor Their Own Generations |  | 0
Benchmarking Multimodal LLMs on Recognition and Understanding over Chemical Tables |  | 0
Benchmarking multimedia technologies with the CAMOMILE platform: the case of Multimodal Person Discovery at MediaEval 2015 |  | 0
LLM-initialized Differentiable Causal Discovery |  | 0
Totally Corrective Boosting with Cardinality Penalization |  | 0
Benchmarking Multi-Domain Active Learning on Image Classification |  | 0
LLMPopcorn: An Empirical Study of LLMs as Assistants for Popular Micro-video Generation |  | 0
LLM-Powered Grapheme-to-Phoneme Conversion: Benchmark and Case Study |  | 0
Incorporating Human Flexibility through Reward Preferences in Human-AI Teaming |  | 0
Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms |  | 0
LLMs and Finetuning: Benchmarking cross-domain performance for hate speech detection |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 |  | Unverified