SOTAVerified

Benchmarking

Papers

Showing 3276–3300 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Lightweight Jet Reconstruction and Identification as an Object Detection Task | | 0 |
| LIM: Large Interpolator Model for Dynamic Reconstruction | | 0 |
| Line Goes Up? Inherent Limitations of Benchmarks for Evaluating Large Language Models | | 0 |
| Liquid State Genetic Programming | | 0 |
| Livestock Monitoring with Transformer | | 0 |
| LLaVA-Docent: Instruction Tuning with Multimodal Large Language Model to Support Art Appreciation Education | | 0 |
| LLAVIDAL: A Large LAnguage VIsion Model for Daily Activities of Living | | 0 |
| LLM4DV: Using Large Language Models for Hardware Test Stimuli Generation | | 0 |
| LLM-based Evaluation Policy Extraction for Ecological Modeling | | 0 |
| LLM Evaluators Recognize and Favor Their Own Generations | | 0 |
| LLM-initialized Differentiable Causal Discovery | | 0 |
| LLMPopcorn: An Empirical Study of LLMs as Assistants for Popular Micro-video Generation | | 0 |
| LLM-Powered Grapheme-to-Phoneme Conversion: Benchmark and Case Study | | 0 |
| LLMs and Finetuning: Benchmarking cross-domain performance for hate speech detection | | 0 |
| LMFormer: Lane based Motion Prediction Transformer | | 0 |
| LMME3DHF: Benchmarking and Evaluating Multimodal 3D Human Face Generation with LMMs | | 0 |
| LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models | | 0 |
| Load-independent Metrics for Benchmarking Force Controllers | | 0 |
| Local Data Quantity-Aware Weighted Averaging for Federated Learning with Dishonest Clients | | 0 |
| Logically at Factify 2: A Multi-Modal Fact Checking System Based on Evidence Retrieval techniques and Transformer Encoder Architecture | | 0 |
| Logically at Factify 2022: Multimodal Fact Verification | | 0 |
| Benchmarking Continuous Time Models for Predicting Multiple Sclerosis Progression | | 0 |
| LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation | | 0 |
| Long Range Arena: A Benchmark for Efficient Transformers | | 0 |
| Look, Read and Feel: Benchmarking Ads Understanding with Multimodal Multitask Learning | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |