SOTAVerified

Benchmarking

Papers

Showing 4551–4575 of 5548 papers

Title | Status | Hype
Improvements & Evaluations on the MLCommons CloudMask Benchmark | Code | 0
The current state of single-cell proteomics data analysis | Code | 0
Retrieval or Global Context Understanding? On Many-Shot In-Context Learning for Long-Context Evaluation | Code | 0
BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation | Code | 0
Improve Machine Learning carbon footprint using Parquet dataset format and Mixed Precision training for regression models -- Part II | Code | 0
BN-AuthProf: Benchmarking Machine Learning for Bangla Author Profiling on Social Media Texts | Code | 0
LLM Benchmarking with LLaMA2: Evaluating Code Development Performance Across Multiple Programming Languages | Code | 0
Improve Machine Learning carbon footprint using Nvidia GPU and Mixed Precision training for classification models -- Part I | Code | 0
LLM Detectors Still Fall Short of Real World: Case of LLM-Generated Short News-Like Posts | Code | 0
Improved Target-specific Stance Detection on Social Media Platforms by Delving into Conversation Threads | Code | 0
BLESS: Benchmarking Large Language Models on Sentence Simplification | Code | 0
Improved Multilingual Language Model Pretraining for Social Media Text via Translation Pair Prediction | Code | 0
Importance of Disjoint Sampling in Conventional and Transformer Models for Hyperspectral Image Classification | Code | 0
BanglaNLP at BLP-2023 Task 2: Benchmarking different Transformer Models for Sentiment Analysis of Bangla Social Media Posts | Code | 0
LLM Performance for Code Generation on Noisy Tasks | Code | 0
ImpliRet: Benchmarking the Implicit Fact Retrieval Challenge | Code | 0
A Dataset for Web-Scale Knowledge Base Population | Code | 0
The Devil is in the Prompts: De-Identification Traces Enhance Memorization Risks in Synthetic Chest X-Ray Generation | Code | 0
Impact of ImageNet Model Selection on Domain Adaptation | Code | 0
Immunofluorescence Capillary Imaging Segmentation: Cases Study | Code | 0
Analyzing the Feature Extractor Networks for Face Image Synthesis | Code | 0
ImmersePro: End-to-End Stereo Video Synthesis Via Implicit Disparity Learning | Code | 0
LLpowershap: Logistic Loss-based Automated Shapley Values Feature Selection Method | Code | 0
Revisiting and Benchmarking Graph Autoencoders: A Contrastive Learning Perspective | Code | 0
Illusory VQA: Benchmarking and Enhancing Multimodal Models on Visual Illusions | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified