SOTAVerified

Benchmarking

Papers

Showing 2601–2625 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| MST: Adaptive Multi-Scale Tokens Guided Interactive Segmentation | Code | 0 |
| TransportationGames: Benchmarking Transportation Knowledge of (Multimodal) Large Language Models | | 0 |
| Benchmark Analysis of Various Pre-trained Deep Learning Models on ASSIRA Cats and Dogs Dataset | | 0 |
| SoK: Systematization and Benchmarking of Deepfake Detectors in a Unified Framework | | 0 |
| Chain of LoRA: Efficient Fine-tuning of Language Models via Residual Learning | | 0 |
| Attention versus Contrastive Learning of Tabular Data -- A Data-centric Benchmarking | | 0 |
| Global Prediction of COVID-19 Variant Emergence Using Dynamics-Informed Graph Neural Networks | Code | 0 |
| Segment Anything Model for Medical Image Segmentation: Current Applications and Future Directions | Code | 5 |
| NovelGym: A Flexible Ecosystem for Hybrid Planning and Learning Agents Designed for Open Worlds | | 0 |
| CAVIAR: Co-simulation of 6G Communications, 3D Scenarios and AI for Digital Twins | Code | 1 |
| Using Multi-Temporal Sentinel-1 and Sentinel-2 data for water bodies mapping | | 0 |
| German Text Embedding Clustering Benchmark | Code | 1 |
| Benchmarking PathCLIP for Pathology Image Analysis | | 0 |
| Enhancing 3D-Air Signature by Pen Tip Tail Trajectory Awareness: Dataset and Featuring by Novel Spatio-temporal CNN | Code | 0 |
| Nodule detection and generation on chest X-rays: NODE21 Challenge | | 0 |
| AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets | | 0 |
| Hyperbolic Anomaly Detection | | 0 |
| Benchmarking Audio Visual Segmentation for Long-Untrimmed Videos | | 0 |
| AM-RADIO: Agglomerative Vision Foundation Model Reduce All Domains Into One | | 0 |
| FLHetBench: Benchmarking Device and State Heterogeneity in Federated Learning | | 0 |
| A Call to Reflect on Evaluation Practices for Age Estimation: Comparative Analysis of the State-of-the-Art and a Unified Benchmark | Code | 2 |
| SEED-Bench: Benchmarking Multimodal Large Language Models | Code | 3 |
| Sheared Backpropagation for Fine-tuning Foundation Models | | 0 |
| FISBe: A Real-World Benchmark Dataset for Instance Segmentation of Long-Range Thin Filamentous Structures | | 0 |
| FinDABench: Benchmarking Financial Data Analysis Ability of Large Language Models | Code | 1 |
Page 105 of 222

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |