Benchmarking

Papers

Showing 2601–2625 of 5548 papers

Title	Date	Tasks	Status	Hype
MST: Adaptive Multi-Scale Tokens Guided Interactive Segmentation	Jan 9, 2024	Benchmarking, Interactive Segmentation	Code Available	0
TransportationGames: Benchmarking Transportation Knowledge of (Multimodal) Large Language Models	Jan 9, 2024	Benchmarking	Unverified	0
Benchmark Analysis of Various Pre-trained Deep Learning Models on ASSIRA Cats and Dogs Dataset	Jan 9, 2024	Benchmarking, image-classification	Unverified	0
SoK: Systematization and Benchmarking of Deepfake Detectors in a Unified Framework	Jan 9, 2024	Benchmarking, DeepFake Detection	Unverified	0
Chain of LoRA: Efficient Fine-tuning of Language Models via Residual Learning	Jan 8, 2024	Benchmarking, CoLA	Unverified	0
Attention versus Contrastive Learning of Tabular Data -- A Data-centric Benchmarking	Jan 8, 2024	Benchmarking, Contrastive Learning	Unverified	0
Global Prediction of COVID-19 Variant Emergence Using Dynamics-Informed Graph Neural Networks	Jan 7, 2024	Benchmarking, Graph Neural Network	Code Available	0
Segment Anything Model for Medical Image Segmentation: Current Applications and Future Directions	Jan 7, 2024	Benchmarking, Image Segmentation	Code Available	5
NovelGym: A Flexible Ecosystem for Hybrid Planning and Learning Agents Designed for Open Worlds	Jan 7, 2024	Autonomous Vehicles, Benchmarking	Unverified	0
CAVIAR: Co-simulation of 6G Communications, 3D Scenarios and AI for Digital Twins	Jan 6, 2024	Autonomous Vehicles, Benchmarking	Code Available	1
Using Multi-Temporal Sentinel-1 and Sentinel-2 data for water bodies mapping	Jan 5, 2024	Benchmarking	Unverified	0
German Text Embedding Clustering Benchmark	Jan 5, 2024	Benchmarking, Clustering	Code Available	1
Benchmarking PathCLIP for Pathology Image Analysis	Jan 5, 2024	Benchmarking, Decision Making	Unverified	0
Enhancing 3D-Air Signature by Pen Tip Tail Trajectory Awareness: Dataset and Featuring by Novel Spatio-temporal CNN	Jan 5, 2024	Benchmarking	Code Available	0
Nodule detection and generation on chest X-rays: NODE21 Challenge	Jan 4, 2024	Benchmarking	Unverified	0
AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets	Jan 3, 2024	Astronomy, Benchmarking	Unverified	0
Hyperbolic Anomaly Detection	Jan 1, 2024	Anomaly Detection, Benchmarking	Unverified	0
Benchmarking Audio Visual Segmentation for Long-Untrimmed Videos	Jan 1, 2024	Benchmarking	Unverified	0
AM-RADIO: Agglomerative Vision Foundation Model Reduce All Domains Into One	Jan 1, 2024	All, Benchmarking	Unverified	0
FLHetBench: Benchmarking Device and State Heterogeneity in Federated Learning	Jan 1, 2024	Benchmarking, Federated Learning	Unverified	0
A Call to Reflect on Evaluation Practices for Age Estimation: Comparative Analysis of the State-of-the-Art and a Unified Benchmark	Jan 1, 2024	Age Estimation, Benchmarking	Code Available	2
SEED-Bench: Benchmarking Multimodal Large Language Models	Jan 1, 2024	Benchmarking, Image Generation	Code Available	3
Sheared Backpropagation for Fine-tuning Foundation Models	Jan 1, 2024	Benchmarking	Unverified	0
FISBe: A Real-World Benchmark Dataset for Instance Segmentation of Long-Range Thin Filamentous Structures	Jan 1, 2024	Benchmarking, Instance Segmentation	Unverified	0
FinDABench: Benchmarking Financial Data Analysis Ability of Large Language Models	Jan 1, 2024	Benchmarking	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified