Benchmarking

Papers

Showing 2601–2650 of 5548 papers

Title	Date	Tasks	Status	Hype
SoK: Systematization and Benchmarking of Deepfake Detectors in a Unified Framework	Jan 9, 2024	Benchmarking, DeepFake Detection	Unverified	0
Benchmark Analysis of Various Pre-trained Deep Learning Models on ASSIRA Cats and Dogs Dataset	Jan 9, 2024	Benchmarking, image-classification	Unverified	0
MST: Adaptive Multi-Scale Tokens Guided Interactive Segmentation	Jan 9, 2024	Benchmarking, Interactive Segmentation	Code Available	0
TransportationGames: Benchmarking Transportation Knowledge of (Multimodal) Large Language Models	Jan 9, 2024	Benchmarking	Unverified	0
Chain of LoRA: Efficient Fine-tuning of Language Models via Residual Learning	Jan 8, 2024	Benchmarking, CoLA	Unverified	0
Attention versus Contrastive Learning of Tabular Data -- A Data-centric Benchmarking	Jan 8, 2024	Benchmarking, Contrastive Learning	Unverified	0
Global Prediction of COVID-19 Variant Emergence Using Dynamics-Informed Graph Neural Networks	Jan 7, 2024	Benchmarking, Graph Neural Network	Code Available	0
Segment Anything Model for Medical Image Segmentation: Current Applications and Future Directions	Jan 7, 2024	Benchmarking, Image Segmentation	Code Available	5
NovelGym: A Flexible Ecosystem for Hybrid Planning and Learning Agents Designed for Open Worlds	Jan 7, 2024	Autonomous Vehicles, Benchmarking	Unverified	0
CAVIAR: Co-simulation of 6G Communications, 3D Scenarios and AI for Digital Twins	Jan 6, 2024	Autonomous Vehicles, Benchmarking	Code Available	1
Using Multi-Temporal Sentinel-1 and Sentinel-2 data for water bodies mapping	Jan 5, 2024	Benchmarking	Unverified	0
German Text Embedding Clustering Benchmark	Jan 5, 2024	Benchmarking, Clustering	Code Available	1
Benchmarking PathCLIP for Pathology Image Analysis	Jan 5, 2024	Benchmarking, Decision Making	Unverified	0
Enhancing 3D-Air Signature by Pen Tip Tail Trajectory Awareness: Dataset and Featuring by Novel Spatio-temporal CNN	Jan 5, 2024	Benchmarking	Code Available	0
Nodule detection and generation on chest X-rays: NODE21 Challenge	Jan 4, 2024	Benchmarking	Unverified	0
AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets	Jan 3, 2024	Astronomy, Benchmarking	Unverified	0
Benchmarking Audio Visual Segmentation for Long-Untrimmed Videos	Jan 1, 2024	Benchmarking	Unverified	0
Hyperbolic Anomaly Detection	Jan 1, 2024	Anomaly Detection, Benchmarking	Unverified	0
AM-RADIO: Agglomerative Vision Foundation Model Reduce All Domains Into One	Jan 1, 2024	All, Benchmarking	Unverified	0
FLHetBench: Benchmarking Device and State Heterogeneity in Federated Learning	Jan 1, 2024	Benchmarking, Federated Learning	Unverified	0
A Call to Reflect on Evaluation Practices for Age Estimation: Comparative Analysis of the State-of-the-Art and a Unified Benchmark	Jan 1, 2024	Age Estimation, Benchmarking	Code Available	2
Sheared Backpropagation for Fine-tuning Foundation Models	Jan 1, 2024	Benchmarking	Unverified	0
FISBe: A Real-World Benchmark Dataset for Instance Segmentation of Long-Range Thin Filamentous Structures	Jan 1, 2024	Benchmarking, Instance Segmentation	Unverified	0
SEED-Bench: Benchmarking Multimodal Large Language Models	Jan 1, 2024	Benchmarking, Image Generation	Code Available	3
FinDABench: Benchmarking Financial Data Analysis Ability of Large Language Models	Jan 1, 2024	Benchmarking	Code Available	1
Temporal Validity Change Prediction	Jan 1, 2024	Benchmarking, Prediction	Unverified	0
Benchmarking Large Language Models on Controllable Generation under Diversified Instructions	Jan 1, 2024	Benchmarking, Instruction Following	Code Available	1
Pushing Boundaries: Exploring Zero Shot Object Classification with Large Multimodal Models	Dec 30, 2023	Benchmarking, image-classification	Unverified	0
Benchmarking Hebbian learning rules for associative memory	Dec 30, 2023	Benchmarking	Unverified	0
Benchmarking the CoW with the TopCoW Challenge: Topology-Aware Anatomical Segmentation of the Circle of Willis for CTA and MRA	Dec 29, 2023	Anatomy, Benchmarking	Code Available	1
TSPP: A Unified Benchmarking Tool for Time-series Forecasting	Dec 28, 2023	Benchmarking, Feature Engineering	Code Available	0
FALCON: Feature-Label Constrained Graph Net Collapse for Memory Efficient GNNs	Dec 27, 2023	Benchmarking, GPU	Code Available	0
Knowledge Enhanced Conditional Imputation for Healthcare Time-series	Dec 27, 2023	Benchmarking, Imputation	Code Available	0
Combining SNNs with Filtering for Efficient Neural Decoding in Implantable Brain-Machine Interfaces	Dec 26, 2023	Benchmarking, Decoder	Unverified	0
RDF-star2Vec: RDF-star Graph Embeddings for Data Mining	Dec 25, 2023	Benchmarking, Graph Embedding	Code Available	0
APTv2: Benchmarking Animal Pose Estimation and Tracking with a Large-scale Dataset and Beyond	Dec 25, 2023	Animal Pose Estimation, Benchmarking	Code Available	1
Data needs and challenges for quantum dot devices automation	Dec 21, 2023	Benchmarking	Unverified	0
Benchmarking Evolutionary Community Detection Algorithms in Dynamic Networks	Dec 21, 2023	Benchmarking, Community Detection	Unverified	0
Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models	Dec 21, 2023	Benchmarking	Code Available	1
Incorporating Human Flexibility through Reward Preferences in Human-AI Teaming	Dec 21, 2023	Benchmarking, reinforcement-learning	Unverified	0
ARBiBench: Benchmarking Adversarial Robustness of Binarized Neural Networks	Dec 21, 2023	Adversarial Robustness, Benchmarking	Unverified	0
RetailSynth: Synthetic Data Generation for Retail AI Systems Evaluation	Dec 21, 2023	Benchmarking, Product Recommendation	Code Available	1
AN ELIXIR FOR BLOCKCHAIN SCALABILITY WITH CHANNEL BASED CLUSTERED SHARDING	Dec 20, 2023	Benchmarking	Unverified	0
Neural feels with neural fields: Visuo-tactile perception for in-hand manipulation	Dec 20, 2023	Benchmarking	Unverified	0
Review and experimental benchmarking of machine learning algorithms for efficient optimization of cold atom experiments	Dec 20, 2023	Benchmarking	Unverified	0
Comparing Machine Learning Algorithms by Union-Free Generic Depth	Dec 20, 2023	Benchmarking	Code Available	0
Benchmarking and Analyzing In-context Learning, Fine-tuning and Supervised Learning for Biomedical Knowledge Curation: a focused study on chemical entities of biological interest	Dec 20, 2023	Benchmarking, In-Context Learning	Unverified	0
Perception Test 2023: A Summary of the First Challenge And Outcome	Dec 20, 2023	Benchmarking, Grounded Video Question Answering	Unverified	0
FiFAR: A Fraud Detection Dataset for Learning to Defer	Dec 20, 2023	Benchmarking, Decision Making	Code Available	1
Scaling Compute Is Not All You Need for Adversarial Robustness	Dec 20, 2023	Adversarial Robustness, All	Code Available	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified