Benchmarking

Papers

Title	Date	Tasks	Status	Hype
Temporal Validity Change Prediction	Jan 1, 2024	Benchmarking, Prediction	Unverified	0
Benchmarking Large Language Models on Controllable Generation under Diversified Instructions	Jan 1, 2024	Benchmarking, Instruction Following	Code Available	1
Pushing Boundaries: Exploring Zero Shot Object Classification with Large Multimodal Models	Dec 30, 2023	Benchmarking, image-classification	Unverified	0
Benchmarking Hebbian learning rules for associative memory	Dec 30, 2023	Benchmarking	Unverified	0
Benchmarking the CoW with the TopCoW Challenge: Topology-Aware Anatomical Segmentation of the Circle of Willis for CTA and MRA	Dec 29, 2023	Anatomy, Benchmarking	Code Available	1
TSPP: A Unified Benchmarking Tool for Time-series Forecasting	Dec 28, 2023	Benchmarking, Feature Engineering	Code Available	0
FALCON: Feature-Label Constrained Graph Net Collapse for Memory Efficient GNNs	Dec 27, 2023	Benchmarking, GPU	Code Available	0
Knowledge Enhanced Conditional Imputation for Healthcare Time-series	Dec 27, 2023	Benchmarking, Imputation	Code Available	0
Combining SNNs with Filtering for Efficient Neural Decoding in Implantable Brain-Machine Interfaces	Dec 26, 2023	Benchmarking, Decoder	Unverified	0
RDF-star2Vec: RDF-star Graph Embeddings for Data Mining	Dec 25, 2023	Benchmarking, Graph Embedding	Code Available	0
APTv2: Benchmarking Animal Pose Estimation and Tracking with a Large-scale Dataset and Beyond	Dec 25, 2023	Animal Pose Estimation, Benchmarking	Code Available	1
Data needs and challenges for quantum dot devices automation	Dec 21, 2023	Benchmarking	Unverified	0
Benchmarking Evolutionary Community Detection Algorithms in Dynamic Networks	Dec 21, 2023	Benchmarking, Community Detection	Unverified	0
Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models	Dec 21, 2023	Benchmarking	Code Available	1
Incorporating Human Flexibility through Reward Preferences in Human-AI Teaming	Dec 21, 2023	Benchmarking, reinforcement-learning	Unverified	0
ARBiBench: Benchmarking Adversarial Robustness of Binarized Neural Networks	Dec 21, 2023	Adversarial Robustness, Benchmarking	Unverified	0
RetailSynth: Synthetic Data Generation for Retail AI Systems Evaluation	Dec 21, 2023	Benchmarking, Product Recommendation	Code Available	1
AN ELIXIR FOR BLOCKCHAIN SCALABILITY WITH CHANNEL BASED CLUSTERED SHARDING	Dec 20, 2023	Benchmarking	Unverified	0
Neural feels with neural fields: Visuo-tactile perception for in-hand manipulation	Dec 20, 2023	Benchmarking	Unverified	0
Review and experimental benchmarking of machine learning algorithms for efficient optimization of cold atom experiments	Dec 20, 2023	Benchmarking	Unverified	0
Comparing Machine Learning Algorithms by Union-Free Generic Depth	Dec 20, 2023	Benchmarking	Code Available	0
Benchmarking and Analyzing In-context Learning, Fine-tuning and Supervised Learning for Biomedical Knowledge Curation: a focused study on chemical entities of biological interest	Dec 20, 2023	Benchmarking, In-Context Learning	Unverified	0
Perception Test 2023: A Summary of the First Challenge And Outcome	Dec 20, 2023	Benchmarking, Grounded Video Question Answering	Unverified	0
FiFAR: A Fraud Detection Dataset for Learning to Defer	Dec 20, 2023	Benchmarking, Decision Making	Code Available	1
Scaling Compute Is Not All You Need for Adversarial Robustness	Dec 20, 2023	Adversarial Robustness, All	Code Available	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified