Benchmarking

Papers

Showing 3401–3450 of 5548 papers

Title	Date	Tasks	Status
TACO: Benchmarking Generalizable Bimanual Tool-ACtion-Object Understanding	Jan 16, 2024	Action Recognition, Benchmarking	Unverified
A Reinforcement Learning Environment for Directed Quantum Circuit Synthesis	Jan 13, 2024	Benchmarking, reinforcement-learning	Unverified
Lifelogging As An Extreme Form of Personal Information Management -- What Lessons To Learn	Jan 11, 2024	Benchmarking, Form	Unverified
Knowledge Sharing in Manufacturing using Large Language Models: User Evaluation and Model Benchmarking	Jan 10, 2024	Benchmarking, Information Retrieval	Unverified
Latency-aware Road Anomaly Segmentation in Videos: A Photorealistic Dataset and New Metrics	Jan 10, 2024	Anomaly Segmentation, Autonomous Driving	Unverified
Benchmark Analysis of Various Pre-trained Deep Learning Models on ASSIRA Cats and Dogs Dataset	Jan 9, 2024	Benchmarking, image-classification	Unverified
TransportationGames: Benchmarking Transportation Knowledge of (Multimodal) Large Language Models	Jan 9, 2024	Benchmarking	Unverified
MST: Adaptive Multi-Scale Tokens Guided Interactive Segmentation	Jan 9, 2024	Benchmarking, Interactive Segmentation	Code Available
SoK: Systematization and Benchmarking of Deepfake Detectors in a Unified Framework	Jan 9, 2024	Benchmarking, DeepFake Detection	Unverified
Chain of LoRA: Efficient Fine-tuning of Language Models via Residual Learning	Jan 8, 2024	Benchmarking, CoLA	Unverified
Attention versus Contrastive Learning of Tabular Data -- A Data-centric Benchmarking	Jan 8, 2024	Benchmarking, Contrastive Learning	Unverified
NovelGym: A Flexible Ecosystem for Hybrid Planning and Learning Agents Designed for Open Worlds	Jan 7, 2024	Autonomous Vehicles, Benchmarking	Unverified
Global Prediction of COVID-19 Variant Emergence Using Dynamics-Informed Graph Neural Networks	Jan 7, 2024	Benchmarking, Graph Neural Network	Code Available
Using Multi-Temporal Sentinel-1 and Sentinel-2 data for water bodies mapping	Jan 5, 2024	Benchmarking	Unverified
Benchmarking PathCLIP for Pathology Image Analysis	Jan 5, 2024	Benchmarking, Decision Making	Unverified
Enhancing 3D-Air Signature by Pen Tip Tail Trajectory Awareness: Dataset and Featuring by Novel Spatio-temporal CNN	Jan 5, 2024	Benchmarking	Code Available
Nodule detection and generation on chest X-rays: NODE21 Challenge	Jan 4, 2024	Benchmarking	Unverified
AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets	Jan 3, 2024	Astronomy, Benchmarking	Unverified
Sheared Backpropagation for Fine-tuning Foundation Models	Jan 1, 2024	Benchmarking	Unverified
Temporal Validity Change Prediction	Jan 1, 2024	Benchmarking, Prediction	Unverified
AM-RADIO: Agglomerative Vision Foundation Model Reduce All Domains Into One	Jan 1, 2024	All, Benchmarking	Unverified
FISBe: A Real-World Benchmark Dataset for Instance Segmentation of Long-Range Thin Filamentous Structures	Jan 1, 2024	Benchmarking, Instance Segmentation	Unverified
Hyperbolic Anomaly Detection	Jan 1, 2024	Anomaly Detection, Benchmarking	Unverified
Benchmarking Audio Visual Segmentation for Long-Untrimmed Videos	Jan 1, 2024	Benchmarking	Unverified
FLHetBench: Benchmarking Device and State Heterogeneity in Federated Learning	Jan 1, 2024	Benchmarking, Federated Learning	Unverified
Benchmarking Hebbian learning rules for associative memory	Dec 30, 2023	Benchmarking	Unverified
Pushing Boundaries: Exploring Zero Shot Object Classification with Large Multimodal Models	Dec 30, 2023	Benchmarking, image-classification	Unverified
TSPP: A Unified Benchmarking Tool for Time-series Forecasting	Dec 28, 2023	Benchmarking, Feature Engineering	Code Available
Knowledge Enhanced Conditional Imputation for Healthcare Time-series	Dec 27, 2023	Benchmarking, Imputation	Code Available
FALCON: Feature-Label Constrained Graph Net Collapse for Memory Efficient GNNs	Dec 27, 2023	Benchmarking, GPU	Code Available
Combining SNNs with Filtering for Efficient Neural Decoding in Implantable Brain-Machine Interfaces	Dec 26, 2023	Benchmarking, Decoder	Unverified
RDF-star2Vec: RDF-star Graph Embeddings for Data Mining	Dec 25, 2023	Benchmarking, Graph Embedding	Code Available
Data needs and challenges for quantum dot devices automation	Dec 21, 2023	Benchmarking	Unverified
Benchmarking Evolutionary Community Detection Algorithms in Dynamic Networks	Dec 21, 2023	Benchmarking, Community Detection	Unverified
ARBiBench: Benchmarking Adversarial Robustness of Binarized Neural Networks	Dec 21, 2023	Adversarial Robustness, Benchmarking	Unverified
Incorporating Human Flexibility through Reward Preferences in Human-AI Teaming	Dec 21, 2023	Benchmarking, reinforcement-learning	Unverified
Benchmarking and Analyzing In-context Learning, Fine-tuning and Supervised Learning for Biomedical Knowledge Curation: a focused study on chemical entities of biological interest	Dec 20, 2023	Benchmarking, In-Context Learning	Unverified
Scaling Compute Is Not All You Need for Adversarial Robustness	Dec 20, 2023	Adversarial Robustness, All	Code Available
Comparing Machine Learning Algorithms by Union-Free Generic Depth	Dec 20, 2023	Benchmarking	Code Available
Review and experimental benchmarking of machine learning algorithms for efficient optimization of cold atom experiments	Dec 20, 2023	Benchmarking	Unverified
Perception Test 2023: A Summary of the First Challenge And Outcome	Dec 20, 2023	Benchmarking, Grounded Video Question Answering	Unverified
Neural feels with neural fields: Visuo-tactile perception for in-hand manipulation	Dec 20, 2023	Benchmarking	Unverified
AN ELIXIR FOR BLOCKCHAIN SCALABILITY WITH CHANNEL BASED CLUSTERED SHARDING	Dec 20, 2023	Benchmarking	Unverified
MA-BBOB: A Problem Generator for Black-Box Optimization Using Affine Combinations and Shifts	Dec 18, 2023	Benchmarking	Unverified
QDA^2: A principled approach to automatically annotating charge stability diagrams	Dec 18, 2023	Benchmarking	Unverified
Bio-Image Informatics Index BIII: A unique database of image analysis tools and workflows for and by the bioimaging community	Dec 18, 2023	Benchmarking	Unverified
Code Ownership in Open-Source AI Software Security	Dec 18, 2023	Benchmarking	Code Available
FER-C: Benchmarking Out-of-Distribution Soft Calibration for Facial Expression Recognition	Dec 16, 2023	Benchmarking, Facial Expression Recognition	Unverified
Enabling Accelerators for Graph Computing	Dec 16, 2023	Benchmarking	Unverified
ChemTime: Rapid and Early Classification for Multivariate Time Series Classification of Chemical Sensors	Dec 15, 2023	Benchmarking, Classification	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified