Benchmarking

Papers

Title	Date	Tasks	Status	Hype
TAO-Amodal: A Benchmark for Tracking Any Object Amodally	Dec 19, 2023	Amodal Tracking, Autonomous Driving	Code Available	1
Bio-Image Informatics Index BIII: A unique database of image analysis tools and workflows for and by the bioimaging community	Dec 18, 2023	Benchmarking	Unverified	0
MA-BBOB: A Problem Generator for Black-Box Optimization Using Affine Combinations and Shifts	Dec 18, 2023	Benchmarking	Unverified	0
QDA^2: A principled approach to automatically annotating charge stability diagrams	Dec 18, 2023	Benchmarking	Unverified	0
Code Ownership in Open-Source AI Software Security	Dec 18, 2023	Benchmarking	Code Available	0
FER-C: Benchmarking Out-of-Distribution Soft Calibration for Facial Expression Recognition	Dec 16, 2023	Benchmarking, Facial Expression Recognition	Unverified	0
How to Train Neural Field Representations: A Comprehensive Study and Benchmark	Dec 16, 2023	Benchmarking	Code Available	1
Enabling Accelerators for Graph Computing	Dec 16, 2023	Benchmarking	Unverified	0
A Novel Hybrid Ordinal Learning Model with Health Care Application	Dec 15, 2023	Benchmarking, Diagnostic	Unverified	0
ChemTime: Rapid and Early Classification for Multivariate Time Series Classification of Chemical Sensors	Dec 15, 2023	Benchmarking, Classification	Unverified	0
Binary Code Summarization: Benchmarking ChatGPT/GPT-4 and Other Large Language Models	Dec 15, 2023	Benchmarking, Code Summarization	Code Available	1
SPEAL: Skeletal Prior Embedded Attention Learning for Cross-Source Point Cloud Registration	Dec 14, 2023	Benchmarking, Point Cloud Registration	Unverified	0
Efficiently Quantifying Individual Agent Importance in Cooperative MARL	Dec 13, 2023	Benchmarking, Multi-agent Reinforcement Learning	Unverified	0
EventAid: Benchmarking Event-aided Image/Video Enhancement Algorithms with Real-captured Hybrid Dataset	Dec 13, 2023	Benchmarking, Deblurring	Unverified	0
Watchog: A Light-weight Contrastive Learning based Framework for Column Annotation	Dec 12, 2023	Benchmarking, Columns Property Annotation	Unverified	0
Benchmarking Deep Learning Classifiers for SAR Automatic Target Recognition	Dec 12, 2023	Benchmarking, Deep Learning	Unverified	0
Benchmarking Pretrained Vision Embeddings for Near- and Duplicate Detection in Medical Images	Dec 12, 2023	Benchmarking, Retrieval	Unverified	0
Meta-survey on outlier and anomaly detection	Dec 12, 2023	Anomaly Detection, Benchmarking	Code Available	0
How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation	Dec 12, 2023	Anomaly Detection, Autonomous Driving	Code Available	1
EgoPlan-Bench: Benchmarking Multimodal Large Language Models for Human-Level Planning	Dec 11, 2023	Benchmarking, Human-Object Interaction Detection	Code Available	1
Implementing hosting capacity analysis in distribution networks: Practical considerations, advancements and future directions	Dec 11, 2023	Benchmarking, Capacity Estimation	Unverified	0
Cataract-1K: Cataract Surgery Dataset for Scene Segmentation, Phase Recognition, and Irregularity Detection	Dec 11, 2023	Benchmarking, Domain Adaptation	Unverified	0
EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models	Dec 11, 2023	Benchmarking, Emotional Intelligence	Code Available	2
Benchmarking Distribution Shift in Tabular Data with TableShift	Dec 10, 2023	Benchmarking, Binary Classification	Code Available	1
AM-RADIO: Agglomerative Vision Foundation Model -- Reduce All Domains Into One	Dec 10, 2023	All, Benchmarking	Code Available	3
Graph-based Prediction and Planning Policy Network (GP3Net) for scalable self-driving in dynamic environments using Deep Reinforcement Learning	Dec 10, 2023	Autonomous Vehicles, Benchmarking	Unverified	0
Forecasting Lithium-Ion Battery Longevity with Limited Data Availability: Benchmarking Different Machine Learning Algorithms	Dec 10, 2023	Battery cycle life prediction, Benchmarking	Unverified	0
Benchmarking of Query Strategies: Towards Future Deep Active Learning	Dec 10, 2023	Active Learning, Benchmarking	Code Available	0
STREAMLINE: An Automated Machine Learning Pipeline for Biomedicine Applied to Examine the Utility of Photography-Based Phenotypes for OSA Prediction Across International Sleep Centers	Dec 9, 2023	Anatomy, AutoML	Code Available	1
An Experimental Study: Assessing the Combined Framework of WavLM and BEST-RQ for Text-to-Speech Synthesis	Dec 8, 2023	Benchmarking, Quantization	Unverified	0
Benchmarking and Analysis of Unsupervised Object Segmentation from Real-world Single Images	Dec 8, 2023	Benchmarking, Object	Code Available	1
Perspectives on the State and Future of Deep Learning -- 2023	Dec 7, 2023	Benchmarking, Deep Learning	Unverified	0
Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve Aerial Visual Perception?	Dec 7, 2023	Benchmarking, Diversity	Unverified	0
Pearl: A Production-ready Reinforcement Learning Agent	Dec 6, 2023	Benchmarking, reinforcement-learning	Code Available	4
Benchmarking Continual Learning from Cognitive Perspectives	Dec 6, 2023	Benchmarking, Continual Learning	Unverified	0
Can language agents be alternatives to PPO? A Preliminary Empirical Study On OpenAI Gym	Dec 6, 2023	Benchmarking, Decision Making	Code Available	1
KhabarChin: Automatic Detection of Important News in the Persian Language	Dec 6, 2023	Articles, Benchmarking	Code Available	0
Dyport: Dynamic Importance-based Hypothesis Generation Benchmarking Technique	Dec 6, 2023	Benchmarking, Knowledge Graphs	Code Available	0
Liquid State Genetic Programming	Dec 5, 2023	Benchmarking	Unverified	0
Semi-implicit Continuous Newton Method for Power Flow Analysis	Dec 5, 2023	Benchmarking, Numerical Integration	Unverified	0
SPOC: Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World	Dec 5, 2023	Benchmarking, Diversity	Unverified	0
BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models	Dec 5, 2023	Benchmarking, Visual Question Answering	Code Available	1
BEDD: The MineRL BASALT Evaluation and Demonstrations Dataset for Training and Benchmarking Agents that Solve Fuzzy Tasks	Dec 5, 2023	Benchmarking, Minecraft	Code Available	1
Let the LLMs Talk: Simulating Human-to-Human Conversational QA via Zero-Shot LLM-to-LLM Interactions	Dec 5, 2023	Benchmarking, Conversational Question Answering	Code Available	1
Contrastive Learning-Based Spectral Knowledge Distillation for Multi-Modality and Missing Modality Scenarios in Semantic Segmentation	Dec 4, 2023	Benchmarking, Contrastive Learning	Unverified	0
BenchMARL: Benchmarking Multi-Agent Reinforcement Learning	Dec 3, 2023	Benchmarking, Multi-agent Reinforcement Learning	Unverified	0
An Empirical Study of Automated Mislabel Detection in Real World Vision Datasets	Dec 2, 2023	Benchmarking	Unverified	0
Evetac: An Event-based Optical Tactile Sensor for Robotic Manipulation	Dec 2, 2023	Benchmarking	Unverified	0
Analyzing the Impact of Fake News on the Anticipated Outcome of the 2024 Election Ahead of Time	Dec 1, 2023	Articles, Benchmarking	Unverified	0
Identifying patterns and recommendations of and for sustainable open data initiatives: a benchmarking-driven analysis of open government data initiatives among European countries	Dec 1, 2023	Benchmarking	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified