Benchmarking

Papers

Showing 3126–3150 of 5548 papers

Title	Date	Tasks	Status	Hype
Structural Property Prediction	Jul 5, 2023	Benchmarking, Prediction	Unverified	0
Performance Modeling of Data Storage Systems using Generative Models	Jul 5, 2023	Benchmarking	Code Available	0
Unsupervised Spectral Demosaicing with Lightweight Spectral Attention Networks	Jul 5, 2023	Benchmarking, Demosaicking	Unverified	0
ClimateLearn: Benchmarking Machine Learning for Weather and Climate Modeling	Jul 4, 2023	Benchmarking, Weather Forecasting	Code Available	2
OpenSiteRec: An Open Dataset for Site Recommendation	Jul 3, 2023	Benchmarking, Information Retrieval	Unverified	0
A Synthetic Benchmarking Pipeline to Compare Camera Calibration Algorithms	Jul 3, 2023	Benchmarking, Camera Calibration	Unverified	0
Conditionally Invariant Representation Learning for Disentangling Cellular Heterogeneity	Jul 2, 2023	Benchmarking, Data Integration	Unverified	0
SysNoise: Exploring and Benchmarking Training-Deployment System Inconsistency	Jul 1, 2023	Benchmarking, Data Augmentation	Unverified	0
InstructEval: Systematic Evaluation of Instruction Selection Methods	Jul 1, 2023	Benchmarking, In-Context Learning	Unverified	0
Learning Environment Models with Continuous Stochastic Dynamics	Jun 29, 2023	Acrobot, Benchmarking	Unverified	0
Benchmarking Large Language Model Capabilities for Conditional Generation	Jun 29, 2023	Benchmarking, Few-Shot Learning	Unverified	0
Principles and Guidelines for Evaluating Social Robot Navigation Algorithms	Jun 29, 2023	Benchmarking, Robot Navigation	Unverified	0
Generative AI for Programming Education: Benchmarking ChatGPT, GPT-4, and Human Tutors	Jun 29, 2023	Benchmarking	Unverified	0
Uncovering the Limits of Machine Learning for Automatic Vulnerability Detection	Jun 28, 2023	Benchmarking, Data Augmentation	Code Available	1
Benchmarking Zero-Shot Recognition with Vision-Language Models: Challenges on Granularity and Specificity	Jun 28, 2023	Benchmarking, Image Captioning	Unverified	0
Effective Transfer of Pretrained Large Visual Model for Fabric Defect Segmentation via Specifc Knowledge Injection	Jun 28, 2023	Benchmarking, Diversity	Unverified	0
Emotion Analysis of Tweets Banning Education in Afghanistan	Jun 28, 2023	Benchmarking, Emotion Classification	Unverified	0
Paradigm Shift in Sustainability Disclosure Analysis: Empowering Stakeholders with CHATREPORT, a Language Model-Based Tool	Jun 27, 2023	Benchmarking, Language Modeling	Unverified	0
Pulse Shape-Aided Multipath Delay Estimation for Fine-Grained WiFi Sensing	Jun 27, 2023	Benchmarking	Unverified	0
Benchmarking Stroke Forecasting with Stroke-Level Badminton Dataset	Jun 27, 2023	Benchmarking	Unverified	0
Enhancing Navigation Benchmarking and Perception Data Generation for Row-based Crops in Simulation	Jun 27, 2023	Autonomous Navigation, Benchmarking	Unverified	0
SCENEREPLICA: Benchmarking Real-World Robot Manipulation by Creating Replicable Scenes	Jun 27, 2023	Benchmarking, Motion Planning	Code Available	1
InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback	Jun 26, 2023	Benchmarking, Code Generation	Code Available	2
Improving Reference-based Distinctive Image Captioning with Contrastive Rewards	Jun 25, 2023	Benchmarking, Contrastive Learning	Unverified	0
Hybrid Precoder and Combiner Designs for Decentralized Parameter Estimation in mmWave MIMO Wireless Sensor Networks	Jun 25, 2023	Benchmarking, parameter estimation	Unverified	0


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified