Benchmarking

Papers

Showing 2676–2700 of 5548 papers

Title	Date	Tasks	Status	Hype
Graph-based Prediction and Planning Policy Network (GP3Net) for scalable self-driving in dynamic environments using Deep Reinforcement Learning	Dec 10, 2023	Autonomous Vehicles, Benchmarking	Unverified	0
Forecasting Lithium-Ion Battery Longevity with Limited Data Availability: Benchmarking Different Machine Learning Algorithms	Dec 10, 2023	Battery cycle life prediction, Benchmarking	Unverified	0
Benchmarking of Query Strategies: Towards Future Deep Active Learning	Dec 10, 2023	Active Learning, Benchmarking	Code Available	0
STREAMLINE: An Automated Machine Learning Pipeline for Biomedicine Applied to Examine the Utility of Photography-Based Phenotypes for OSA Prediction Across International Sleep Centers	Dec 9, 2023	Anatomy, AutoML	Code Available	1
An Experimental Study: Assessing the Combined Framework of WavLM and BEST-RQ for Text-to-Speech Synthesis	Dec 8, 2023	Benchmarking, Quantization	Unverified	0
Benchmarking and Analysis of Unsupervised Object Segmentation from Real-world Single Images	Dec 8, 2023	Benchmarking, Object	Code Available	1
Perspectives on the State and Future of Deep Learning -- 2023	Dec 7, 2023	Benchmarking, Deep Learning	Unverified	0
Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve Aerial Visual Perception?	Dec 7, 2023	Benchmarking, Diversity	Unverified	0
Pearl: A Production-ready Reinforcement Learning Agent	Dec 6, 2023	Benchmarking, reinforcement-learning	Code Available	4
Benchmarking Continual Learning from Cognitive Perspectives	Dec 6, 2023	Benchmarking, Continual Learning	Unverified	0
Can language agents be alternatives to PPO? A Preliminary Empirical Study On OpenAI Gym	Dec 6, 2023	Benchmarking, Decision Making	Code Available	1
KhabarChin: Automatic Detection of Important News in the Persian Language	Dec 6, 2023	Articles, Benchmarking	Code Available	0
Dyport: Dynamic Importance-based Hypothesis Generation Benchmarking Technique	Dec 6, 2023	Benchmarking, Knowledge Graphs	Code Available	0
Liquid State Genetic Programming	Dec 5, 2023	Benchmarking	Unverified	0
Semi-implicit Continuous Newton Method for Power Flow Analysis	Dec 5, 2023	Benchmarking, Numerical Integration	Unverified	0
SPOC: Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World	Dec 5, 2023	Benchmarking, Diversity	Unverified	0
BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models	Dec 5, 2023	Benchmarking, Visual Question Answering	Code Available	1
BEDD: The MineRL BASALT Evaluation and Demonstrations Dataset for Training and Benchmarking Agents that Solve Fuzzy Tasks	Dec 5, 2023	Benchmarking, Minecraft	Code Available	1
Let the LLMs Talk: Simulating Human-to-Human Conversational QA via Zero-Shot LLM-to-LLM Interactions	Dec 5, 2023	Benchmarking, Conversational Question Answering	Code Available	1
Contrastive Learning-Based Spectral Knowledge Distillation for Multi-Modality and Missing Modality Scenarios in Semantic Segmentation	Dec 4, 2023	Benchmarking, Contrastive Learning	Unverified	0
BenchMARL: Benchmarking Multi-Agent Reinforcement Learning	Dec 3, 2023	Benchmarking, Multi-agent Reinforcement Learning	Unverified	0
An Empirical Study of Automated Mislabel Detection in Real World Vision Datasets	Dec 2, 2023	Benchmarking	Unverified	0
Evetac: An Event-based Optical Tactile Sensor for Robotic Manipulation	Dec 2, 2023	Benchmarking	Unverified	0
Analyzing the Impact of Fake News on the Anticipated Outcome of the 2024 Election Ahead of Time	Dec 1, 2023	Articles, Benchmarking	Unverified	0
Identifying patterns and recommendations of and for sustainable open data initiatives: a benchmarking-driven analysis of open government data initiatives among European countries	Dec 1, 2023	Benchmarking	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified