SOTAVerified

Benchmarking

Papers

Showing 2676–2700 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Graph-based Prediction and Planning Policy Network (GP3Net) for scalable self-driving in dynamic environments using Deep Reinforcement Learning | | 0 |
| Forecasting Lithium-Ion Battery Longevity with Limited Data Availability: Benchmarking Different Machine Learning Algorithms | | 0 |
| Benchmarking of Query Strategies: Towards Future Deep Active Learning | Code | 0 |
| STREAMLINE: An Automated Machine Learning Pipeline for Biomedicine Applied to Examine the Utility of Photography-Based Phenotypes for OSA Prediction Across International Sleep Centers | Code | 1 |
| An Experimental Study: Assessing the Combined Framework of WavLM and BEST-RQ for Text-to-Speech Synthesis | | 0 |
| Benchmarking and Analysis of Unsupervised Object Segmentation from Real-world Single Images | Code | 1 |
| Perspectives on the State and Future of Deep Learning -- 2023 | | 0 |
| Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve Aerial Visual Perception? | | 0 |
| Pearl: A Production-ready Reinforcement Learning Agent | Code | 4 |
| Benchmarking Continual Learning from Cognitive Perspectives | | 0 |
| Can language agents be alternatives to PPO? A Preliminary Empirical Study On OpenAI Gym | Code | 1 |
| KhabarChin: Automatic Detection of Important News in the Persian Language | Code | 0 |
| Dyport: Dynamic Importance-based Hypothesis Generation Benchmarking Technique | Code | 0 |
| Liquid State Genetic Programming | | 0 |
| Semi-implicit Continuous Newton Method for Power Flow Analysis | | 0 |
| SPOC: Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World | | 0 |
| BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models | Code | 1 |
| BEDD: The MineRL BASALT Evaluation and Demonstrations Dataset for Training and Benchmarking Agents that Solve Fuzzy Tasks | Code | 1 |
| Let the LLMs Talk: Simulating Human-to-Human Conversational QA via Zero-Shot LLM-to-LLM Interactions | Code | 1 |
| Contrastive Learning-Based Spectral Knowledge Distillation for Multi-Modality and Missing Modality Scenarios in Semantic Segmentation | | 0 |
| BenchMARL: Benchmarking Multi-Agent Reinforcement Learning | | 0 |
| An Empirical Study of Automated Mislabel Detection in Real World Vision Datasets | | 0 |
| Evetac: An Event-based Optical Tactile Sensor for Robotic Manipulation | | 0 |
| Analyzing the Impact of Fake News on the Anticipated Outcome of the 2024 Election Ahead of Time | | 0 |
| Identifying patterns and recommendations of and for sustainable open data initiatives: a benchmarking-driven analysis of open government data initiatives among European countries | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |