SOTAVerified

Benchmarking

Papers

Showing 2251–2300 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Benchmarking Multimodal RAG through a Chart-based Document Question-Answering Generation Framework | Code | 0 |
| Towards Segment Anything Model (SAM) for Medical Image Segmentation: A Survey | Code | 0 |
| How Far Are We from Optimal Reasoning Efficiency? | Code | 0 |
| Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program | Code | 0 |
| A Seq2Seq approach to Symbolic Regression | Code | 0 |
| How to Manage Tiny Machine Learning at Scale: An Industrial Perspective | Code | 0 |
| Benchmarking Multilabel Topic Classification in the Kyrgyz Language | Code | 0 |
| Benchmarking Multi-Image Understanding in Vision and Language Models: Perception, Knowledge, Reasoning, and Multi-Hop Reasoning | Code | 0 |
| A Continuous Optimisation Benchmark Suite from Neural Network Regression | Code | 0 |
| Benchmarking multi-component signal processing methods in the time-frequency plane | Code | 0 |
| HOEG: A New Approach for Object-Centric Predictive Process Monitoring | Code | 0 |
| HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios | Code | 0 |
| Aggregated Attributions for Explanatory Analysis of 3D Segmentation Models | Code | 0 |
| Benchmarking MOEAs for solving continuous multi-objective RL problems | Code | 0 |
| Hi Guys or Hi Folks? Benchmarking Gender-Neutral Machine Translation with the GeNTE Corpus | Code | 0 |
| Benchmarking Model-Based Reinforcement Learning | Code | 0 |
| Benchmarking Misuse Mitigation Against Covert Adversaries | Code | 0 |
| Benchmarking missing-values approaches for predictive models on health databases | Code | 0 |
| High-Quality, ROS Compatible Video Encoding and Decoding for High-Definition Datasets | Code | 0 |
| Importance of Disjoint Sampling in Conventional and Transformer Models for Hyperspectral Image Classification | Code | 0 |
| Benchmarking Minimax Linkage | Code | 0 |
| HERMES: Holographic Equivariant neuRal network model for Mutational Effect and Stability prediction | Code | 0 |
| EFSA: Towards Event-Level Financial Sentiment Analysis | Code | 0 |
| Efficient, Uncertainty-based Moderation of Neural Networks Text Classifiers | Code | 0 |
| Harnessing Orthogonality to Train Low-Rank Neural Networks | Code | 0 |
| HATE-ITA: New Baselines for Hate Speech Detection in Italian | Code | 0 |
| A Comparative Analysis of Word-Level Metric Differential Privacy: Benchmarking The Privacy-Utility Trade-off | Code | 0 |
| Efficient Performance Tracking: Leveraging Large Language Models for Automated Construction of Scientific Leaderboards | Code | 0 |
| Hardware Aware Neural Network Architectures using FbNet | Code | 0 |
| Heterogeneous Datasets for Federated Survival Analysis Simulation | Code | 0 |
| Efficiently solving the thief orienteering problem with a max-min ant colony optimization approach | Code | 0 |
| Benchmarking Machine Learning Robustness in Covid-19 Genome Sequence Classification | Code | 0 |
| gym-gazebo2, a toolkit for reinforcement learning using ROS 2 and Gazebo | Code | 0 |
| HammerBench: Fine-Grained Function-Calling Evaluation in Real Mobile Device Scenarios | Code | 0 |
| Guidelines and Benchmarks for Deployment of Deep Learning Models on Smartphones as Real-Time Apps | Code | 0 |
| Dynamic Neighborhood Construction for Structured Large Discrete Action Spaces | Code | 0 |
| Grounded Intuition of GPT-Vision's Abilities with Scientific Images | Code | 0 |
| Efficient and Effective Model Extraction | Code | 0 |
| Efficient and Accurate Optimal Transport with Mirror Descent and Conjugate Gradients | Code | 0 |
| Efficient Realistic Data Generation Framework leveraging Deep Learning-based Human Digitization | Code | 0 |
| Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs | Code | 0 |
| Harmonization Benchmarking Tool for Neuroimaging Datasets | Code | 0 |
| Grasp Pre-shape Selection by Synthetic Training: Eye-in-hand Shared Control on the Hannes Prosthesis | Code | 0 |
| Graph-theoretical approach to robust 3D normal extraction of LiDAR data | Code | 0 |
| GRATIS: GeneRAting TIme Series with diverse and controllable characteristics | Code | 0 |
| Grounding Synthetic Data Evaluations of Language Models in Unsupervised Document Corpora | Code | 0 |
| Hard-Label Cryptanalytic Extraction of Neural Network Models | Code | 0 |
| Hi-EF: Benchmarking Emotion Forecasting in Human-interaction | Code | 0 |
| Learning Conjoint Attentions for Graph Neural Nets | Code | 0 |
| Graph Convolutional Networks Meet with High Dimensionality Reduction | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified |