SOTAVerified

Benchmarking

Papers

Showing 1926–1950 of 5548 papers

Title | Status | Hype
Challenges in Benchmarking Stream Learning Algorithms with Real-world Data | | 0
CRS Arena: Crowdsourced Benchmarking of Conversational Recommender Systems | | 0
Benchmarking Edge AI Platforms for High-Performance ML Inference | | 0
Challenges and Pitfalls of Machine Learning Evaluation and Benchmarking | | 0
Benchmarking and Learning Multi-Dimensional Quality Evaluator for Text-to-3D Generation | | 0
CSPO: Cross-Market Synergistic Stock Price Movement Forecasting with Pseudo-volatility Optimization | | 0
CSR-Bench: Benchmarking LLM Agents in Deployment of Computer Science Research Repositories | | 0
Challenges and perspectives in computational deconvolution of genomics data | | 0
Evaluation of simulation methods for tumor subclonal reconstruction | | 0
Benchmarking and In-depth Performance Study of Large Language Models on Habana Gaudi Processors | | 0
AN ELIXIR FOR BLOCKCHAIN SCALABILITY WITH CHANNEL BASED CLUSTERED SHARDING | | 0
Challenges and Advancements in Modeling Shock Fronts with Physics-Informed Neural Networks: A Review and Benchmarking Study | | 0
CubeSat-Enabled Free-Space Optics: Joint Data Communication and Fine Beam Tracking | | 0
Benchmarking End-to-end Learning of MIMO Physical-Layer Communication | | 0
Challenge Results Are Not Reproducible | | 0
A Dataset Similarity Evaluation Framework for Wireless Communications and Sensing | | 0
Benchmarking End-To-End Performance of AI-Based Chip Placement Algorithms | | 0
ChakmaNMT: A Low-resource Machine Translation On Chakma Language | | 0
CURE: Concept Unlearning via Orthogonal Representation Editing in Diffusion Models | | 0
Benchmarking Energy-Conserving Neural Networks for Learning Dynamics from Data | | 0
Chain of LoRA: Efficient Fine-tuning of Language Models via Residual Learning | | 0
Audio Turing Test: Benchmarking the Human-likeness of Large Language Model-based Text-to-Speech Systems in Chinese | | 0
Curse of Slicing: Why Sliced Mutual Information is a Deceptive Measure of Statistical Dependence | | 0
Benchmarking Estimators for Natural Experiments: A Novel Dataset and a Doubly Robust Algorithm | | 0
C-FedRAG: A Confidential Federated Retrieval-Augmented Generation System | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified