| Benchmarking Ultra-Low-Power μNPUs | Mar 28, 2025 | Benchmarking | —Unverified | 0 | 0 |
| How Good is a Video Summary? A New Benchmarking Dataset and Evaluation Framework Towards Realistic Video Summarization | Jan 26, 2021 | Benchmarking, Supervised Video Summarization | —Unverified | 0 | 0 |
| Evaluating the Efficacy of Foundational Models: Advancing Benchmarking Practices to Enhance Fine-Tuning Decision-Making | Jun 25, 2024 | Benchmarking, Decision Making | —Unverified | 0 | 0 |
| How Good Is Neural Combinatorial Optimization? A Systematic Evaluation on the Traveling Salesman Problem | Sep 22, 2022 | Benchmarking, Combinatorial Optimization | —Unverified | 0 | 0 |
| How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference | May 14, 2025 | Benchmarking | —Unverified | 0 | 0 |
| How much progress have we made in neural network training? A New Evaluation Protocol for Benchmarking Optimizers | Oct 19, 2020 | Benchmarking, Graph Mining | —Unverified | 0 | 0 |
| How Propense Are Large Language Models at Producing Code Smells? A Benchmarking Study | Dec 25, 2024 | Benchmarking, Code Generation | —Unverified | 0 | 0 |
| Benchmarking Ultra-High-Definition Image Super-Resolution | Jan 1, 2021 | 4k, 8k | —Unverified | 0 | 0 |
| The FACTS Grounding Leaderboard: Benchmarking LLMs' Ability to Ground Responses to Long-Form Input | Jan 6, 2025 | Benchmarking, Form | —Unverified | 0 | 0 |
| Benchmarking Twitter Sentiment Analysis Tools | May 1, 2014 | Benchmarking, Decision Making | —Unverified | 0 | 0 |