| Wildfire Forecasting with Satellite Images and Deep Generative Model | Aug 19, 2022 | Benchmarking, Video Prediction | —Unverified | 0 |
| WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences | Jun 16, 2024 | Benchmarking, Spatial Reasoning | —Unverified | 0 |
| Window-of-interest based Multi-objective Evolutionary Search for Satisficing Concepts | Jul 4, 2017 | Benchmarking | —Unverified | 0 |
| WiSoSuper: Benchmarking Super-Resolution Methods on Wind and Solar Data | Sep 17, 2021 | Benchmarking, BIG-bench Machine Learning | —Unverified | 0 |
| Word Complexity Estimation for Japanese Lexical Simplification | May 1, 2020 | Benchmarking, Lexical Simplification | —Unverified | 0 |
| WorldView-Bench: A Benchmark for Evaluating Global Cultural Perspectives in Large Language Models | May 14, 2025 | Benchmarking | —Unverified | 0 |
| Writing as a testbed for open ended agents | Mar 25, 2025 | Benchmarking, Diversity | —Unverified | 0 |
| xai_evals : A Framework for Evaluating Post-Hoc Local Explanation Methods | Feb 5, 2025 | Benchmarking | —Unverified | 0 |
| XCSP3: An Integrated Format for Benchmarking Combinatorial Constrained Problems | Nov 10, 2016 | Benchmarking | —Unverified | 0 |
| XLD: A Cross-Lane Dataset for Benchmarking Novel Driving View Synthesis | Jun 26, 2024 | Autonomous Driving, Benchmarking | —Unverified | 0 |