| UDTIRI: An Online Open-Source Intelligent Road Inspection Benchmark Suite | Apr 18, 2023 | Benchmarking, Instance Segmentation | Unverified | 0 |
| OOD-CV-v2: An extended Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images | Apr 17, 2023 | 3D Pose Estimation, Benchmarking | Unverified | 0 |
| Towards Computational Performance Engineering for Unsupervised Concept Drift Detection -- Complexities, Benchmarking, Performance Analysis | Apr 17, 2023 | Benchmarking, Drift Detection | Code Available | 0 |
| Dialogue Games for Benchmarking Language Understanding: Motivation, Taxonomy, Strategy | Apr 14, 2023 | Benchmarking | Unverified | 0 |
| Improving Items and Contexts Understanding with Descriptive Graph for Conversational Recommendation | Apr 11, 2023 | Benchmarking, Conversational Recommendation | Unverified | 0 |
| Benchmarking the Physical-world Adversarial Robustness of Vehicle Detection | Apr 11, 2023 | Adversarial Attack, Adversarial Robustness | Unverified | 0 |
| OpenAGI: When LLM Meets Domain Experts | Apr 10, 2023 | Benchmarking, Natural Language Queries | Code Available | 4 |
| NeuroBench: A Framework for Benchmarking Neuromorphic Computing Algorithms and Systems | Apr 10, 2023 | Benchmarking | Code Available | 1 |
| Certifiable Black-Box Attacks with Randomized Adversarial Examples: Breaking Defenses with Provable Confidence | Apr 10, 2023 | Benchmarking, speech-recognition | Code Available | 0 |
| On Evaluation of Bangla Word Analogies | Apr 10, 2023 | Benchmarking, Word Embeddings | Unverified | 0 |