| UBENCH: Benchmarking Uncertainty in Large Language Models with Multiple Choice Questions | Jun 18, 2024 | Benchmarking, Multiple-choice | Code Available | 0 |
| BED: Bi-Encoder-Based Detectors for Out-of-Distribution Detection | Jun 15, 2023 | Benchmarking, Out-of-Distribution Detection | Code Available | 0 |
| Replicable Benchmarking of Neural Machine Translation (NMT) on Low-Resource Local Languages in Indonesia | Nov 2, 2023 | Benchmarking, Machine Translation | Code Available | 0 |
| RUHSNet: 3D Object Detection Using Lidar Data in Real Time | May 9, 2020 | 3D Object Detection, Autonomous Vehicles | Code Available | 0 |
| Replication Study and Benchmarking of Real-Time Object Detection Models | May 11, 2024 | Benchmarking, object-detection | Code Available | 0 |
| IPC: A Benchmark Data Set for Learning with Graph-Structured Data | May 15, 2019 | Benchmarking, Graph Classification | Code Available | 0 |
| RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content | Jun 17, 2024 | Benchmarking, General Knowledge | Code Available | 0 |
| Building a Large Scale Dataset for Image Emotion Recognition: The Fine Print and The Benchmark | May 9, 2016 | Benchmarking, Emotion Recognition | Code Available | 0 |
| IoT Data Trust Evaluation via Machine Learning | Aug 15, 2023 | Benchmarking, Time Series | Code Available | 0 |
| Representation Learning of Limit Order Book: A Comprehensive Study and Benchmarking | May 4, 2025 | Benchmarking, Representation Learning | Code Available | 0 |