| ASI: Accuracy-Stability Index for Evaluating Deep Learning Models | Nov 26, 2023 | BenchmarkingDeep Learning | —Unverified | 0 |
| An Empirical Investigation into Benchmarking Model Multiplicity for Trustworthy Machine Learning: A Case Study on Image Classification | Nov 24, 2023 | Benchmarkingimage-classification | —Unverified | 0 |
| Benchmarking Robustness of Text-Image Composed Retrieval | Nov 24, 2023 | AttributeBenchmarking | CodeCode Available | 1 |
| Large Language Models as Automated Aligners for benchmarking Vision-Language Models | Nov 24, 2023 | BenchmarkingWorld Knowledge | —Unverified | 0 |
| Dialogue Quality and Emotion Annotations for Customer Support Conversations | Nov 23, 2023 | BenchmarkingDiversity | CodeCode Available | 0 |
| Learning Dynamic Selection and Pricing of Out-of-Home Deliveries | Nov 23, 2023 | BenchmarkingDecision Making | CodeCode Available | 0 |
| Creating and Leveraging a Synthetic Dataset of Cloud Optical Thickness Measures for Cloud Detection in MSI | Nov 23, 2023 | BenchmarkingCloud Detection | CodeCode Available | 0 |
| Automated 3D Tumor Segmentation using Temporal Cubic PatchGAN (TCuP-GAN) | Nov 23, 2023 | BenchmarkingBrain Tumor Segmentation | —Unverified | 0 |
| PG-Video-LLaVA: Pixel Grounding Large Video-Language Models | Nov 22, 2023 | BenchmarkingPhrase Grounding | CodeCode Available | 2 |
| Benchmarking Toxic Molecule Classification using Graph Neural Networks and Few Shot Learning | Nov 22, 2023 | BenchmarkingDrug Discovery | —Unverified | 0 |