| Benchmarking the Robustness of UAV Tracking Against Common Corruptions | Mar 18, 2024 | Benchmarking | Code Available | 0 |
| OpenEval: Benchmarking Chinese LLMs across Capability, Alignment and Safety | Mar 18, 2024 | Benchmarking, Mathematical Reasoning | Unverified | 0 |
| Granular Change Accuracy: A More Accurate Performance Metric for Dialogue State Tracking | Mar 17, 2024 | Benchmarking, Dialogue State Tracking | Unverified | 0 |
| FlowMind: Automatic Workflow Generation with LLMs | Mar 17, 2024 | Benchmarking, Question Answering | Unverified | 0 |
| Depression Detection on Social Media with Large Language Models | Mar 16, 2024 | Benchmarking, Depression Detection | Unverified | 0 |
| Benchmarking Adversarial Robustness of Image Shadow Removal with Shadow-adaptive Attacks | Mar 15, 2024 | Adversarial Attack, Adversarial Robustness | Unverified | 0 |
| Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study | Mar 15, 2024 | Benchmarking | Code Available | 0 |
| SpokeN-100: A Cross-Lingual Benchmarking Dataset for The Classification of Spoken Numbers in Different Languages | Mar 14, 2024 | Benchmarking, Dimensionality Reduction | Code Available | 0 |
| Attention-based Class-Conditioned Alignment for Multi-Source Domain Adaptation of Object Detectors | Mar 14, 2024 | Benchmarking, Domain Adaptation | Code Available | 0 |
| Semi-Supervised Learning for Anomaly Traffic Detection via Bidirectional Normalizing Flows | Mar 13, 2024 | Anomaly Detection, Benchmarking | Code Available | 0 |