| Benchmarking LLM Guardrails in Handling Multilingual Toxicity | Oct 29, 2024 | Benchmarking | —Unverified | 0 | 0 |
| Benchmarking LLM for Code Smells Detection: OpenAI GPT-4.0 vs DeepSeek-V3 | Apr 22, 2025 | Benchmarking, Language Modeling | —Unverified | 0 | 0 |
| Towards a Unified Framework for Determining Conformational Ensembles of Disordered Proteins | Apr 4, 2025 | Benchmarking | —Unverified | 0 | 0 |
| Towards Benchmarking and Assessing the Safety and Robustness of Autonomous Driving on Safety-critical Scenarios | Mar 31, 2025 | Adversarial Attack, Autonomous Driving | —Unverified | 0 | 0 |
| Making Sense of Data in the Wild: Data Analysis Automation at Scale | Jan 27, 2025 | Benchmarking, Diversity | —Unverified | 0 | 0 |
| OrionBench: Benchmarking Time Series Generative Models in the Service of the End-User | Oct 26, 2023 | Anomaly Detection, Benchmarking | —Unverified | 0 | 0 |
| A Deep Q-Learning Method for Downlink Power Allocation in Multi-Cell Networks | Apr 30, 2019 | Benchmarking, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| Benchmarking LLM Code Generation for Audio Programming with Visual Dataflow Languages | Sep 1, 2024 | Benchmarking, Code Generation | —Unverified | 0 | 0 |
| Benchmarking LiDAR Sensors for Development and Evaluation of Automotive Perception | Apr 28, 2020 | Benchmarking, Systematic Literature Review | —Unverified | 0 | 0 |
| Towards Benchmarking and Evaluating Deepfake Detection | Mar 4, 2022 | Benchmarking, DeepFake Detection | —Unverified | 0 | 0 |