| Benchmarking VLMs' Reasoning About Persuasive Atypical Images | Sep 16, 2024 | Benchmarking, Object Recognition | Unverified | 0 |
| Benchmarking Large Language Model Uncertainty for Prompt Optimization | Sep 16, 2024 | Benchmarking, Diversity | Code Available | 0 |
| Benchmarking LLMs in Political Content Text-Annotation: Proof-of-Concept with Toxicity and Incivility Data | Sep 15, 2024 | Benchmarking, text annotation | Unverified | 0 |
| Byzantine-Robust and Communication-Efficient Distributed Learning via Compressed Momentum Filtering | Sep 13, 2024 | Benchmarking, Binary Classification | Unverified | 0 |
| LLM-Powered Grapheme-to-Phoneme Conversion: Benchmark and Case Study | Sep 13, 2024 | Benchmarking, Grapheme-to-Phoneme Conversion | Unverified | 0 |
| Text-To-Speech Synthesis In The Wild | Sep 13, 2024 | Benchmarking, Speaker Recognition | Unverified | 0 |
| ODAQ: Open Dataset of Audio Quality - Benchmark on GitHub | Sep 13, 2024 | Audio Quality Assessment, Benchmarking | Code Available | 1 |
| Introducing CausalBench: A Flexible Benchmark Framework for Causal Analysis and Machine Learning | Sep 12, 2024 | Benchmarking, Fairness | Unverified | 0 |
| Linear energy storage and flexibility model with ramp rate, ramping, deadline and capacity constraints | Sep 12, 2024 | Benchmarking | Code Available | 0 |
| Online vs Offline: A Comparative Study of First-Party and Third-Party Evaluations of Social Chatbots | Sep 12, 2024 | Benchmarking, Chatbot | Unverified | 0 |