| GPTs and Language Barrier: A Cross-Lingual Legal QA Examination | Mar 26, 2024 | Articles, Benchmarking | Unverified | 0 | 0 |
| Beyond Chains of Thought: Benchmarking Latent-Space Reasoning Abilities in Large Language Models | Apr 14, 2025 | Benchmarking, Descriptive | Unverified | 0 | 0 |
| Beyond Black-Box Benchmarking: Observability, Analytics, and Optimization of Agentic Systems | Mar 9, 2025 | Benchmarking | Unverified | 0 | 0 |
| Variational Laplace for Bayesian neural networks | Nov 20, 2020 | Benchmarking, Variational Inference | Unverified | 0 | 0 |
| Granite-speech: open-source speech-aware LLMs with strong English ASR capabilities | May 13, 2025 | automatic-speech-translation, Benchmarking | Unverified | 0 | 0 |
| Granular Change Accuracy: A More Accurate Performance Metric for Dialogue State Tracking | Mar 17, 2024 | Benchmarking, Dialogue State Tracking | Unverified | 0 | 0 |
| Graph Alignment for Benchmarking Graph Neural Networks and Learning Positional Encodings | May 19, 2025 | Benchmarking, Combinatorial Optimization | Unverified | 0 | 0 |
| Beyond Benchmarks: On The False Promise of AI Regulation | Jan 26, 2025 | Benchmarking | Unverified | 0 | 0 |
| Graph Attention-based Decentralized Actor-Critic for Dual-Objective Control of Multi-UAV Swarms | Jun 10, 2025 | Benchmarking, Graph Attention | Unverified | 0 | 0 |
| Graph-based Deep-Tree Recursive Neural Network (DTRNN) for Text Classification | Sep 4, 2018 | Benchmarking, General Classification | Unverified | 0 | 0 |