| ResearchCodeAgent: An LLM Multi-Agent System for Automated Codification of Research Methodologies | Apr 28, 2025 | Benchmarking, Data Augmentation | Unverified | 0 |
| BRIDGE: Benchmarking Large Language Models for Understanding Real-world Clinical Practice Text | Apr 28, 2025 | Benchmarking | Code Available | 1 |
| BrowseComp-ZH: Benchmarking Web Browsing Ability of Large Language Models in Chinese | Apr 27, 2025 | Benchmarking, Proper Noun | Code Available | 2 |
| Quantitative evaluation of brain-inspired vision sensors in high-speed robotic perception | Apr 27, 2025 | Benchmarking, Event-based vision | Unverified | 0 |
| The Convergent Ethics of AI? Analyzing Moral Foundation Priorities in Large Language Models with a Multi-Framework Approach | Apr 27, 2025 | Benchmarking, Decision Making | Unverified | 0 |
| Generative Models for Fast Simulation of Cherenkov Detectors at the Electron-Ion Collider | Apr 26, 2025 | Benchmarking, GPU | Code Available | 0 |
| Assessing the Utility of Audio Foundation Models for Heart and Respiratory Sound Analysis | Apr 25, 2025 | Benchmarking | Unverified | 0 |
| Token Sequence Compression for Efficient Multimodal Computing | Apr 24, 2025 | Benchmarking | Unverified | 0 |
| Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency | Apr 24, 2025 | Benchmarking, Math | Code Available | 1 |
| Design and benchmarking of a two degree of freedom tendon driver unit for cable-driven wearable technologies | Apr 24, 2025 | Benchmarking | Unverified | 0 |