| Kimi-Audio Technical Report | Apr 25, 2025 | Audio Question Answering; Question Answering | Code Available | 7 | 5 |
| Reinforcement Learning Outperforms Supervised Fine-Tuning: A Case Study on Audio Question Answering | Mar 14, 2025 | Audio Question Answering; Question Answering | Code Available | 3 | 5 |
| ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities | May 18, 2023 | 1 Image, 2*2 Stitchi; Action Classification | Code Available | 3 | 5 |
| Pengi: An Audio Language Model for Audio Tasks | May 19, 2023 | Audio captioning; Audio Question Answering | Code Available | 2 | 5 |
| GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities | Jun 17, 2024 | Audio Question Answering; Instruction Following | Code Available | 2 | 5 |
| Multi-Scale Attention for Audio Question Answering | May 29, 2023 | Audio Question Answering; Question Answering | Code Available | 1 | 5 |
| XLNet: Generalized Autoregressive Pretraining for Language Understanding | Jun 19, 2019 | Audio Question Answering; Chinese Reading Comprehension | Code Available | 1 | 5 |
| Temporal Reasoning via Audio Question Answering | Nov 21, 2019 | Audio Question Answering; Diagnostic | Code Available | 0 | 5 |
| Solla: Towards a Speech-Oriented LLM That Hears Acoustic Context | Mar 19, 2025 | Audio captioning; Audio Question Answering | Code Available | 0 | 5 |
| Audiopedia: Audio QA with Knowledge | Dec 29, 2024 | Audio Question Answering; Entity Linking | Code Available | 0 | 5 |