| SALMONN: Towards Generic Hearing Abilities for Large Language Models | Oct 20, 2023 | Audio captioning, Automatic Speech Recognition | Code Available | 3 |
| Llemma: An Open Language Model For Mathematics | Oct 16, 2023 | Arithmetic Reasoning, Automated Theorem Proving | Code Available | 3 |
| OceanGPT: A Large Language Model for Ocean Science Tasks | Oct 3, 2023 | Language Modeling, Language Modelling | Code Available | 3 |
| WebGLM: Towards An Efficient Web-Enhanced Question Answering System with Human Preferences | Jun 13, 2023 | Language Modeling, Language Modelling | Code Available | 3 |
| How Can Recommender Systems Benefit from Large Language Models: A Survey | Jun 9, 2023 | Ethics, Feature Engineering | Code Available | 3 |
| HuatuoGPT, towards Taming Language Model to Be a Doctor | May 24, 2023 | Language Modeling, Language Modelling | Code Available | 3 |
| WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia | May 23, 2023 | Chatbot, Hallucination | Code Available | 3 |
| Hierarchical Prompting Assists Large Language Model on Web Navigation | May 23, 2023 | Decision Making, Language Modeling | Code Available | 3 |
| RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text | May 22, 2023 | Language Modelling, Large Language Model | Code Available | 3 |
| SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities | May 18, 2023 | Language Modeling, Language Modelling | Code Available | 3 |
| SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification | May 16, 2023 | Decoder, Language Modeling | Code Available | 3 |
| X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages | May 7, 2023 | Attribute, Instruction Following | Code Available | 3 |
| ThoughtSource: A central hub for large language model reasoning data | Jan 27, 2023 | Language Modeling, Language Modelling | Code Available | 3 |
| DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering | Jul 15, 2025 | Benchmarking, Instruction Following | Code Available | 2 |
| Seq vs Seq: An Open Suite of Paired Encoders and Decoders | Jul 15, 2025 | Decoder, Large Language Model | Code Available | 2 |
| Open Source Planning & Control System with Language Agents for Autonomous Scientific Discovery | Jul 9, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| HumanOmniV2: From Understanding to Omni-Modal Reasoning with Context | Jun 26, 2025 | Large Language Model, Multimodal Reasoning | Code Available | 2 |
| Confucius3-Math: A Lightweight High-Performance Reasoning LLM for Chinese K-12 Mathematics Learning | Jun 23, 2025 | GPU, Large Language Model | Code Available | 2 |
| Pre-Trained LLM is a Semantic-Aware and Generalizable Segmentation Booster | Jun 22, 2025 | Decoder, Image Segmentation | Code Available | 2 |
| video-SALMONN 2: Captioning-Enhanced Audio-Visual Large Language Models | Jun 18, 2025 | Audio captioning, Large Language Model | Code Available | 2 |
| SonicVerse: Multi-Task Learning for Music Feature-Informed Captioning | Jun 18, 2025 | Caption Generation, Descriptive | Code Available | 2 |
| SEC-bench: Automated Benchmarking of LLM Agents on Real-World Software Security Tasks | Jun 13, 2025 | Benchmarking, Large Language Model | Code Available | 2 |
| AutoMind: Adaptive Knowledgeable Agent for Automated Data Science | Jun 12, 2025 | Code Generation, Large Language Model | Code Available | 2 |
| Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions | Jun 9, 2025 | Large Language Model, Reinforcement Learning (RL) | Code Available | 2 |
| CyberGym: Evaluating AI Agents' Cybersecurity Capabilities with Real-World Vulnerabilities at Scale | Jun 3, 2025 | Large Language Model | Code Available | 2 |