| SALMONN: Towards Generic Hearing Abilities for Large Language Models | Oct 20, 2023 | Audio captioning, Automatic Speech Recognition | Code Available | 3 |
| Llemma: An Open Language Model For Mathematics | Oct 16, 2023 | Arithmetic Reasoning, Automated Theorem Proving | Code Available | 3 |
| OceanGPT: A Large Language Model for Ocean Science Tasks | Oct 3, 2023 | Language Modeling, Language Modelling | Code Available | 3 |
| WebGLM: Towards An Efficient Web-Enhanced Question Answering System with Human Preferences | Jun 13, 2023 | Language Modeling, Language Modelling | Code Available | 3 |
| How Can Recommender Systems Benefit from Large Language Models: A Survey | Jun 9, 2023 | Ethics, Feature Engineering | Code Available | 3 |
| HuatuoGPT, towards Taming Language Model to Be a Doctor | May 24, 2023 | Language Modeling, Language Modelling | Code Available | 3 |
| WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia | May 23, 2023 | Chatbot, Hallucination | Code Available | 3 |
| Hierarchical Prompting Assists Large Language Model on Web Navigation | May 23, 2023 | Decision Making, Language Modeling | Code Available | 3 |
| RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text | May 22, 2023 | Language Modelling, Large Language Model | Code Available | 3 |
| SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities | May 18, 2023 | Language Modeling, Language Modelling | Code Available | 3 |
| SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification | May 16, 2023 | Decoder, Language Modeling | Code Available | 3 |
| X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages | May 7, 2023 | Attribute, Instruction Following | Code Available | 3 |
| ThoughtSource: A central hub for large language model reasoning data | Jan 27, 2023 | Language Modeling, Language Modelling | Code Available | 3 |
| Seq vs Seq: An Open Suite of Paired Encoders and Decoders | Jul 15, 2025 | Decoder, Large Language Model | Code Available | 2 |
| DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering | Jul 15, 2025 | Benchmarking, Instruction Following | Code Available | 2 |
| Open Source Planning & Control System with Language Agents for Autonomous Scientific Discovery | Jul 9, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| HumanOmniV2: From Understanding to Omni-Modal Reasoning with Context | Jun 26, 2025 | Large Language Model, Multimodal Reasoning | Code Available | 2 |
| Confucius3-Math: A Lightweight High-Performance Reasoning LLM for Chinese K-12 Mathematics Learning | Jun 23, 2025 | GPU, Large Language Model | Code Available | 2 |
| Pre-Trained LLM is a Semantic-Aware and Generalizable Segmentation Booster | Jun 22, 2025 | Decoder, Image Segmentation | Code Available | 2 |
| SonicVerse: Multi-Task Learning for Music Feature-Informed Captioning | Jun 18, 2025 | Caption Generation, Descriptive | Code Available | 2 |
| video-SALMONN 2: Captioning-Enhanced Audio-Visual Large Language Models | Jun 18, 2025 | Audio captioning, Large Language Model | Code Available | 2 |
| SEC-bench: Automated Benchmarking of LLM Agents on Real-World Software Security Tasks | Jun 13, 2025 | Benchmarking, Large Language Model | Code Available | 2 |
| AutoMind: Adaptive Knowledgeable Agent for Automated Data Science | Jun 12, 2025 | Code Generation, Large Language Model | Code Available | 2 |
| Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions | Jun 9, 2025 | Large Language Model, Reinforcement Learning (RL) | Code Available | 2 |
| CyberGym: Evaluating AI Agents' Cybersecurity Capabilities with Real-World Vulnerabilities at Scale | Jun 3, 2025 | Large Language Model | Code Available | 2 |
| Reasoning-Table: Exploring Reinforcement Learning for Table Reasoning | Jun 2, 2025 | Fact Verification, Language Modeling | Code Available | 2 |
| Compiler Optimization via LLM Reasoning for Efficient Model Serving | Jun 2, 2025 | Compiler Optimization, Large Language Model | Code Available | 2 |
| FusionAudio-1.2M: Towards Fine-grained Audio Captioning with Multimodal Contextual Fusion | Jun 1, 2025 | Audio captioning, Caption Generation | Code Available | 2 |
| GeoVision Labeler: Zero-Shot Geospatial Classification with Vision and Language Models | May 30, 2025 | Classification, Disaster Response | Code Available | 2 |
| ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering | May 29, 2025 | Large Language Model, Prompt Engineering | Code Available | 2 |
| cadrille: Multi-modal CAD Reconstruction with Online Reinforcement Learning | May 28, 2025 | CAD Reconstruction, Large Language Model | Code Available | 2 |
| Zero-Shot Vision Encoder Grafting via LLM Surrogates | May 28, 2025 | Decoder, Language Modeling | Code Available | 2 |
| LLaMEA-BO: A Large Language Model Evolutionary Algorithm for Automatically Generating Bayesian Optimization Algorithms | May 27, 2025 | Bayesian Optimization, Benchmarking | Code Available | 2 |
| WINA: Weight Informed Neuron Activation for Accelerating Large Language Model Inference | May 26, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding | May 22, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| Web-Shepherd: Advancing PRMs for Reinforcing Web Agents | May 21, 2025 | Large Language Model, Multimodal Large Language Model | Code Available | 2 |
| CPRet: A Dataset, Benchmark, and Model for Retrieval in Competitive Programming | May 19, 2025 | Fairness, Large Language Model | Code Available | 2 |
| LifelongAgentBench: Evaluating LLM Agents as Lifelong Learners | May 17, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| Demystifying and Enhancing the Efficiency of Large Language Model Based Search Agents | May 17, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| Large Language Model Psychometrics: A Systematic Review of Evaluation, Validation, and Enhancement | May 13, 2025 | Benchmarking, Language Modeling | Code Available | 2 |
| YuLan-OneSim: Towards the Next Generation of Social Simulator with Large Language Models | May 12, 2025 | Large Language Model, Sociology | Code Available | 2 |
| MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering | May 12, 2025 | Large Language Model, reinforcement-learning | Code Available | 2 |
| DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation | May 12, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance | May 11, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| Apply Hierarchical-Chain-of-Generation to Complex Attributes Text-to-3D Generation | May 7, 2025 | 3D Generation, Attribute | Code Available | 2 |
| MemEngine: A Unified and Modular Library for Developing Advanced Memory of LLM-based Agents | May 4, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer | Apr 14, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| ClinicalGPT-R1: Pushing reasoning capability of generalist disease diagnosis with large language model | Apr 13, 2025 | Diagnostic, Language Modeling | Code Available | 2 |
| SegEarth-R1: Geospatial Pixel Reasoning via Large Language Model | Apr 13, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation | Apr 10, 2025 | Contrastive Learning, Language Modeling | Code Available | 2 |