| Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment | Jan 16, 2025 | Causal Inferencecounterfactual | CodeCode Available | 4 |
| Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data | Apr 3, 2023 | ChatbotLanguage Modeling | CodeCode Available | 4 |
| Galactica: A Large Language Model for Science | Nov 16, 2022 | AnachronismsBias Detection | CodeCode Available | 4 |
| RaTEScore: A Metric for Radiology Report Generation | Jun 24, 2024 | DiagnosticEntity Embeddings | CodeCode Available | 4 |
| MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series | May 29, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 4 |
| AutoWebGLM: A Large Language Model-based Web Navigating Agent | Apr 4, 2024 | Decision MakingLanguage Modeling | CodeCode Available | 4 |
| Partition Generative Modeling: Masked Modeling Without Masks | May 24, 2025 | Computational EfficiencyLanguage Modeling | CodeCode Available | 4 |
| Phoenix: Democratizing ChatGPT across Languages | Apr 20, 2023 | Language ModelingLanguage Modelling | CodeCode Available | 4 |
| Optimizing Prompts for Text-to-Image Generation | Dec 19, 2022 | Language ModelingLanguage Modelling | CodeCode Available | 4 |
| Photo-Realistic Image Restoration in the Wild with Controlled Vision-Language Models | Apr 15, 2024 | Image GenerationImage Restoration | CodeCode Available | 4 |
| Efficient Post-training Quantization with FP8 Formats | Sep 26, 2023 | image-classificationImage Classification | CodeCode Available | 4 |
| Gated Delta Networks: Improving Mamba2 with Delta Rule | Dec 9, 2024 | Common Sense ReasoningLanguage Modeling | CodeCode Available | 4 |
| Unifying the Perspectives of NLP and Software Engineering: A Survey on Language Models for Code | Nov 14, 2023 | Language Model EvaluationLanguage Modeling | CodeCode Available | 4 |
| AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct | May 23, 2024 | Class-level Code GenerationCode Completion | CodeCode Available | 4 |
| N-Grammer: Augmenting Transformers with latent n-grams | Jul 13, 2022 | Common Sense ReasoningCoreference Resolution | CodeCode Available | 4 |
| OLMoE: Open Mixture-of-Experts Language Models | Sep 3, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 4 |
| ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates | Feb 10, 2025 | Hierarchical Reinforcement LearningLanguage Modeling | CodeCode Available | 4 |
| Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models | Jun 3, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 4 |
| DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation | Mar 11, 2024 | Autonomous DrivingLanguage Modeling | CodeCode Available | 3 |
| DPLM-2: A Multimodal Diffusion Protein Language Model | Oct 17, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| Multi-objective Asynchronous Successive Halving | Jun 23, 2021 | FairnessHyperparameter Optimization | CodeCode Available | 3 |
| MultiModal-GPT: A Vision and Language Model for Dialogue with Humans | May 8, 2023 | Instruction FollowingLanguage Modeling | CodeCode Available | 3 |
| Multi-agent Architecture Search via Agentic Supernet | Feb 6, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| Multimodal Table Understanding | Jun 12, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| Discovering Language Model Behaviors with Model-Written Evaluations | Dec 19, 2022 | Language ModelingLanguage Modelling | CodeCode Available | 3 |