| KBLaM: Knowledge Base augmented Language Model | Oct 14, 2024 | 8k, GPU | Code Available | 5 |
| Repetition Improves Language Model Embeddings | Feb 23, 2024 | Language Modeling, Language Modelling | Code Available | 5 |
| Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos | Jan 7, 2025 | 2k, Language Modeling | Code Available | 5 |
| HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation | Feb 14, 2025 | Language Modeling, Language Modelling | Code Available | 5 |
| Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural audio codec models | Sep 21, 2024 | Language Modeling, Language Modelling | Code Available | 5 |
| Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond | Aug 24, 2023 | Chart Question Answering, FS-MEVQA | Code Available | 5 |
| CodeGen2: Lessons for Training LLMs on Programming and Natural Languages | May 3, 2023 | Causal Language Modeling, Decoder | Code Available | 5 |
| 4th PVUW MeViS 3rd Place Report: Sa2VA | Apr 1, 2025 | Language Modeling, Language Modelling | Code Available | 5 |
| Assessing Language Model Deployment with Risk Cards | Mar 31, 2023 | Language Modeling, Language Modelling | Code Available | 5 |
| R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcement Learning | Mar 7, 2025 | Emotion Recognition, Language Modeling | Code Available | 5 |
| CogAgent: A Visual Language Model for GUI Agents | Dec 14, 2023 | Language Modeling | Code Available | 5 |
| PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU | Dec 16, 2023 | CPU, GPU | Code Available | 5 |
| Chatlaw: A Multi-Agent Collaborative Legal Assistant with Knowledge Graph Enhanced Mixture-of-Experts Large Language Model | Jun 28, 2023 | Hallucination, Knowledge Graphs | Code Available | 5 |
| Ovis: Structural Embedding Alignment for Multimodal Large Language Model | May 31, 2024 | Language Modeling, Multimodal Large Language Model | Code Available | 5 |
| Improving Text-To-Audio Models with Synthetic Captions | Jun 18, 2024 | AudioCaps, Audio captioning | Code Available | 5 |
| Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models | May 2, 2024 | Language Modeling, Language Modelling | Code Available | 5 |
| Randomized Autoregressive Visual Generation | Nov 1, 2024 | Image Generation, Language Modeling | Code Available | 5 |
| Show-o2: Improved Native Unified Multimodal Models | Jun 18, 2025 | Language Modeling, Language Modelling | Code Available | 5 |
| N-Grammer: Augmenting Transformers with latent n-grams | Jul 13, 2022 | Common Sense Reasoning, Coreference Resolution | Code Available | 4 |
| Can Machines Help Us Answering Question 16 in Datasheets, and In Turn Reflecting on Inappropriate Content? | Feb 14, 2022 | Language Modeling, Language Modelling | Code Available | 4 |
| Galactica: A Large Language Model for Science | Nov 16, 2022 | Anachronisms, Bias Detection | Code Available | 4 |
| FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects | Dec 13, 2023 | 3D Object Detection, 3D Object Tracking | Code Available | 4 |
| Gated Delta Networks: Improving Mamba2 with Delta Rule | Dec 9, 2024 | Common Sense Reasoning, Language Modeling | Code Available | 4 |
| Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning | Mar 20, 2025 | Decision Making, Language Modeling | Code Available | 4 |
| Flamingo: a Visual Language Model for Few-Shot Learning | Apr 29, 2022 | Few-Shot Learning, Generative Visual Question Answering | Code Available | 4 |