| RepoAgent: An LLM-Powered Open-Source Framework for Repository-level Code Documentation Generation | Feb 26, 2024 | Code Documentation Generation; Code Generation | Code Available | 4 |
| Debug like a Human: A Large Language Model Debugger via Verifying Runtime Execution Step-by-step | Feb 25, 2024 | Code Generation; HumanEval | Code Available | 4 |
| AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling | Feb 19, 2024 | Language Modeling; Language Modelling | Code Available | 4 |
| Generative Representational Instruction Tuning | Feb 15, 2024 | Language Modeling; Language Modelling | Code Available | 4 |
| ScreenAgent: A Vision Language Model-driven Computer Control Agent | Feb 9, 2024 | Language Modeling; Language Modelling | Code Available | 4 |
| Spirit LM: Interleaved Spoken and Written Language Model | Feb 8, 2024 | Language Modeling; Language Modelling | Code Available | 4 |
| Image Fusion via Vision-Language Model | Feb 3, 2024 | Decoder; Language Modeling | Code Available | 4 |
| Mixtral of Experts | Jan 8, 2024 | Code Generation; Common Sense Reasoning | Code Available | 4 |
| LISA++: An Improved Baseline for Reasoning Segmentation with Large Language Model | Dec 28, 2023 | Instance Segmentation; Language Modeling | Code Available | 4 |
| G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model | Dec 18, 2023 | Language Modeling; Language Modelling | Code Available | 4 |
| FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects | Dec 13, 2023 | 3D Object Detection; 3D Object Tracking | Code Available | 4 |
| Unmasking and Improving Data Credibility: A Study with Datasets for Training Harmless Language Models | Nov 19, 2023 | Language Modeling; Language Modelling | Code Available | 4 |
| Video-LLaVA: Learning United Visual Representation by Alignment Before Projection | Nov 16, 2023 | Language Modeling; Language Modelling | Code Available | 4 |
| Unifying the Perspectives of NLP and Software Engineering: A Survey on Language Models for Code | Nov 14, 2023 | Language Model Evaluation; Language Modeling | Code Available | 4 |
| SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models | Nov 13, 2023 | Described Object Detection; Language Modeling | Code Available | 4 |
| mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration | Nov 7, 2023 | 1 Image, 2*2 Stitching; Decoder | Code Available | 4 |
| Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation | Oct 9, 2023 | Action Recognition; Image Generation | Code Available | 4 |
| Efficient Post-training Quantization with FP8 Formats | Sep 26, 2023 | image-classification; Image Classification | Code Available | 4 |
| Safurai 001: New Qualitative Approach for Code LLM Evaluation | Sep 20, 2023 | Language Modeling; Language Modelling | Code Available | 4 |
| A Survey on Large Language Model based Autonomous Agents | Aug 22, 2023 | Language Modeling; Language Modelling | Code Available | 4 |
| ChatHaruhi: Reviving Anime Character in Reality via Large Language Model | Aug 18, 2023 | Language Modeling; Language Modelling | Code Available | 4 |
| LISA: Reasoning Segmentation via Large Language Model | Aug 1, 2023 | Language Modeling; Language Modelling | Code Available | 4 |
| INT2.1: Towards Fine-Tunable Quantized Large Language Models with Error Correction through Low-Rank Adaptation | Jun 13, 2023 | Language Modeling; Language Modelling | Code Available | 4 |
| Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding | Jun 5, 2023 | Language Modeling; Language Modelling | Code Available | 4 |
| Reasoning with Language Model is Planning with World Model | May 24, 2023 | Language Modeling; Language Modelling | Code Available | 4 |
| VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks | May 18, 2023 | Decoder; Language Modeling | Code Available | 4 |
| Phoenix: Democratizing ChatGPT across Languages | Apr 20, 2023 | Language Modeling; Language Modelling | Code Available | 4 |
| Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data | Apr 3, 2023 | Chatbot; Language Modeling | Code Available | 4 |
| ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge | Mar 24, 2023 | Information Retrieval; Language Modeling | Code Available | 4 |
| Tag2Text: Guiding Vision-Language Model via Image Tagging | Mar 10, 2023 | Language Modeling; Language Modelling | Code Available | 4 |
| Cost-Effective Hyperparameter Optimization for Large Language Model Generation Inference | Mar 8, 2023 | Hyperparameter Optimization; Language Modeling | Code Available | 4 |
| BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | Jan 30, 2023 | Generative Visual Question Answering; Image Captioning | Code Available | 4 |
| Optimizing Prompts for Text-to-Image Generation | Dec 19, 2022 | Language Modeling; Language Modelling | Code Available | 4 |
| Galactica: A Large Language Model for Science | Nov 16, 2022 | Anachronisms; Bias Detection | Code Available | 4 |
| BLOOM: A 176B-Parameter Open-Access Multilingual Language Model | Nov 9, 2022 | Decoder; Language Modeling | Code Available | 4 |
| Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small | Nov 1, 2022 | Language Modeling; Language Modelling | Code Available | 4 |
| Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization | Aug 21, 2022 | Abstractive Text Summarization; Decoder | Code Available | 4 |
| Towards No.1 in CLUE Semantic Matching Challenge: Pre-trained Language Model Erlangshen with Propensity-Corrected Loss | Aug 5, 2022 | Language Modeling; Language Modelling | Code Available | 4 |
| N-Grammer: Augmenting Transformers with latent n-grams | Jul 13, 2022 | Common Sense Reasoning; Coreference Resolution | Code Available | 4 |
| GLIPv2: Unifying Localization and Vision-Language Understanding | Jun 12, 2022 | 2D Object Detection; Contrastive Learning | Code Available | 4 |
| Flamingo: a Visual Language Model for Few-Shot Learning | Apr 29, 2022 | Few-Shot Learning; Generative Visual Question Answering | Code Available | 4 |
| Can Machines Help Us Answering Question 16 in Datasheets, and In Turn Reflecting on Inappropriate Content? | Feb 14, 2022 | Language Modeling; Language Modelling | Code Available | 4 |
| ControlVAE: Tuning, Analytical Properties, and Performance Analysis | Oct 31, 2020 | Disentanglement; Image Generation | Code Available | 4 |
| ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation | Jun 22, 2025 | GPU; Image Generation | Code Available | 3 |
| FlexRAG: A Flexible and Comprehensive Framework for Retrieval-Augmented Generation | Jun 14, 2025 | Language Modeling; Language Modelling | Code Available | 3 |
| A Smart Multimodal Healthcare Copilot with Powerful LLM Reasoning | Jun 3, 2025 | Decision Making; Diagnostic | Code Available | 3 |
| VoiceStar: Robust Zero-Shot Autoregressive TTS with Duration Control and Extrapolation | May 26, 2025 | Decoder; Language Modeling | Code Available | 3 |
| LaViDa: A Large Diffusion Language Model for Multimodal Understanding | May 22, 2025 | Instruction Following; Language Modeling | Code Available | 3 |
| A Comprehensive Survey on Long Context Language Modeling | Mar 20, 2025 | Language Modeling; Language Modelling | Code Available | 3 |
| SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks | Mar 19, 2025 | Language Modeling; Language Modelling | Code Available | 3 |