| PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding | Dec 7, 2023 | Diffusion Personalization, Diffusion Personalization Tuning Free | Code Available | 6 | 5 |
| Enhancing Financial Sentiment Analysis via Retrieval Augmented Large Language Models | Oct 6, 2023 | Decision Making, Retrieval | Code Available | 6 | 5 |
| CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis | Mar 25, 2022 | Code Generation, HumanEval | Code Available | 6 | 5 |
| TaskBench: Benchmarking Large Language Models for Task Automation | Nov 30, 2023 | Benchmarking, Parameter Prediction | Code Available | 6 | 5 |
| MemGPT: Towards LLMs as Operating Systems | Oct 12, 2023 | Management | Code Available | 6 | 5 |
| CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers | May 29, 2022 | Text-to-Video Generation, Video Generation | Code Available | 6 | 5 |
| TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis | Oct 5, 2022 | Action Recognition, Anomaly Detection | Code Available | 6 | 5 |
| PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit | May 20, 2022 | All, Automatic Speech Recognition (ASR) | Code Available | 6 | 5 |
| TabRepo: A Large Scale Repository of Tabular Model Evaluations and its AutoML Applications | Nov 6, 2023 | AutoML, Hyperparameter Optimization | Code Available | 6 | 5 |
| AudioGen: Textually Guided Audio Generation | Sep 30, 2022 | Audio Generation, Descriptive | Code Available | 6 | 5 |
| Data Formulator: AI-powered Concept-driven Visualization Authoring | Sep 18, 2023 | AI Agent | Code Available | 6 | 5 |
| SoundStorm: Efficient Parallel Audio Generation | May 16, 2023 | Audio Generation | Code Available | 6 | 5 |
| ART: Automatic multi-step reasoning and tool-use for large language models | Mar 16, 2023 | MMLU | Code Available | 6 | 5 |
| Distributed Inference and Fine-tuning of Large Language Models Over The Internet | Dec 13, 2023 | | Code Available | 6 | 5 |
| Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution | Jul 12, 2023 | Fairness, Image Classification | Code Available | 6 | 5 |
| Simple and Controllable Music Generation | Jun 8, 2023 | Language Modeling, Language Modelling | Code Available | 6 | 5 |
| RAGAS: Automated Evaluation of Retrieval Augmented Generation | Sep 26, 2023 | RAG, Retrieval | Code Available | 6 | 5 |
| MusicLM: Generating Music From Text | Jan 26, 2023 | Music Generation, Text-to-Music Generation | Code Available | 6 | 5 |
| Long Document Summarization with Top-down and Bottom-up Inference | Mar 15, 2022 | Text Summarization | Code Available | 6 | 5 |
| Training Compute-Optimal Large Language Models | Mar 29, 2022 | Anachronisms, Analogical Similarity | Code Available | 6 | 5 |
| Nerfstudio: A Modular Framework for Neural Radiance Field Development | Feb 8, 2023 | NeRF, Novel View Synthesis | Code Available | 6 | 5 |
| Extending Context Window of Large Language Models via Positional Interpolation | Jun 27, 2023 | Document Summarization, Language Modeling | Code Available | 6 | 5 |
| Seamless: Multilingual Expressive and Streaming Speech Translation | Dec 8, 2023 | automatic-speech-translation, Machine Translation | Code Available | 6 | 5 |
| SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models | Nov 28, 2023 | Video Generation | Code Available | 6 | 5 |
| SegRNN: Segment Recurrent Neural Network for Long-Term Time Series Forecasting | Aug 22, 2023 | Time Series, Time Series Forecasting | Code Available | 6 | 5 |
| Gorilla: Large Language Model Connected with Massive APIs | May 24, 2023 | Hallucination, Language Modeling | Code Available | 6 | 5 |
| HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face | Mar 30, 2023 | Automatic Machine Learning Model Selection, Model Selection | Code Available | 6 | 5 |
| U-Net v2: Rethinking the Skip Connections of U-Net for Medical Image Segmentation | Nov 29, 2023 | Computational Efficiency, Decoder | Code Available | 6 | 5 |
| FinRL-Meta: Market Environments and Benchmarks for Data-Driven Financial Reinforcement Learning | Nov 6, 2022 | Deep Reinforcement Learning, reinforcement-learning | Code Available | 6 | 5 |
| AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration | Jun 1, 2023 | Autonomous Driving, Cloud Computing | Code Available | 6 | 5 |
| OxfordVGG Submission to the EGO4D AV Transcription Challenge | Jul 18, 2023 | Automatic Speech Recognition, speech-recognition | Code Available | 6 | 5 |
| Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca | Apr 17, 2023 | | Code Available | 6 | 5 |
| Training language models to follow instructions with human feedback | Mar 4, 2022 | Question Answering | Code Available | 6 | 5 |
| MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation | Sep 19, 2022 | Decoder, Image Generation | Code Available | 5 | 5 |
| Unified Training of Universal Time Series Forecasting Transformers | Feb 4, 2024 | Time Series, Time Series Forecasting | Code Available | 5 | 5 |
| InstantCharacter: Personalize Any Characters with a Scalable Diffusion Transformer Framework | Apr 16, 2025 | Image Generation | Code Available | 5 | 5 |
| TimeMixer++: A General Time Series Pattern Machine for Universal Predictive Analysis | Oct 21, 2024 | Anomaly Detection, Imputation | Code Available | 5 | 5 |
| Learning Flow Fields in Attention for Controllable Person Image Generation | Dec 11, 2024 | Attribute, Image Generation | Code Available | 5 | 5 |
| MING-MOE: Enhancing Medical Multi-Task Learning in Large Language Models with Sparse Mixture of Low-Rank Adapter Experts | Apr 13, 2024 | Diversity, Language Modeling | Code Available | 5 | 5 |
| Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively | Jan 5, 2024 | image-classification, Image Classification | Code Available | 5 | 5 |
| Common 7B Language Models Already Possess Strong Math Capabilities | Mar 7, 2024 | GSM8K, Math | Code Available | 5 | 5 |
| Fast On-device LLM Inference with NPUs | Jul 8, 2024 | CPU, GPU | Code Available | 5 | 5 |
| VideoCrafter1: Open Diffusion Models for High-Quality Video Generation | Oct 30, 2023 | Text-to-Video Generation, Video Generation | Code Available | 5 | 5 |
| Efficient Multimodal Learning from Data-centric Perspective | Feb 18, 2024 | Image Classification, Referring Expression Comprehension | Code Available | 5 | 5 |
| RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation | Aug 15, 2024 | Diagnostic, RAG | Code Available | 5 | 5 |
| Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference | Dec 18, 2024 | Decoder, Retrieval | Code Available | 5 | 5 |
| StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning | Jun 5, 2024 | Automatic Speech Recognition (ASR), de-en | Code Available | 5 | 5 |
| A ConvNet for the 2020s | Jan 10, 2022 | Classification, Domain Generalization | Code Available | 5 | 5 |
| A Time Series is Worth 64 Words: Long-term Forecasting with Transformers | Nov 27, 2022 | Multivariate Time Series Forecasting, Representation Learning | Code Available | 5 | 5 |
| Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization | Apr 15, 2024 | Audio Generation | Code Available | 5 | 5 |