| Intent-based Prompt Calibration: Enhancing prompt optimization with synthetic boundary cases | Feb 5, 2024 | Prompt Engineering | Code Available | 7 |
| Robust Inverse Graphics via Probabilistic Inference | Feb 2, 2024 | NeRF | Code Available | 7 |
| Endo-4DGS: Endoscopic Monocular Scene Reconstruction with 4D Gaussian Splatting | Jan 29, 2024 | Depth Estimation, Dynamic Reconstruction | Code Available | 7 |
| MoE-LLaVA: Mixture of Experts for Large Vision-Language Models | Jan 29, 2024 | Hallucination, Mixture-of-Experts | Code Available | 7 |
| EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty | Jan 26, 2024 | Code Generation, Instruction Following | Code Available | 7 |
| Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers | Jan 21, 2024 | Image Generation | Code Available | 7 |
| Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads | Jan 19, 2024 | | Code Available | 7 |
| VMamba: Visual State Space Model | Jan 18, 2024 | Computational Efficiency, Language Modeling | Code Available | 7 |
| Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering | Jan 16, 2024 | Code Generation, Prompt Engineering | Code Available | 7 |
| HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance | Jan 16, 2024 | In-Context Learning | Code Available | 7 |
| Exploring Compressed Image Representation as a Perceptual Proxy: A Study | Jan 14, 2024 | Image Compression, Perceptual Distance | Code Available | 7 |
| PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models | Jan 10, 2024 | GPU, Image Generation | Code Available | 7 |
| DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference | Jan 9, 2024 | Benchmarking, Text Generation | Code Available | 7 |
| Bilateral Reference for High-Resolution Dichotomous Image Segmentation | Jan 7, 2024 | Camouflaged Object Segmentation, Dichotomous Image Segmentation | Code Available | 7 |
| From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations | Jan 3, 2024 | Diversity, Quantization | Code Available | 7 |
| OpenVoice: Versatile Instant Voice Cloning | Dec 3, 2023 | Rhythm, Voice Cloning | Code Available | 7 |
| MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning | Oct 14, 2023 | Image Classification, Image Description | Code Available | 7 |
| Prometheus: Inducing Fine-grained Evaluation Capability in Language Models | Oct 12, 2023 | Language Modelling, Large Language Model | Code Available | 7 |
| DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines | Oct 5, 2023 | Language Modeling, Language Modelling | Code Available | 7 |
| LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset | Sep 21, 2023 | Chatbot, Diversity | Code Available | 7 |
| Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena | Jun 9, 2023 | Chatbot, Language Modelling | Code Available | 7 |
| Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold | May 18, 2023 | Image Manipulation, Point Tracking | Code Available | 7 |
| Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models | May 6, 2023 | Math | Code Available | 7 |
| Full Scaling Automation for Sustainable Development of Green Data Centers | May 1, 2023 | Cloud Computing, CPU | Code Available | 7 |
| EasySpider: A No-Code Visual System for Crawling the Web | Apr 30, 2023 | Data Integration, Marketing | Code Available | 7 |
| Measuring Massive Multitask Chinese Understanding | Apr 25, 2023 | All | Code Available | 7 |
| MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models | Apr 20, 2023 | Image Description, Language Modelling | Code Available | 7 |
| Low-code LLM: Graphical User Interface over Large Language Models | Apr 17, 2023 | Prompt Engineering | Code Available | 7 |
| Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models | Mar 8, 2023 | | Code Available | 7 |
| LLaMA: Open and Efficient Foundation Language Models | Feb 27, 2023 | Arithmetic Reasoning, Code Generation | Code Available | 7 |
| Adding Conditional Control to Text-to-Image Diffusion Models | Feb 10, 2023 | Image Generation, Layout-to-Image Generation | Code Available | 7 |
| MaskSketch: Unpaired Structure-guided Masked Image Generation | Feb 10, 2023 | Conditional Image Generation, Diversity | Code Available | 7 |
| Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models | Feb 6, 2023 | Scheduling | Code Available | 7 |
| Domain Expansion of Image Generators | Jan 12, 2023 | | Code Available | 7 |
| Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers | Jan 5, 2023 | In-Context Learning, Language Modeling | Code Available | 7 |
| Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP | Dec 28, 2022 | In-Context Learning, Language Modelling | Code Available | 7 |
| Elixir: Train a Large Language Model on a Small GPU Cluster | Dec 10, 2022 | CPU, GPU | Code Available | 7 |
| Easy Begun is Half Done: Spatial-Temporal Graph Modeling with ST-Curriculum Dropout | Nov 28, 2022 | | Code Available | 7 |
| GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers | Oct 31, 2022 | GPU, Language Modelling | Code Available | 7 |
| Improving Sample Quality of Diffusion Models Using Self-Attention Guidance | Oct 3, 2022 | Denoising, Diversity | Code Available | 7 |
| AudioLM: a Language Modeling Approach to Audio Generation | Sep 7, 2022 | Audio Generation | Code Available | 7 |
| YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors | Jul 6, 2022 | 2D Object Detection, GPU | Code Available | 7 |
| Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion | Feb 21, 2022 | Binarization, Model Optimization | Code Available | 7 |
| StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation | Dec 19, 2023 | Denoising, Image Generation | Code Available | 6 |
| Distributed Inference and Fine-tuning of Large Language Models Over The Internet | Dec 13, 2023 | | Code Available | 6 |
| SGLang: Efficient Execution of Structured Language Model Programs | Dec 12, 2023 | Few-Shot Learning, Language Modeling | Code Available | 6 |
| Seamless: Multilingual Expressive and Streaming Speech Translation | Dec 8, 2023 | automatic-speech-translation, Machine Translation | Code Available | 6 |
| PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding | Dec 7, 2023 | Diffusion Personalization, Diffusion Personalization Tuning Free | Code Available | 6 |
| Mamba: Linear-Time Sequence Modeling with Selective State Spaces | Dec 1, 2023 | 2D Pose Estimation, Common Sense Reasoning | Code Available | 6 |
| RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback | Dec 1, 2023 | Hallucination, Image Captioning | Code Available | 6 |