| Retrieval-Augmented Generation for Large Language Models: A Survey | Dec 18, 2023 | Hallucination; RAG | Code Available | 4 |
| Catastrophic Forgetting in Deep Learning: A Comprehensive Taxonomy | Dec 16, 2023 | Deep Learning; image-classification | Code Available | 4 |
| Osprey: Pixel Understanding with Visual Instruction Tuning | Dec 15, 2023 | Language Modelling | Code Available | 4 |
| TigerBot: An Open Multilingual Multitask LLM | Dec 14, 2023 | | Code Available | 4 |
| FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects | Dec 13, 2023 | 3D Object Detection; 3D Object Tracking | Code Available | 4 |
| VILA: On Pre-training for Visual Language Models | Dec 12, 2023 | In-Context Learning; Language Modelling | Code Available | 4 |
| Gaussian Splatting SLAM | Dec 11, 2023 | 3DGS; 3D Reconstruction | Code Available | 4 |
| PFLlib: A Beginner-Friendly and Comprehensive Personalized Federated Learning Library and Benchmark | Dec 8, 2023 | Federated Learning; Personalized Federated Learning | Code Available | 4 |
| Pearl: A Production-ready Reinforcement Learning Agent | Dec 6, 2023 | Benchmarking; reinforcement-learning | Code Available | 4 |
| Magicoder: Empowering Code Generation with OSS-Instruct | Dec 4, 2023 | Code Generation; HumanEval | Code Available | 4 |
| SplaTAM: Splat, Track & Map 3D Gaussians for Dense RGB-D SLAM | Dec 4, 2023 | Camera Pose Estimation; Novel View Synthesis | Code Available | 4 |
| Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation | Dec 4, 2023 | Depth Estimation; GPU | Code Available | 4 |
| Mathematical Supplement for the gsplat Library | Dec 4, 2023 | | Code Available | 4 |
| EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything | Dec 1, 2023 | Decoder; image-classification | Code Available | 4 |
| RETSim: Resilient and Efficient Text Similarity | Nov 28, 2023 | Adversarial Text; Clustering | Code Available | 4 |
| Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation | Nov 28, 2023 | | Code Available | 4 |
| FLASC: A Flare-Sensitive Clustering Algorithm | Nov 27, 2023 | Clustering | Code Available | 4 |
| SeeSR: Towards Semantics-Aware Real-World Image Super-Resolution | Nov 27, 2023 | Image Super-Resolution; Super-Resolution | Code Available | 4 |
| MEDITRON-70B: Scaling Medical Pretraining for Large Language Models | Nov 27, 2023 | Articles; Conditional Text Generation | Code Available | 4 |
| Eliminating Domain Bias for Federated Learning in Representation Space | Nov 25, 2023 | Federated Learning; Privacy Preserving | Code Available | 4 |
| DemoFusion: Democratising High-Resolution Image Generation With No $ | Nov 24, 2023 | Image Generation | Code Available | 4 |
| Visual In-Context Prompting | Nov 22, 2023 | Decoder; Segmentation | Code Available | 4 |
| SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering | Nov 21, 2023 | | Code Available | 4 |
| Unmasking and Improving Data Credibility: A Study with Datasets for Training Harmless Language Models | Nov 19, 2023 | Language Modeling; Language Modelling | Code Available | 4 |
| Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning | Nov 18, 2023 | Transfer Learning | Code Available | 4 |
| Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2 | Nov 17, 2023 | | Code Available | 4 |
| Video-LLaVA: Learning United Visual Representation by Alignment Before Projection | Nov 16, 2023 | Language Modeling; Language Modelling | Code Available | 4 |
| Unifying the Perspectives of NLP and Software Engineering: A Survey on Language Models for Code | Nov 14, 2023 | Language Model Evaluation; Language Modeling | Code Available | 4 |
| SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models | Nov 13, 2023 | Described Object Detection; Language Modeling | Code Available | 4 |
| LCM-LoRA: A Universal Stable-Diffusion Acceleration Module | Nov 9, 2023 | GPU; Image Generation | Code Available | 4 |
| Leveraging Speculative Sampling and KV-Cache Optimizations Together for Generative AI using OpenVINO | Nov 8, 2023 | Quantization; Text Generation | Code Available | 4 |
| mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration | Nov 7, 2023 | 1 Image, 2*2 Stitching; Decoder | Code Available | 4 |
| Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves | Nov 7, 2023 | | Code Available | 4 |
| OtterHD: A High-Resolution Multi-modality Model | Nov 7, 2023 | model; Visual Question Answering | Code Available | 4 |
| AnyText: Multilingual Visual Text Generation And Editing | Nov 6, 2023 | Image Generation; Optical Character Recognition (OCR) | Code Available | 4 |
| Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling | Nov 1, 2023 | Hallucination; Knowledge Distillation | Code Available | 4 |
| LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing | Nov 1, 2023 | All; Image Generation | Code Available | 4 |
| TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch | Oct 27, 2023 | Self-Supervised Learning; Speech Enhancement | Code Available | 4 |
| DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior | Oct 25, 2023 | 3D Generation | Code Available | 4 |
| Ignore This Title and HackAPrompt: Exposing Systemic Vulnerabilities of LLMs through a Global Scale Prompt Hacking Competition | Oct 24, 2023 | | Code Available | 4 |
| Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model | Oct 23, 2023 | | Code Available | 4 |
| Open-Set Image Tagging with Multi-Grained Text Supervision | Oct 23, 2023 | Human-Object Interaction Detection; Open Set Learning | Code Available | 4 |
| Habitat 3.0: A Co-Habitat for Humans, Avatars and Robots | Oct 19, 2023 | Social Navigation | Code Available | 4 |
| Eureka: Human-Level Reward Design via Coding Large Language Models | Oct 19, 2023 | Decision Making; In-Context Learning | Code Available | 4 |
| DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors | Oct 18, 2023 | Image Animation | Code Available | 4 |
| A General Theoretical Paradigm to Understand Learning from Human Preferences | Oct 18, 2023 | | Code Available | 4 |
| Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection | Oct 17, 2023 | Fact Verification; Question Answering | Code Available | 4 |
| Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V | Oct 17, 2023 | Interactive Segmentation; Referring Expression | Code Available | 4 |
| OpenAgents: An Open Platform for Language Agents in the Wild | Oct 16, 2023 | 2D Object Detection | Code Available | 4 |
| A Survey on Video Diffusion Models | Oct 16, 2023 | Image Generation; Survey | Code Available | 4 |