| GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis | Jan 31, 2023 | Face Generation, Lip Reading | Code Available | 4 |
| BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | Jan 30, 2023 | Generative Visual Question Answering, Image Captioning | Code Available | 4 |
| DepGraph: Towards Any Structural Pruning | Jan 30, 2023 | Network Pruning, Neural Network Compression | Code Available | 4 |
| ArchiSound: Audio Generation with Diffusion | Jan 30, 2023 | Audio Generation, GPU | Code Available | 4 |
| AudioLDM: Text-to-Audio Generation with Latent Diffusion Models | Jan 29, 2023 | AudioCaps, Audio Generation | Code Available | 4 |
| EvoX: A Distributed GPU-accelerated Framework for Scalable Evolutionary Computation | Jan 29, 2023 | GPU, Navigate | Code Available | 4 |
| Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion | Jan 27, 2023 | GPU, Image Generation | Code Available | 4 |
| Deep Industrial Image Anomaly Detection: A Survey | Jan 27, 2023 | Anomaly Detection, Deep Learning | Code Available | 4 |
| Open Problems in Applied Deep Learning | Jan 26, 2023 | AutoML, Deep Learning | Code Available | 4 |
| GLIGEN: Open-Set Grounded Text-to-Image Generation | Jan 17, 2023 | Conditional Text-to-Image Synthesis, Image Generation | Code Available | 4 |
| Mastering Diverse Domains through World Models | Jan 10, 2023 | Atari Games 100k, Decision Making | Code Available | 4 |
| SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot | Jan 2, 2023 | Common Sense Reasoning, Language Modelling | Code Available | 4 |
| Building a Culture of Reproducibility in Academic Research | Dec 27, 2022 | Cultural Vocal Bursts Intensity Prediction | Code Available | 4 |
| Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation | Dec 22, 2022 | Style Transfer, Text-to-Video Generation | Code Available | 4 |
| Planning-oriented Autonomous Driving | Dec 20, 2022 | Autonomous Driving, Bench2Drive | Code Available | 4 |
| Optimizing Prompts for Text-to-Image Generation | Dec 19, 2022 | Language Modeling, Language Modelling | Code Available | 4 |
| One Embedder, Any Task: Instruction-Finetuned Text Embeddings | Dec 19, 2022 | Information Retrieval, Learning Word Embeddings | Code Available | 4 |
| The case for 4-bit precision: k-bit Inference Scaling Laws | Dec 19, 2022 | Quantization | Code Available | 4 |
| Decoder Tuning: Efficient Language Understanding as Decoding | Dec 16, 2022 | Decoder, Natural Language Understanding | Code Available | 4 |
| Constitutional AI: Harmlessness from AI Feedback | Dec 15, 2022 | Decision Making | Code Available | 4 |
| RTMDet: An Empirical Study of Designing Real-Time Object Detectors | Dec 14, 2022 | GPU, Instance Segmentation | Code Available | 4 |
| TencentPretrain: A Scalable and Flexible Toolkit for Pre-training Models of Different Modalities | Dec 13, 2022 | Decoder | Code Available | 4 |
| FSID: Fully Synthetic Image Denoising via Procedural Scene Generation | Dec 7, 2022 | Denoising, Image Denoising | Code Available | 4 |
| InternVideo: General Video Foundation Models via Generative and Discriminative Learning | Dec 6, 2022 | Action Classification, Action Recognition | Code Available | 4 |
| NeRDi: Single-View NeRF Synthesis with Language-Guided Diffusion as General Image Priors | Dec 6, 2022 | 3D Generation, 3D geometry | Code Available | 4 |
| Images Speak in Images: A Generalist Painter for In-Context Visual Learning | Dec 5, 2022 | In-Context Learning, Keypoint Detection | Code Available | 4 |
| Programming Is Hard -- Or at Least It Used to Be: Educational Opportunities And Challenges of AI Code Generation | Dec 2, 2022 | Code Generation, Position | Code Available | 4 |
| Zero-Shot Image Restoration Using Denoising Diffusion Null-Space Model | Dec 1, 2022 | Colorization, compressed sensing | Code Available | 4 |
| PyTorch Adapt | Nov 28, 2022 | Domain Adaptation | Code Available | 4 |
| Recent Advances in RecBole: Extensions with more Practical Considerations | Nov 28, 2022 | | Code Available | 4 |
| DAMO-YOLO : A Report on Real-Time Object Detection Design | Nov 23, 2022 | CPU, Neural Architecture Search | Code Available | 4 |
| BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision | Nov 18, 2022 | 3D Object Detection | Code Available | 4 |
| Null-text Inversion for Editing Real Images using Guided Diffusion Models | Nov 17, 2022 | Image Generation, Text-based Image Editing | Code Available | 4 |
| Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks | Nov 17, 2022 | Decoder, Language Modelling | Code Available | 4 |
| DiffusionDet: Diffusion Model for Object Detection | Nov 17, 2022 | Denoising, model | Code Available | 4 |
| Holistic Evaluation of Language Models | Nov 16, 2022 | Fairness, Question Answering | Code Available | 4 |
| Galactica: A Large Language Model for Science | Nov 16, 2022 | Anachronisms, Bias Detection | Code Available | 4 |
| Diffusion Models for Medical Image Analysis: A Comprehensive Survey | Nov 14, 2022 | Denoising, Medical Image Analysis | Code Available | 4 |
| AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities | Nov 12, 2022 | Contrastive Learning, Cross-Modal Retrieval | Code Available | 4 |
| InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions | Nov 10, 2022 | 2D Object Detection, Classification | Code Available | 4 |
| BLOOM: A 176B-Parameter Open-Access Multilingual Language Model | Nov 9, 2022 | Decoder, Language Modeling | Code Available | 4 |
| Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small | Nov 1, 2022 | Language Modeling, Language Modelling | Code Available | 4 |
| Desiderata for next generation of ML model serving | Oct 26, 2022 | model, Position | Code Available | 4 |
| DeXtreme: Transfer of Agile In-hand Manipulation from Simulation to Reality | Oct 25, 2022 | Deep Reinforcement Learning, GPU | Code Available | 4 |
| High Fidelity Neural Audio Compression | Oct 24, 2022 | Audio Compression, Audio Signal Processing | Code Available | 4 |
| BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining | Oct 19, 2022 | Document Classification, Language Modelling | Code Available | 4 |
| Inception-Based Crowd Counting -- Being Fast while Remaining Accurate | Oct 18, 2022 | Crowd Counting | Code Available | 4 |
| Cross-Domain Aspect Extraction using Transformers Augmented with Knowledge Graphs | Oct 18, 2022 | Aspect Extraction, Knowledge Graphs | Code Available | 4 |
| Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective | Oct 16, 2022 | Coreference Resolution, Multiple-choice | Code Available | 4 |
| DyLoRA: Parameter Efficient Tuning of Pre-trained Models using Dynamic Search-Free Low-Rank Adaptation | Oct 14, 2022 | Natural Language Understanding, Text Generation | Code Available | 4 |