| SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization | Dec 20, 2022 | Dialogue Generation, Language Modeling | Code Available | 2 |
| InstantAvatar: Learning Avatars from Monocular Video in 60 Seconds | Dec 20, 2022 | | Code Available | 2 |
| A Length-Extrapolatable Transformer | Dec 20, 2022 | Language Modeling, Language Modelling | Code Available | 2 |
| Large Language Models Are Reasoning Teachers | Dec 20, 2022 | | Code Available | 2 |
| Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions | Dec 20, 2022 | Hallucination, Question Answering | Code Available | 2 |
| Reference-based Image and Video Super-Resolution via C2-Matching | Dec 19, 2022 | Image Super-Resolution, Reference-based Super-Resolution | Code Available | 2 |
| NusaCrowd: Open Source Initiative for Indonesian NLP Resources | Dec 19, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| The Decades Progress on Code-Switching Research in NLP: A Systematic Survey on Trends and Challenges | Dec 19, 2022 | | Code Available | 2 |
| MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation | Dec 19, 2022 | cross-modal alignment, Denoising | Code Available | 2 |
| Panoptic Lifting for 3D Scene Understanding with Neural Fields | Dec 19, 2022 | 2D Panoptic Segmentation, Panoptic Segmentation | Code Available | 2 |
| Fast FullSubNet: Accelerate Full-band and Sub-band Fusion Model for Single-channel Speech Enhancement | Dec 18, 2022 | Computational Efficiency, Speech Enhancement | Code Available | 2 |
| BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric | Dec 16, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| POTATO: The Portable Text Annotation Tool | Dec 16, 2022 | Active Learning, text annotation | Code Available | 2 |
| One-Stage Cascade Refinement Networks for Infrared Small Target Detection | Dec 16, 2022 | | Code Available | 2 |
| PointAvatar: Deformable Point-based Head Avatars from Videos | Dec 16, 2022 | | Code Available | 2 |
| CLIP is Also an Efficient Segmenter: A Text-Driven Approach for Weakly Supervised Semantic Segmentation | Dec 16, 2022 | Segmentation, Semantic Segmentation | Code Available | 2 |
| Hard Sample Aware Network for Contrastive Deep Graph Clustering | Dec 16, 2022 | Attribute, Clustering | Code Available | 2 |
| Generating Realistic Brain MRIs via a Conditional Diffusion Probabilistic Model | Dec 15, 2022 | Anatomy | Code Available | 2 |
| Transformers learn in-context by gradient descent | Dec 15, 2022 | In-Context Learning, Meta-Learning | Code Available | 2 |
| MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation | Dec 15, 2022 | Face Swapping, Meta-Learning | Code Available | 2 |
| Machine Learning Coarse-Grained Potentials of Protein Thermodynamics | Dec 14, 2022 | | Code Available | 2 |
| Diffusion Probabilistic Models beat GANs on Medical Images | Dec 14, 2022 | Denoising, Diversity | Code Available | 2 |
| SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning | Dec 14, 2022 | Multi-agent Reinforcement Learning, reinforcement-learning | Code Available | 2 |
| 3DHumanGAN: 3D-Aware Human Image Generation with 3D Pose Mapping | Dec 14, 2022 | Generative Adversarial Network, Image Generation | Code Available | 2 |
| NoPe-NeRF: Optimising Neural Radiance Field with No Pose Prior | Dec 14, 2022 | NeRF, Pose Estimation | Code Available | 2 |
| Trust, but Verify: Cross-Modality Fusion for HD Map Change Detection | Dec 14, 2022 | Change Detection | Code Available | 2 |
| MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare | Dec 13, 2022 | 3D Object Detection, 6D Pose Estimation | Code Available | 2 |
| Foresight -- Generative Pretrained Transformer (GPT) for Modelling of Patient Timelines using EHRs | Dec 13, 2022 | named-entity-recognition, Named Entity Recognition | Code Available | 2 |
| Neural Cloth Simulation | Dec 13, 2022 | Physical Simulations | Code Available | 2 |
| Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders | Dec 13, 2022 | 3D Point Cloud Classification, 3D Point Cloud Linear Classification | Code Available | 2 |
| CLIP Itself is a Strong Fine-tuner: Achieving 85.7% and 88.0% Top-1 Accuracy with ViT-B and ViT-L on ImageNet | Dec 12, 2022 | | Code Available | 2 |
| The Stable Artist: Steering Semantics in Diffusion Latent Space | Dec 12, 2022 | Image Generation | Code Available | 2 |
| NMS Strikes Back | Dec 12, 2022 | Attribute, object-detection | Code Available | 2 |
| PyPop7: A Pure-Python Library for Population-Based Black-Box Optimization | Dec 12, 2022 | Benchmarking, Evolutionary Algorithms | Code Available | 2 |
| Recurrent Vision Transformers for Object Detection with Event Cameras | Dec 11, 2022 | Event-based vision, GPU | Code Available | 2 |
| SchNetPack 2.0: A neural network toolbox for atomistic machine learning | Dec 11, 2022 | | Code Available | 2 |
| NeuS2: Fast Learning of Neural Implicit Surfaces for Multi-view Reconstruction | Dec 10, 2022 | Surface Reconstruction | Code Available | 2 |
| ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding | Dec 10, 2022 | 3D Architecture, 3D Classification | Code Available | 2 |
| MAGVIT: Masked Generative Video Transformer | Dec 10, 2022 | Multi-Task Learning, Text-to-Video Generation | Code Available | 2 |
| Joint Spatio-Temporal Modeling for the Semantic Change Detection in Remote Sensing Images | Dec 10, 2022 | Change Detection | Code Available | 2 |
| Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis | Dec 9, 2022 | Attribute, Image Generation | Code Available | 2 |
| Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints | Dec 9, 2022 | Mixture-of-Experts | Code Available | 2 |
| 4K-NeRF: High Fidelity Neural Radiance Fields at Ultra High Resolutions | Dec 9, 2022 | 4k, Decoder | Code Available | 2 |
| SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation | Dec 8, 2022 | 3D Reconstruction, 3D Shape Generation | Code Available | 2 |
| Learning Video Representations from Large Language Models | Dec 8, 2022 | Action Classification, Action Recognition | Code Available | 2 |
| BEVBert: Multimodal Map Pre-training for Language-guided Navigation | Dec 8, 2022 | Vision and Language Navigation, Visual Navigation | Code Available | 2 |
| UNETR++: Delving into Efficient and Accurate 3D Medical Image Segmentation | Dec 8, 2022 | Image Segmentation, Medical Image Segmentation | Code Available | 2 |
| Editing Models with Task Arithmetic | Dec 8, 2022 | Negation, Task Arithmetic | Code Available | 2 |
| LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models | Dec 8, 2022 | | Code Available | 2 |
| Deep Architectures for Content Moderation and Movie Content Rating | Dec 8, 2022 | Action Recognition, Genre classification | Code Available | 2 |