| InstructUIE: Multi-task Instruction Tuning for Unified Information Extraction | Apr 17, 2023 | Zero-shot Named Entity Recognition (NER) | Code Available | 2 |
| VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset | Apr 17, 2023 | Audio captioning, Audio-Video Question Answering (AVQA) | Code Available | 2 |
| ChatPLUG: Open-Domain Generative Dialogue System with Internet-Augmented Instruction Tuning for Digital Human | Apr 16, 2023 | World Knowledge | Code Available | 2 |
| Learning Semantic-Aware Knowledge Guidance for Low-Light Image Enhancement | Apr 14, 2023 | Image Enhancement, Low-Light Image Enhancement | Code Available | 2 |
| Multimodal C4: An Open, Billion-scale Corpus of Images Interleaved with Text | Apr 14, 2023 | Few-Shot Learning | Code Available | 2 |
| Very high resolution canopy height maps from RGB imagery using self-supervised vision transformer and convolutional decoder trained on Aerial Lidar | Apr 14, 2023 | Decoder, Self-Supervised Learning | Code Available | 2 |
| Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding | Apr 14, 2023 | 3D Object Detection, Scene Understanding | Code Available | 2 |
| Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction | Apr 13, 2023 | 3D-Aware Image Synthesis, 3D Generation | Code Available | 2 |
| AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models | Apr 13, 2023 | Decision Making, Math | Code Available | 2 |
| RoboBEV: Towards Robust Bird's Eye View Perception under Corruptions | Apr 13, 2023 | Robust Camera Only 3D Object Detection | Code Available | 2 |
| Zip-NeRF: Anti-Aliased Grid-Based Neural Radiance Fields | Apr 13, 2023 | NeRF, Novel View Synthesis | Code Available | 2 |
| Expressive Text-to-Image Generation with Rich Text | Apr 13, 2023 | Image Generation, Text Generation | Code Available | 2 |
| DiffusionRig: Learning Personalized Priors for Facial Appearance Editing | Apr 13, 2023 | | Code Available | 2 |
| Unifying and Personalizing Weakly-supervised Federated Medical Image Segmentation via Adaptive Representation and Aggregation | Apr 12, 2023 | channel selection, Federated Learning | Code Available | 2 |
| An Edit Friendly DDPM Noise Space: Inversion and Manipulations | Apr 12, 2023 | Denoising | Code Available | 2 |
| Unicom: Universal and Compact Representation Learning for Image Retrieval | Apr 12, 2023 | Image Classification, Image Retrieval | Code Available | 2 |
| SiLK -- Simple Learned Keypoints | Apr 12, 2023 | 3D Reconstruction, Camera Pose Estimation | Code Available | 2 |
| SAMM (Segment Any Medical Model): A 3D Slicer Integration to SAM | Apr 12, 2023 | Image Segmentation, Segmentation | Code Available | 2 |
| UniverSeg: Universal Medical Image Segmentation | Apr 12, 2023 | Image Segmentation, Medical Image Segmentation | Code Available | 2 |
| Measuring Re-identification Risk | Apr 12, 2023 | | Code Available | 2 |
| InterGen: Diffusion-based Multi-human Motion Generation under Complex Interactions | Apr 12, 2023 | Denoising, Motion Generation | Code Available | 2 |
| AutoShot: A Short Video Dataset and State-of-the-Art Shot Boundary Detection | Apr 12, 2023 | Boundary Detection, Neural Architecture Search | Code Available | 2 |
| Graph-based Topology Reasoning for Driving Scenes | Apr 11, 2023 | 3D Lane Detection, Autonomous Driving | Code Available | 2 |
| ChemCrow: Augmenting large-language models with chemistry tools | Apr 11, 2023 | Computational chemistry, Drug Discovery | Code Available | 2 |
| RRHF: Rank Responses to Align Language Models with Human Feedback without tears | Apr 11, 2023 | Language Modelling, Large Language Model | Code Available | 2 |
| Diffusion Recommender Model | Apr 11, 2023 | Denoising, Image Generation | Code Available | 2 |
| OccFormer: Dual-path Transformer for Vision-based 3D Semantic Occupancy Prediction | Apr 11, 2023 | 3D Semantic Occupancy Prediction, 3D Semantic Scene Completion | Code Available | 2 |
| SportsMOT: A Large Multi-Object Tracking Dataset in Multiple Sports Scenes | Apr 11, 2023 | Multi-Object Tracking, Multiple Object Tracking | Code Available | 2 |
| Automatic Gradient Descent: Deep Learning without Hyperparameters | Apr 11, 2023 | Deep Learning, Second-order methods | Code Available | 2 |
| Detection Transformer with Stable Matching | Apr 10, 2023 | Decoder, Position | Code Available | 2 |
| GhostFaceNets: Lightweight Face Recognition Model From Cheap Operations | Apr 10, 2023 | Face Identification, Face Recognition | Code Available | 2 |
| SoccerNet-Caption: Dense Video Captioning for Soccer Broadcasts Commentaries | Apr 10, 2023 | Dense Video Captioning, Video Captioning | Code Available | 2 |
| Deep Image Matting: A Comprehensive Survey | Apr 10, 2023 | Image Matting, Referring Image Matting | Code Available | 2 |
| CherryPicker: Semantic Skeletonization and Topological Reconstruction of Cherry Trees | Apr 10, 2023 | Monocular Reconstruction, Plant Phenotyping | Code Available | 2 |
| Ambiguous Medical Image Segmentation using Diffusion Models | Apr 10, 2023 | Diagnostic, Diversity | Code Available | 2 |
| Graph-ToolFormer: To Empower LLMs with Graph Reasoning Ability via Prompt Augmented by ChatGPT | Apr 10, 2023 | Graph Learning, Knowledge Graphs | Code Available | 2 |
| Graph-ToolFormer: To Empower LLMs with Graph Reasoning Ability via Prompt Dataset Augmented by ChatGPT | Apr 10, 2023 | Community Detection, Graph Classification | Code Available | 2 |
| Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition | Apr 10, 2023 | image-classification, Image Classification | Code Available | 2 |
| GraphMAE2: A Decoding-Enhanced Masked Self-Supervised Graph Learner | Apr 10, 2023 | Self-Supervised Learning | Code Available | 2 |
| Point-SLAM: Dense Neural Point Cloud-based SLAM | Apr 9, 2023 | Simultaneous Localization and Mapping | Code Available | 2 |
| Slideflow: Deep Learning for Digital Histopathology with Real-Time Whole-Slide Visualization | Apr 9, 2023 | Deep Learning, Histopathological Image Classification | Code Available | 2 |
| RoboPianist: Dexterous Piano Playing with Deep Reinforcement Learning | Apr 9, 2023 | Benchmarking, Deep Reinforcement Learning | Code Available | 2 |
| Video ChatCaptioner: Towards Enriched Spatiotemporal Descriptions | Apr 9, 2023 | Video Captioning | Code Available | 2 |
| HumanSD: A Native Skeleton-Guided Diffusion Model for Human Image Generation | Apr 9, 2023 | Denoising, Image Generation | Code Available | 2 |
| EMP-SSL: Towards Self-Supervised Learning in One Training Epoch | Apr 8, 2023 | Quantization, Self-Supervised Learning | Code Available | 2 |
| DiffDock-PP: Rigid Protein-Protein Docking with Diffusion Models | Apr 8, 2023 | Drug Discovery, Protein Design | Code Available | 2 |
| RIDCP: Revitalizing Real Image Dehazing via High-Quality Codebook Priors | Apr 8, 2023 | Image Dehazing, Vocal Bursts Intensity Prediction | Code Available | 2 |
| Discovering Attention-Based Genetic Algorithms via Meta-Black-Box Optimization | Apr 8, 2023 | | Code Available | 2 |
| Similarity search in the blink of an eye with compressed indices | Apr 7, 2023 | Quantization | Code Available | 2 |
| ALIKED: A Lighter Keypoint and Descriptor Extraction Network via Deformable Transformation | Apr 7, 2023 | 3D Reconstruction, Homography Estimation | Code Available | 2 |