| Equivariant Multi-Modality Image Fusion | May 19, 2023 | Self-Supervised Learning | Code Available | 2 |
| Pengi: An Audio Language Model for Audio Tasks | May 19, 2023 | Audio captioning; Audio Question Answering | Code Available | 2 |
| Visualizing Linguistic Diversity of Text Datasets Synthesized by Large Language Models | May 19, 2023 | Benchmarking; Diversity | Code Available | 2 |
| Efficient Mixed Transformer for Single Image Super-Resolution | May 19, 2023 | Image Super-Resolution; Super-Resolution | Code Available | 2 |
| ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings | May 19, 2023 | In-Context Learning; Question Answering | Code Available | 2 |
| TSGM: A Flexible Framework for Generative Modeling of Synthetic Time Series | May 19, 2023 | Diversity; Synthetic Data Generation | Code Available | 2 |
| PointGPT: Auto-regressively Generative Pre-training from Point Clouds | May 19, 2023 | 3D Point Cloud Classification; Decoder | Code Available | 2 |
| HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models | May 19, 2023 | Hallucination; Hallucination Evaluation | Code Available | 2 |
| HalOmi: A Manually Annotated Benchmark for Multilingual Hallucination and Omission Detection in Machine Translation | May 19, 2023 | Hallucination; Machine Translation | Code Available | 2 |
| DeepEdit: Deep Editable Learning for Interactive Segmentation of 3D Medical Images | May 18, 2023 | Active Learning; Diagnostic | Code Available | 2 |
| UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild | May 18, 2023 | Image Generation | Code Available | 2 |
| Causal Document-Grounded Dialogue Pre-training | May 18, 2023 | | Code Available | 2 |
| OpenShape: Scaling Up 3D Shape Representation Towards Open-World Understanding | May 18, 2023 | 3D Classification; 3D Shape Representation | Code Available | 2 |
| Structural Pruning for Diffusion Models | May 18, 2023 | | Code Available | 2 |
| Quiver: Supporting GPUs for Low-Latency, High-Throughput GNN Serving with Workload Awareness | May 18, 2023 | CPU; GPU | Code Available | 2 |
| Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model | May 18, 2023 | Image Generation; Language Modeling | Code Available | 2 |
| Segment Any Anomaly without Training via Hybrid Prompt Regularization | May 18, 2023 | Anomaly Detection; Anomaly Localization | Code Available | 2 |
| Going Denser with Open-Vocabulary Part Segmentation | May 18, 2023 | Object; object-detection | Code Available | 2 |
| 3D Registration with Maximal Cliques | May 18, 2023 | Point Cloud Registration | Code Available | 2 |
| A Survey on Time-Series Pre-Trained Models | May 18, 2023 | Survey; Time Series | Code Available | 2 |
| Listen, Think, and Understand | May 18, 2023 | Language Modelling; Large Language Model | Code Available | 2 |
| FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention | May 17, 2023 | Denoising; Diffusion Personalization | Code Available | 2 |
| Evaluating Object Hallucination in Large Vision-Language Models | May 17, 2023 | Hallucination; Object | Code Available | 2 |
| Investigating image-based fallow weed detection performance on Raphanus sativus and Avena sativa at speeds up to 30 km h^-1 | May 17, 2023 | | Code Available | 2 |
| TextSLAM: Visual SLAM with Semantic Planar Text Features | May 17, 2023 | Mixed Reality; Object SLAM | Code Available | 2 |
| Tractable Probabilistic Graph Representation Learning with Graph-Induced Sum-Product Networks | May 17, 2023 | Graph Classification; Graph Representation Learning | Code Available | 2 |
| Rethinking the Open-Loop Evaluation of End-to-End Autonomous Driving in nuScenes | May 17, 2023 | Autonomous Driving; Trajectory Planning | Code Available | 2 |
| Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback | May 17, 2023 | In-Context Learning; Language Modeling | Code Available | 2 |
| MemoryBank: Enhancing Large Language Models with Long-Term Memory | May 17, 2023 | Chatbot | Code Available | 2 |
| DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining | May 17, 2023 | Language Modeling; Language Modelling | Code Available | 2 |
| StructGPT: A General Framework for Large Language Model to Reason over Structured Data | May 16, 2023 | Language Modeling; Language Modelling | Code Available | 2 |
| AbdomenAtlas-8K: Annotating 8,000 CT Volumes for Multi-Organ Segmentation in Three Weeks | May 16, 2023 | 8k; Active Learning | Code Available | 2 |
| ICDAR 2023 Competition on Hierarchical Text Detection and Recognition | May 16, 2023 | Text Detection | Code Available | 2 |
| CLRerNet: Improving Confidence of Lane Detection with LaneIoU | May 15, 2023 | Autonomous Driving; Lane Detection | Code Available | 2 |
| Interpretability at Scale: Identifying Causal Mechanisms in Alpaca | May 15, 2023 | | Code Available | 2 |
| Identity-Preserving Talking Face Generation with Landmark and Appearance Priors | May 15, 2023 | Face Generation; Talking Face Generation | Code Available | 2 |
| Denoising Diffusion Models for Plug-and-Play Image Restoration | May 15, 2023 | Deblurring; Denoising | Code Available | 2 |
| Large Language Models are Zero-Shot Rankers for Recommender Systems | May 15, 2023 | Recommendation Systems | Code Available | 2 |
| NIKI: Neural Inverse Kinematics with Invertible Neural Networks for 3D Human Pose and Shape Estimation | May 15, 2023 | 3D human pose and shape estimation; 3D Human Pose Estimation | Code Available | 2 |
| Common Diffusion Noise Schedules and Sample Steps are Flawed | May 15, 2023 | | Code Available | 2 |
| Large Language Model Guided Tree-of-Thought | May 15, 2023 | Language Modeling; Language Modelling | Code Available | 2 |
| Marsellus: A Heterogeneous RISC-V AI-IoT End-Node SoC with 2-to-8b DNN Acceleration and 30%-Boost Adaptive Body Biasing | May 15, 2023 | | Code Available | 2 |
| Diffusion Models for Imperceptible and Transferable Adversarial Attack | May 14, 2023 | Adversarial Attack | Code Available | 2 |
| ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding | May 14, 2023 | 3D Classification; 3D Point Cloud Classification | Code Available | 2 |
| OCRBench: On the Hidden Mystery of OCR in Large Multimodal Models | May 13, 2023 | Key Information Extraction; Nutrition | Code Available | 2 |
| Benchmarks and leaderboards for sound demixing tasks | May 12, 2023 | | Code Available | 2 |
| How to Index Item IDs for Recommendation Foundation Models | May 11, 2023 | Language Modeling; Language Modelling | Code Available | 2 |
| WebCPM: Interactive Web Search for Chinese Long-form Question Answering | May 11, 2023 | Form; Information Retrieval | Code Available | 2 |
| An Inverse Scaling Law for CLIP Training | May 11, 2023 | | Code Available | 2 |
| InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning | May 11, 2023 | 1 Image, 2*2 Stitching; Diversity | Code Available | 2 |