| Unified Data Management and Comprehensive Performance Evaluation for Urban Spatial-Temporal Prediction [Experiment, Analysis & Benchmark] | Aug 24, 2023 | Management, Prediction | Code Available | 3 |
| Towards CausalGPT: A Multi-Agent Approach for Faithful Knowledge Reasoning via Promoting Causal Consistency in LLMs | Aug 23, 2023 | counterfactual, Question Answering | Code Available | 3 |
| StableVideo: Text-driven Consistency-aware Diffusion Video Editing | Aug 18, 2023 | Video Editing | Code Available | 3 |
| OctoPack: Instruction Tuning Code Large Language Models | Aug 14, 2023 | Code Generation, Code Repair | Code Available | 3 |
| EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models | Aug 14, 2023 | knowledge editing | Code Available | 3 |
| ModelScope Text-to-Video Technical Report | Aug 12, 2023 | Denoising, Image Generation | Code Available | 3 |
| MapTRv2: An End-to-End Framework for Online Vectorized HD Map Construction | Aug 10, 2023 | Autonomous Driving, Online Vectorized HD Map Construction | Code Available | 3 |
| Separate Anything You Describe | Aug 9, 2023 | Audio Source Separation, Natural Language Queries | Code Available | 3 |
| On the use of deep learning for phase recovery | Aug 2, 2023 | Deep Learning | Code Available | 3 |
| Causal-learn: Causal Discovery in Python | Jul 31, 2023 | Causal Discovery | Code Available | 3 |
| Evaluating Large Language Models for Radiology Natural Language Processing | Jul 25, 2023 | | Code Available | 3 |
| WebArena: A Realistic Web Environment for Building Autonomous Agents | Jul 25, 2023 | | Code Available | 3 |
| 3D-LLM: Injecting the 3D World into Large Language Models | Jul 24, 2023 | 3D Object Captioning, 3D Question Answering (3D-QA) | Code Available | 3 |
| ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting | Jul 23, 2023 | Image Super-Resolution, Super-Resolution | Code Available | 3 |
| Meta-Transformer: A Unified Framework for Multimodal Learning | Jul 20, 2023 | Time Series | Code Available | 3 |
| TokenFlow: Consistent Diffusion Features for Consistent Video Editing | Jul 19, 2023 | Video Editing | Code Available | 3 |
| Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis | Jul 18, 2023 | NeRF | Code Available | 3 |
| RepViT: Revisiting Mobile CNN From ViT Perspective | Jul 18, 2023 | | Code Available | 3 |
| Retentive Network: A Successor to Transformer for Large Language Models | Jul 17, 2023 | GPU, Language Modeling | Code Available | 3 |
| Secrets of RLHF in Large Language Models Part I: PPO | Jul 11, 2023 | | Code Available | 3 |
| Objaverse-XL: A Universe of 10M+ 3D Objects | Jul 11, 2023 | Diversity, Novel View Synthesis | Code Available | 3 |
| Emu: Generative Pretraining in Multimodality | Jul 11, 2023 | Image Captioning, Image Generation | Code Available | 3 |
| SVIT: Scaling up Visual Instruction Tuning | Jul 9, 2023 | Diversity, Image Captioning | Code Available | 3 |
| Focused Transformer: Contrastive Training for Context Scaling | Jul 6, 2023 | Contrastive Learning | Code Available | 3 |
| A Survey on Evaluation of Large Language Models | Jul 6, 2023 | Ethics, Survey | Code Available | 3 |
| OpenDelta: A Plug-and-play Library for Parameter-efficient Adaptation of Pre-trained Models | Jul 5, 2023 | | Code Available | 3 |
| DeepfakeBench: A Comprehensive Benchmark of Deepfake Detection | Jul 4, 2023 | DeepFake Detection, Face Swapping | Code Available | 3 |
| Segment Anything Meets Point Tracking | Jul 3, 2023 | Interactive Video Object Segmentation, Object | Code Available | 3 |
| CausalVLR: A Toolbox and Benchmark for Visual-Linguistic Causal Reasoning | Jun 30, 2023 | Causal Inference, Medical Report Generation | Code Available | 3 |
| Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors | Jun 30, 2023 | Image to 3D | Code Available | 3 |
| DisCo: Disentangled Control for Realistic Human Dance Generation | Jun 30, 2023 | Attribute | Code Available | 3 |
| One-2-3-45: Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization | Jun 29, 2023 | 3D Reconstruction, Image to 3D | Code Available | 3 |
| DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing | Jun 26, 2023 | | Code Available | 3 |
| MotionGPT: Human Motion as a Foreign Language | Jun 26, 2023 | Language Modeling, Language Modelling | Code Available | 3 |
| ViNT: A Foundation Model for Visual Navigation | Jun 26, 2023 | model, Visual Navigation | Code Available | 3 |
| Improving visual image reconstruction from human brain activity using latent diffusion models via multiple decoded inputs | Jun 20, 2023 | Deep Learning, Image Reconstruction | Code Available | 3 |
| Opportunities and Risks of LLMs for Scalable Deliberation with Polis | Jun 20, 2023 | | Code Available | 3 |
| GlyphNet: Homoglyph domains dataset and detection using attention-based Convolutional Neural Networks | Jun 17, 2023 | Binary Classification | Code Available | 3 |
| Macaw-LLM: Multi-Modal Language Modeling with Image, Audio, Video, and Text Integration | Jun 15, 2023 | Language Modeling, Language Modelling | Code Available | 3 |
| TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement | Jun 14, 2023 | GPU, Motion Estimation | Code Available | 3 |
| WebGLM: Towards An Efficient Web-Enhanced Question Answering System with Human Preferences | Jun 13, 2023 | Language Modeling, Language Modelling | Code Available | 3 |
| Data-Copilot: Bridging Billions of Data and Humans with Autonomous Workflow | Jun 12, 2023 | | Code Available | 3 |
| High-Fidelity Audio Compression with Improved RVQGAN | Jun 11, 2023 | Audio Compression, Audio Generation | Code Available | 3 |
| Interpretable Differencing of Machine Learning Models | Jun 10, 2023 | Classification | Code Available | 3 |
| How Can Recommender Systems Benefit from Large Language Models: A Survey | Jun 9, 2023 | Ethics, Feature Engineering | Code Available | 3 |
| Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models | Jun 8, 2023 | Question Answering, VCGBench-Diverse | Code Available | 3 |
| Designing a Better Asymmetric VQGAN for StableDiffusion | Jun 7, 2023 | Decoder, Image Generation | Code Available | 3 |
| SAM3D: Segment Anything in 3D Scenes | Jun 6, 2023 | Segmentation | Code Available | 3 |
| LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning | Jun 5, 2023 | Benchmarking | Code Available | 3 |
| TRACE: 5D Temporal Regression of Avatars with Dynamic Cameras in 3D Environments | Jun 5, 2023 | 3D Human Pose Estimation, regression | Code Available | 3 |