| End-to-End Modeling Hierarchical Time Series Using Autoregressive Transformer and Conditional Normalizing Flow based Reconciliation | Dec 28, 2022 | Multivariate Time Series Forecasting, Time Series | Code Available | 2 | 5 |
| H-CoT: Hijacking the Chain-of-Thought Safety Reasoning Mechanism to Jailbreak Large Reasoning Models, Including OpenAI o1/o3, DeepSeek-R1, and Gemini 2.0 Flash Thinking | Feb 18, 2025 | | Code Available | 2 | 5 |
| CGI-Stereo: Accurate and Real-Time Stereo Matching via Context and Geometry Interaction | Jan 7, 2023 | Stereo Matching | Code Available | 2 | 5 |
| Generative Time Series Forecasting with Diffusion, Denoise, and Disentanglement | Jan 8, 2023 | Denoising, Disentanglement | Code Available | 2 | 5 |
| Controllable and Reliable Knowledge-Intensive Task-Oriented Conversational Agents with Declarative Genie Worksheets | Jul 8, 2024 | Hallucination, Navigate | Code Available | 2 | 5 |
| FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models | May 8, 2024 | 3D Human Pose Estimation, Denoising | Code Available | 2 | 5 |
| On Evaluating Adversarial Robustness of Large Vision-Language Models | May 26, 2023 | Adversarial Robustness, multimodal generation | Code Available | 2 | 5 |
| Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator | Mar 3, 2025 | Image Generation | Code Available | 2 | 5 |
| BackdoorBox: A Python Toolbox for Backdoor Learning | Feb 1, 2023 | | Code Available | 2 | 5 |
| Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Computer Vision Tasks | Oct 30, 2023 | Benchmarking, object-detection | Code Available | 2 | 5 |
| Raising the Cost of Malicious AI-Powered Image Editing | Feb 13, 2023 | | Code Available | 2 | 5 |
| YOWOv2: A Stronger yet Efficient Multi-level Detection Framework for Real-time Spatio-temporal Action Detection | Feb 14, 2023 | Action Detection | Code Available | 2 | 5 |
| Delivering Arbitrary-Modal Semantic Segmentation | Mar 2, 2023 | Segmentation, Semantic Segmentation | Code Available | 2 | 5 |
| ACR: Attention Collaboration-based Regressor for Arbitrary Two-Hand Reconstruction | Mar 10, 2023 | 3D Interacting Hand Pose Estimation, 3D Reconstruction | Code Available | 2 | 5 |
| General Place Recognition Survey: Towards Real-World Autonomy | May 8, 2024 | Simultaneous Localization and Mapping, Survey | Code Available | 2 | 5 |
| DeltaEdit: Exploring Text-free Training for Text-Driven Image Manipulation | Mar 11, 2023 | Image Manipulation | Code Available | 2 | 5 |
| LayoutDM: Discrete Diffusion Model for Controllable Layout Generation | Mar 14, 2023 | Layout Generation, model | Code Available | 2 | 5 |
| DiffBEV: Conditional Diffusion Model for Bird's Eye View Perception | Mar 15, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 2 | 5 |
| CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation | Mar 21, 2023 | Image Segmentation, Open Vocabulary Semantic Segmentation | Code Available | 2 | 5 |
| SHERF: Generalizable Human NeRF from a Single Image | Mar 22, 2023 | 3D Human Reconstruction, NeRF | Code Available | 2 | 5 |
| NOPE: Novel Object Pose Estimation from a Single Image | Mar 23, 2023 | Object, Pose Estimation | Code Available | 2 | 5 |
| MDTv2: Masked Diffusion Transformer is a Strong Image Synthesizer | Mar 25, 2023 | Image Generation | Code Available | 2 | 5 |
| SynBody: Synthetic Dataset with Layered Human Models for 3D Human Perception and Modeling | Mar 30, 2023 | Diversity, Human Mesh Recovery | Code Available | 2 | 5 |
| On the Benefits of 3D Pose and Tracking for Human Action Recognition | Apr 3, 2023 | Action Recognition, Temporal Action Localization | Code Available | 2 | 5 |
| Joint 2D-3D Multi-Task Learning on Cityscapes-3D: 3D Detection, Segmentation, and Depth Estimation | Apr 3, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 2 | 5 |
| Robust Multiview Point Cloud Registration with Reliable Pose Graph Initialization and History Reweighting | Apr 2, 2023 | Point Cloud Registration | Code Available | 2 | 5 |
| Detecting and Grounding Multi-Modal Media Manipulation | Apr 5, 2023 | Binary Classification, Contrastive Learning | Code Available | 2 | 5 |
| Large Language Models Post-training: Surveying Techniques from Alignment to Reasoning | Mar 8, 2025 | Survey | Code Available | 2 | 5 |
| Automatic Gradient Descent: Deep Learning without Hyperparameters | Apr 11, 2023 | Deep Learning, Second-order methods | Code Available | 2 | 5 |
| Diffusion Recommender Model | Apr 11, 2023 | Denoising, Image Generation | Code Available | 2 | 5 |
| RoboBEV: Towards Robust Bird's Eye View Perception under Corruptions | Apr 13, 2023 | Robust Camera Only 3D Object Detection | Code Available | 2 | 5 |
| Heterogeneous-Agent Reinforcement Learning | Apr 19, 2023 | LEMMA, Multi-agent Reinforcement Learning | Code Available | 2 | 5 |
| Tetra-NeRF: Representing Neural Radiance Fields Using Tetrahedra | Apr 19, 2023 | 3D geometry, 3D Reconstruction | Code Available | 2 | 5 |
| SILVR: Guided Diffusion for Molecule Generation | Apr 21, 2023 | Drug Design | Code Available | 2 | 5 |
| JaxPruner: A concise library for sparsity research | Apr 27, 2023 | | Code Available | 2 | 5 |
| NeuralKG-ind: A Python Library for Inductive Knowledge Graph Representation Learning | Apr 28, 2023 | Graph Representation Learning, Knowledge Graphs | Code Available | 2 | 5 |
| Huatuo-26M, a Large-scale Chinese Medical QA Dataset | May 2, 2023 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| MAMCA -- Optimal on Accuracy and Efficiency for Automatic Modulation Classification with Extended Signal Length | May 18, 2024 | Denoising, GPU | Code Available | 2 | 5 |
| TART: An Open-Source Tool-Augmented Framework for Explainable Table-based Reasoning | Sep 18, 2024 | Fact Verification, Question Answering | Code Available | 2 | 5 |
| A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training | May 27, 2024 | | Code Available | 2 | 5 |
| DeepEdit: Deep Editable Learning for Interactive Segmentation of 3D Medical Images | May 18, 2023 | Active Learning, Diagnostic | Code Available | 2 | 5 |
| Causal Document-Grounded Dialogue Pre-training | May 18, 2023 | | Code Available | 2 | 5 |
| Variational Learning is Effective for Large Deep Networks | Feb 27, 2024 | Sensitivity | Code Available | 2 | 5 |
| Enhancing Detail Preservation for Customized Text-to-Image Generation: A Regularization-Free Approach | May 23, 2023 | GPU, Image Generation | Code Available | 2 | 5 |
| NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models | May 26, 2023 | Instruction Following, Vision and Language Navigation | Code Available | 2 | 5 |
| Medical Diffusion: Denoising Diffusion Probabilistic Models for 3D Medical Image Generation | Nov 7, 2022 | Computed Tomography (CT), Denoising | Code Available | 2 | 5 |
| Wuerstchen: An Efficient Architecture for Large-Scale Text-to-Image Diffusion Models | Jun 1, 2023 | GPU, Image Compression | Code Available | 2 | 5 |
| LibAUC: A Deep Learning Library for X-Risk Optimization | Jun 5, 2023 | Benchmarking, Classification | Code Available | 2 | 5 |
| STAR Loss: Reducing Semantic Ambiguity in Facial Landmark Detection | Jun 5, 2023 | Face Alignment, Facial Landmark Detection | Code Available | 2 | 5 |
| Estimating heterogeneous treatment effects with right-censored data via causal survival forests | Jan 27, 2020 | | Code Available | 2 | 5 |