| TRACE: 5D Temporal Regression of Avatars with Dynamic Cameras in 3D Environments | Jun 5, 2023 | 3D Human Pose Estimation, regression | Code Available | 3 |
| LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding Reasoning and Planning | Jan 1, 2024 | 3D dense captioning, Dense Captioning | Code Available | 3 |
| InstructIE: A Bilingual Instruction-based Information Extraction Dataset | May 19, 2023 | | Code Available | 3 |
| Reservoir History Matching of the Norne field with generative exotic priors and a coupled Mixture of Experts -- Physics Informed Neural Operator Forward Model | Jun 2, 2024 | Denoising, Mixture-of-Experts | Code Available | 3 |
| FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally | Sep 12, 2024 | | Code Available | 3 |
| Vision-based 3D occupancy prediction in autonomous driving: a review and outlook | May 4, 2024 | Autonomous Driving, Prediction | Code Available | 3 |
| Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models | Jun 8, 2023 | Question Answering, VCGBench-Diverse | Code Available | 3 |
| PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360° | Jan 1, 2023 | Image Generation, Image Segmentation | Code Available | 3 |
| Drone Data Analytics for Measuring Traffic Metrics at Intersections in High-Density Areas | Nov 4, 2024 | | Code Available | 3 |
| A Survey on Video Action Recognition in Sports: Datasets, Methods and Applications | Jun 2, 2022 | Action Recognition, Sports Analytics | Code Available | 3 |
| UrbanGPT: Spatio-Temporal Large Language Models | Feb 25, 2024 | 10-shot image generation | Code Available | 3 |
| Direct Retrieval-augmented Optimization: Synergizing Knowledge Selection and Language Models | May 5, 2025 | Policy Gradient Methods, RAG | Code Available | 3 |
| ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation | Apr 26, 2022 | 2D Human Pose Estimation, Keypoint Detection | Code Available | 3 |
| DarkIR: Robust Low-Light Image Restoration | Dec 18, 2024 | Deblurring, Image Enhancement | Code Available | 3 |
| All are Worth Words: A ViT Backbone for Diffusion Models | Sep 25, 2022 | All, Conditional Image Generation | Code Available | 3 |
| Robust Self-calibration of Focal Lengths from the Fundamental Matrix | Nov 27, 2023 | | Code Available | 3 |
| TensorNEAT: A GPU-accelerated Library for NeuroEvolution of Augmenting Topologies | Apr 11, 2025 | Computational Efficiency, GPU | Code Available | 3 |
| VoCo: A Simple-yet-Effective Volume Contrastive Learning Framework for 3D Medical Image Analysis | Feb 27, 2024 | Contrastive Learning, Medical Image Analysis | Code Available | 3 |
| Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior | Mar 24, 2023 | 3D geometry, Text to 3D | Code Available | 3 |
| Deep Learning in Single-Cell Analysis | Oct 22, 2022 | Cell Segmentation, Deep Learning | Code Available | 3 |
| AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions | Oct 27, 2024 | Feature Engineering | Code Available | 3 |
| PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images | Jun 2, 2022 | 3D Lane Detection, 3D Object Detection | Code Available | 3 |
| GenWarp: Single Image to Novel Views with Semantic-Preserving Generative Warping | May 27, 2024 | Depth Estimation, Diversity | Code Available | 3 |
| BoostTrack++: using tracklet information to detect more objects in multiple object tracking | Aug 23, 2024 | Multi-Object Tracking, Multiple Object Tracking | Code Available | 3 |
| MureObjectStitch: Multi-reference Image Composition | Nov 12, 2024 | Object | Code Available | 3 |
| Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis | May 16, 2025 | Continual Learning, Representation Learning | Code Available | 3 |
| HAT: Hybrid Attention Transformer for Image Restoration | Sep 11, 2023 | Denoising, Image Compression | Code Available | 3 |
| Cleaner Pretraining Corpus Curation with Neural Web Scraping | Feb 22, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| TSD-SR: One-Step Diffusion with Target Score Distillation for Real-World Image Super-Resolution | Nov 27, 2024 | Image Restoration, Image Super-Resolution | Code Available | 3 |
| DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps | Jun 2, 2022 | | Code Available | 3 |
| HOPE: A Reinforcement Learning-based Hybrid Policy Path Planner for Diverse Parking Scenarios | May 31, 2024 | Autonomous Driving, reinforcement-learning | Code Available | 3 |
| Cross Modal Transformer: Towards Fast and Robust 3D Object Detection | Jan 3, 2023 | 3D Object Detection, object-detection | Code Available | 3 |
| Spider: A Unified Framework for Context-dependent Concept Segmentation | May 2, 2024 | Transparent objects | Code Available | 3 |
| Generative agent-based modeling with actions grounded in physical, social, or digital space using Concordia | Dec 6, 2023 | Common Sense Reasoning | Code Available | 3 |
| MV-VTON: Multi-View Virtual Try-On with Diffusion Models | Apr 26, 2024 | Virtual Try-on | Code Available | 3 |
| MineStudio: A Streamlined Package for Minecraft AI Agent Development | Dec 24, 2024 | AI Agent, Decision Making | Code Available | 3 |
| SST: Multi-Scale Hybrid Mamba-Transformer Experts for Long-Short Range Time Series Forecasting | Apr 23, 2024 | Mamba, Time Series | Code Available | 3 |
| iDisc: Internal Discretization for Monocular Depth Estimation | Apr 13, 2023 | Autonomous Driving, Depth Estimation | Code Available | 3 |
| Segment Anything without Supervision | Jun 28, 2024 | Clustering, Image Segmentation | Code Available | 3 |
| ControlSpeech: Towards Simultaneous and Independent Zero-shot Speaker Cloning and Zero-shot Language Style Control | Jun 3, 2024 | Speech Synthesis, text-to-speech | Code Available | 3 |
| Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss | Oct 22, 2024 | GPU, Representation Learning | Code Available | 3 |
| Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling | Feb 10, 2025 | Math | Code Available | 3 |
| OASim: an Open and Adaptive Simulator based on Neural Rendering for Autonomous Driving | Feb 6, 2024 | Autonomous Driving, Neural Rendering | Code Available | 3 |
| Refusal in Language Models Is Mediated by a Single Direction | Jun 17, 2024 | Instruction Following | Code Available | 3 |
| A Survey of Embodied Learning for Object-Centric Robotic Manipulation | Aug 21, 2024 | Imitation Learning, Object | Code Available | 3 |
| Efficient Reasoning Models: A Survey | Apr 15, 2025 | Knowledge Distillation, Model Compression | Code Available | 3 |
| Retentive Network: A Successor to Transformer for Large Language Models | Jul 17, 2023 | GPU, Language Modeling | Code Available | 3 |
| GLM: General Language Model Pretraining with Autoregressive Blank Infilling | Mar 18, 2021 | Abstractive Text Summarization, Classification | Code Available | 3 |
| A Survey on Self-Supervised Learning for Non-Sequential Tabular Data | Feb 2, 2024 | Contrastive Learning, Descriptive | Code Available | 3 |
| A Comprehensive Survey on Long Context Language Modeling | Mar 20, 2025 | Language Modeling, Language Modelling | Code Available | 3 |