| Fast Sparse View Guided NeRF Update for Object Reconfigurations | Mar 16, 2024 | NeRF | Code Available | 2 |
| DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving | Sep 18, 2023 | Autonomous Driving, Video Generation | Code Available | 2 |
| Reducing SO(3) Convolutions to SO(2) for Efficient Equivariant GNNs | Feb 7, 2023 | Graph Neural Network | Code Available | 2 |
| CenterSnap: Single-Shot Multi-Object 3D Shape Reconstruction and Categorical 6D Pose and Size Estimation | Mar 3, 2022 | 3D Reconstruction, 3D Shape Reconstruction | Code Available | 2 |
| Revealing Key Details to See Differences: A Novel Prototypical Perspective for Skeleton-based Action Recognition | Nov 28, 2024 | Action Recognition, Skeleton Based Action Recognition | Code Available | 2 |
| Expected Shapley-Like Scores of Boolean Functions: Complexity and Applications to Probabilistic Databases | Jan 12, 2024 | | Code Available | 2 |
| Data Cards: Purposeful and Transparent Dataset Documentation for Responsible AI | Apr 3, 2022 | | Code Available | 2 |
| Raspberry PI Shield - for measure EEG (PIEEG) | Feb 16, 2022 | Brain Computer Interface, EEG | Code Available | 2 |
| NeuRIS: Neural Reconstruction of Indoor Scenes Using Normal Priors | Jun 27, 2022 | 3D Reconstruction, Neural Rendering | Code Available | 2 |
| Large Language Models are Geographically Biased | Feb 5, 2024 | Fairness | Code Available | 2 |
| Hypergraph Isomorphism Computation | Jul 26, 2023 | Community Detection, Graph Classification | Code Available | 2 |
| Balanced Audiovisual Dataset for Imbalance Analysis | Feb 14, 2023 | | Code Available | 2 |
| GenAI Arena: An Open Evaluation Platform for Generative Models | Jun 6, 2024 | Image Generation, Instruction Following | Code Available | 2 |
| A deep learning based steganography integration framework for ad-hoc cloud computing data security augmentation using the V-BOINC system | Dec 21, 2022 | Cloud Computing | Code Available | 2 |
| Clebsch-Gordan Nets: a Fully Fourier Space Spherical Convolutional Neural Network | Jun 24, 2018 | | Code Available | 2 |
| Multi-granularity Correspondence Learning from Long-term Noisy Videos | Jan 30, 2024 | Action Segmentation, Long Video Retrieval (Background Removed) | Code Available | 2 |
| Learning Neural Parametric Head Models | Dec 6, 2022 | | Code Available | 2 |
| Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos | Jun 5, 2025 | GPU, Semantic Segmentation | Code Available | 2 |
| A Robust Online Multi-Camera People Tracking System With Geometric Consistency and State-aware Re-ID Correction | Jun 17, 2024 | Multi-Object Tracking, Multiple People Tracking | Code Available | 2 |
| Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process | Jul 29, 2024 | GSM8K, Math | Code Available | 2 |
| Dual Diffusion for Unified Image Generation and Understanding | Dec 31, 2024 | Image Generation, Language Modeling | Code Available | 2 |
| BodyGen: Advancing Towards Efficient Embodiment Co-Design | Mar 1, 2025 | | Code Available | 2 |
| BITS: Bi-level Imitation for Traffic Simulation | Aug 26, 2022 | Autonomous Vehicles, Diversity | Code Available | 2 |
| TokenUnify: Scalable Autoregressive Visual Pre-training with Mixture Token Prediction | May 27, 2024 | Mamba, Prediction | Code Available | 2 |
| Generative Photography: Scene-Consistent Camera Control for Realistic Text-to-Image Synthesis | Dec 3, 2024 | Image Generation | Code Available | 2 |
| FUTR3D: A Unified Sensor Fusion Framework for 3D Detection | Mar 20, 2022 | Autonomous Driving, Decoder | Code Available | 2 |
| UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition | Aug 7, 2023 | named-entity-recognition, Named Entity Recognition | Code Available | 2 |
| TD-MPC2: Scalable, Robust World Models for Continuous Control | Oct 25, 2023 | continuous-control, Continuous Control | Code Available | 2 |
| TextHawk2: A Large Vision-Language Model Excels in Bilingual OCR and Grounding with 16x Fewer Tokens | Oct 7, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| BUMBLE: Unifying Reasoning and Acting with Vision-Language Models for Building-wide Mobile Manipulation | Oct 8, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Phenaki: Variable Length Video Generation From Open Domain Textual Description | Oct 5, 2022 | Decoder, Video Generation | Code Available | 2 |
| 1st Place Solutions for RxR-Habitat Vision-and-Language Navigation Competition (CVPR 2022) | Jun 23, 2022 | Data Augmentation, Vision and Language Navigation | Code Available | 2 |
| Training-Free Consistent Text-to-Image Generation | Feb 5, 2024 | Diversity, Image Generation | Code Available | 2 |
| Efficient Differentiable Simulation of Articulated Bodies | Sep 16, 2021 | | Code Available | 2 |
| DiffBP: Generative Diffusion of 3D Molecules for Target Protein Binding | Nov 21, 2022 | Drug Discovery | Code Available | 2 |
| Multi-Interest Network with Dynamic Routing for Recommendation at Tmall | Apr 17, 2019 | Clustering, Information Retrieval | Code Available | 2 |
| Deconstructing Denoising Diffusion Models for Self-Supervised Learning | Jan 25, 2024 | Denoising, Image Generation | Code Available | 2 |
| UniDexGrasp: Universal Robotic Dexterous Grasping via Learning Diverse Proposal Generation and Goal-Conditioned Policy | Mar 2, 2023 | Motion Planning | Code Available | 2 |
| Fine-Grained Stochastic Architecture Search | Jun 17, 2020 | Neural Architecture Search, object-detection | Code Available | 2 |
| ScanNet++: A High-Fidelity Dataset of 3D Indoor Scenes | Aug 22, 2023 | 3D Semantic Segmentation, Novel View Synthesis | Code Available | 2 |
| Foundational Challenges in Assuring Alignment and Safety of Large Language Models | Apr 15, 2024 | | Code Available | 2 |
| Is Weakly-supervised Action Segmentation Ready For Human-Robot Interaction? No, Let's Improve It With Action-union Learning | Oct 22, 2023 | Action Recognition, Action Segmentation | Code Available | 2 |
| Understanding and Mitigating Toxicity in Image-Text Pretraining Datasets: A Case Study on LLaVA | May 9, 2025 | | Code Available | 2 |
| Parameter is Not All You Need: Starting from Non-Parametric Networks for 3D Point Cloud Analysis | Mar 14, 2023 | 3D Point Cloud Classification, All | Code Available | 2 |
| Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead | Mar 31, 2025 | Math, Spatial Reasoning | Code Available | 2 |
| VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding? | Apr 9, 2024 | Optical Character Recognition (OCR) | Code Available | 2 |
| Reducing Transformer Key-Value Cache Size with Cross-Layer Attention | May 21, 2024 | | Code Available | 2 |
| InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs | May 18, 2020 | Attribute, Face Generation | Code Available | 2 |
| Delineate Anything: Resolution-Agnostic Field Boundary Delineation on Satellite Imagery | Apr 3, 2025 | Field Boundary Delineation, Instance Segmentation | Code Available | 2 |
| Flexible Isosurface Extraction for Gradient-Based Mesh Optimization | Aug 10, 2023 | | Code Available | 2 |