| Excess Mass Estimates and Tests for Multimodality | Sep 1, 1991 | | Code Available | 2 |
| Recommender Systems with Generative Retrieval | May 8, 2023 | Recommendation Systems, Retrieval | Code Available | 2 |
| BatchFormerV2: Exploring Sample Relationships for Dense Representation Learning | Apr 4, 2022 | image-classification, Image Classification | Code Available | 2 |
| CausalVAE: Structured Causal Disentanglement in Variational Autoencoder | Apr 18, 2020 | counterfactual, Disentanglement | Code Available | 2 |
| Euclidean, Projective, Conformal: Choosing a Geometric Algebra for Equivariant Transformers | Nov 8, 2023 | | Code Available | 2 |
| Some things are more CRINGE than others: Iterative Preference Optimization with the Pairwise Cringe Loss | Dec 27, 2023 | | Code Available | 2 |
| Depth Field Networks for Generalizable Multi-view Scene Representation | Jul 28, 2022 | Data Augmentation, Depth Estimation | Code Available | 2 |
| Urban Architect: Steerable 3D Urban Scene Generation with Layout Prior | Apr 10, 2024 | 3D Generation, Model Optimization | Code Available | 2 |
| Diffsound: Discrete Diffusion Model for Text-to-sound Generation | Jul 20, 2022 | Audio Generation, Decoder | Code Available | 2 |
| BitNet: Scaling 1-bit Transformers for Large Language Models | Oct 17, 2023 | Language Modeling, Language Modelling | Code Available | 2 |
| Utilizing Image Transforms and Diffusion Models for Generative Modeling of Short and Long Time Series | Oct 25, 2024 | State Space Models, Time Series | Code Available | 2 |
| StableSemantics: A Synthetic Language-Vision Dataset of Semantic Representations in Naturalistic Images | Jun 19, 2024 | Object Recognition, Scene Understanding | Code Available | 2 |
| PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection | Oct 10, 2024 | object-detection, Object Detection | Code Available | 2 |
| Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation | Mar 25, 2024 | Denoising, Image Generation | Code Available | 2 |
| ILLUME+: Illuminating Unified MLLM with Dual Visual Tokenization and Diffusion Refinement | Apr 2, 2025 | Decoder, Image Generation | Code Available | 2 |
| Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs | Mar 31, 2025 | Large Language Model, Video Chaptering | Code Available | 2 |
| eRST: A Signaled Graph Theory of Discourse Relations and Organization | Mar 20, 2024 | | Code Available | 2 |
| Self-Prompting Analogical Reasoning for UAV Object Detection | Apr 11, 2025 | graph construction, object-detection | Code Available | 2 |
| SkillMimic-V2: Learning Robust and Generalizable Interaction Skills from Sparse and Noisy Demonstrations | May 4, 2025 | Data Augmentation | Code Available | 2 |
| Explainable AI in Spatial Analysis | May 1, 2025 | Bias Detection, Explainable artificial intelligence | Code Available | 2 |
| AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model | Aug 2, 2022 | Causal Language Modeling, Common Sense Reasoning | Code Available | 2 |
| Meta-Design Matters: A Self-Design Multi-Agent System | May 21, 2025 | Math, Problem Decomposition | Code Available | 2 |
| One Trajectory, One Token: Grounded Video Tokenization via Panoptic Sub-object Trajectory | May 29, 2025 | Contrastive Learning, Text Retrieval | Code Available | 2 |
| GSPMD: General and Scalable Parallelization for ML Computation Graphs | May 10, 2021 | Playing the Game of 2048 | Code Available | 2 |
| The More You See in 2D, the More You Perceive in 3D | Apr 4, 2024 | 3D Reconstruction, Image to 3D | Code Available | 2 |
| SpreadsheetLLM: Encoding Spreadsheets for Large Language Models | Jul 12, 2024 | In-Context Learning, Table Detection | Code Available | 2 |
| Multi-Grained Angle Representation for Remote Sensing Object Detection | Sep 7, 2022 | Object, object-detection | Code Available | 2 |
| What Makes a Good Diffusion Planner for Decision Making? | Mar 1, 2025 | Action Generation, Decision Making | Code Available | 2 |
| Tightly-Coupled LiDAR-IMU-Leg Odometry with Online Learned Leg Kinematics Incorporating Foot Tactile Information | Jun 11, 2025 | | Code Available | 2 |
| 4-bit Conformer with Native Quantization Aware Training for Speech Recognition | Mar 29, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| MVDream: Multi-view Diffusion for 3D Generation | Aug 31, 2023 | 3D Generation, Prompt Learning | Code Available | 2 |
| Evolving Self-Assembling Neural Networks: From Spontaneous Activity to Experience-Dependent Learning | Jun 14, 2024 | | Code Available | 2 |
| Scaling Down Text Encoders of Text-to-Image Diffusion Models | Mar 25, 2025 | GPU, Image Generation | Code Available | 2 |
| Fully Geometric Panoramic Localization | Mar 29, 2024 | Indoor Localization, Visual Localization | Code Available | 2 |
| Find Any Part in 3D | Nov 20, 2024 | 3D Part Segmentation, Diversity | Code Available | 2 |
| GaussianVTON: 3D Human Virtual Try-ON via Multi-Stage Gaussian Splatting Editing with Image Prompting | May 13, 2024 | 3D scene Editing, Virtual Try-on | Code Available | 2 |
| AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control | Apr 5, 2021 | Imitation Learning, Reinforcement Learning (RL) | Code Available | 2 |
| PaLM-E: An Embodied Multimodal Language Model | Mar 6, 2023 | Language Modeling, Language Modelling | Code Available | 2 |
| Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations | Sep 22, 2016 | GPU | Code Available | 2 |
| Reviving Cultural Heritage: A Novel Approach for Comprehensive Historical Document Restoration | Jul 7, 2025 | Optical Character Recognition (OCR) | Code Available | 2 |
| PRAM: Place Recognition Anywhere Model for Efficient Visual Localization | Apr 11, 2024 | Autonomous Driving, Landmark Recognition | Code Available | 2 |
| Learning to Predict Without Looking Ahead: World Models Without Forward Prediction | Oct 29, 2019 | Model-based Reinforcement Learning, reinforcement-learning | Code Available | 2 |
| P2Object: Single Point Supervised Object Detection and Instance Segmentation | Apr 10, 2025 | Instance Segmentation, Multiple Instance Learning | Code Available | 2 |
| The Revolution of Multimodal Large Language Models: A Survey | Feb 19, 2024 | Image Generation, Instruction Following | Code Available | 2 |
| SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views | Jun 12, 2022 | Neural Rendering, Surface Reconstruction | Code Available | 2 |
| RockTrack: A 3D Robust Multi-Camera-Ken Multi-Object Tracking Framework | Sep 18, 2024 | 3D Multi-Object Tracking, 3D Object Detection | Code Available | 2 |
| CodeSAM: Source Code Representation Learning by Infusing Self-Attention with Multi-Code-View Graphs | Nov 21, 2024 | Clone Detection, Code Search | Code Available | 2 |
| Imagine while Reasoning in Space: Multimodal Visualization-of-Thought | Jan 13, 2025 | Spatial Reasoning | Code Available | 2 |
| Vikhr: Constructing a State-of-the-art Bilingual Open-Source Instruction-Following Large Language Model for Russian | May 22, 2024 | Instruction Following, Language Modeling | Code Available | 2 |
| Uncertainty Quantification in Scientific Machine Learning: Methods, Metrics, and Comparisons | Jan 19, 2022 | BIG-bench Machine Learning, Uncertainty Quantification | Code Available | 2 |