| A Unified Model for Multi-class Anomaly Detection | Jun 8, 2022 | Anomaly Detection, Anomaly Localization | Code Available | 2 | 5 |
| Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptation | May 26, 2024 | feature selection, Mixture-of-Experts | Code Available | 2 | 5 |
| Spurious Forgetting in Continual Learning of Language Models | Jan 23, 2025 | Continual Learning | Code Available | 2 | 5 |
| NeRF On-the-go: Exploiting Uncertainty for Distractor-free NeRFs in the Wild | May 29, 2024 | NeRF | Code Available | 2 | 5 |
| Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform | Oct 28, 2022 | CPU, Knowledge Distillation | Code Available | 2 | 5 |
| Joint Spatio-Temporal Modeling for the Semantic Change Detection in Remote Sensing Images | Dec 10, 2022 | Change Detection | Code Available | 2 | 5 |
| Thought2Text: Text Generation from EEG Signal using Large Language Models (LLMs) | Oct 10, 2024 | EEG, Text Generation | Code Available | 2 | 5 |
| Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning | Feb 27, 2023 | Dense Video Captioning, Language Modeling | Code Available | 2 | 5 |
| DiffIR: Efficient Diffusion Model for Image Restoration | Mar 16, 2023 | Denoising, Image Generation | Code Available | 2 | 5 |
| ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models | Oct 16, 2023 | General Reinforcement Learning, GPU | Code Available | 2 | 5 |
| Grounding Language Models to Images for Multimodal Inputs and Outputs | Jan 31, 2023 | Image Retrieval, In-Context Learning | Code Available | 2 | 5 |
| Agent models: Internalizing Chain-of-Action Generation into Reasoning models | Mar 9, 2025 | Action Generation, Reinforcement Learning (RL) | Code Available | 2 | 5 |
| Diffusion Models in Vision: A Survey | Sep 10, 2022 | Articles, Denoising | Code Available | 2 | 5 |
| Exploring Visual Prompts for Adapting Large-Scale Models | Mar 31, 2022 | Visual Prompting | Code Available | 2 | 5 |
| Power Bundle Adjustment for Large-Scale 3D Reconstruction | Apr 27, 2022 | 3D Reconstruction, Distributed Optimization | Code Available | 2 | 5 |
| TrafficVLM: A Controllable Visual Language Model for Traffic Video Captioning | Apr 14, 2024 | Dense Video Captioning, Descriptive | Code Available | 2 | 5 |
| Large Language Models Illuminate a Progressive Pathway to Artificial Healthcare Assistant: A Review | Nov 3, 2023 | Diagnostic | Code Available | 2 | 5 |
| Analyzing Infrastructure LiDAR Placement with Realistic LiDAR Simulation Library | Nov 29, 2022 | | Code Available | 2 | 5 |
| Multi-Fidelity Active Learning with GFlowNets | Jun 20, 2023 | Active Learning, Bayesian Optimization | Code Available | 2 | 5 |
| Full Parameter Fine-tuning for Large Language Models with Limited Resources | Jun 16, 2023 | GPU, parameter-efficient fine-tuning | Code Available | 2 | 5 |
| Making a MIRACL: Multilingual Information Retrieval Across a Continuum of Languages | Oct 18, 2022 | Information Retrieval, Retrieval | Code Available | 2 | 5 |
| Nes2Net: A Lightweight Nested Architecture for Foundation Model Driven Speech Anti-spoofing | Apr 8, 2025 | DeepFake Detection, Dimensionality Reduction | Code Available | 2 | 5 |
| VRP-SAM: SAM with Visual Reference Prompt | Feb 27, 2024 | Meta-Learning, Segmentation | Code Available | 2 | 5 |
| BoW3D: Bag of Words for Real-Time Loop Closing in 3D LiDAR SLAM | Aug 15, 2022 | 4k, Simultaneous Localization and Mapping | Code Available | 2 | 5 |
| MeMemo: On-device Retrieval Augmentation for Private and Personalized Text Generation | Jul 2, 2024 | Hallucination, RAG | Code Available | 2 | 5 |
| How far are today's time-series models from real-world weather forecasting applications? | Jun 20, 2024 | Benchmarking, Time Series | Code Available | 2 | 5 |
| Point Cloud Mamba: Point Cloud Learning via State Space Model | Mar 1, 2024 | Mamba, State Space Models | Code Available | 2 | 5 |
| Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme | Apr 3, 2025 | Reinforcement Learning (RL), Visual Reasoning | Code Available | 2 | 5 |
| Scaling Laws for Robust Comparison of Open Foundation Language-Vision Models and Datasets | Jun 5, 2025 | | Code Available | 2 | 5 |
| HiVG: Hierarchical Multimodal Fine-grained Modulation for Visual Grounding | Apr 20, 2024 | cross-modal alignment, Visual Grounding | Code Available | 2 | 5 |
| D-CIPHER: Dynamic Collaborative Intelligent Multi-Agent System with Planner and Heterogeneous Executors for Offensive Security | Feb 15, 2025 | Task Planning | Code Available | 2 | 5 |
| Learning Diffusion Priors from Observations by Expectation Maximization | May 22, 2024 | | Code Available | 2 | 5 |
| Emotionally Enhanced Talking Face Generation | Mar 21, 2023 | Face Generation, Talking Face Generation | Code Available | 2 | 5 |
| Vision Transformer with Quadrangle Attention | Mar 27, 2023 | object-detection, Object Detection | Code Available | 2 | 5 |
| DM-NeRF: 3D Scene Geometry Decomposition and Manipulation from 2D Images | Aug 15, 2022 | NeRF, Object | Code Available | 2 | 5 |
| FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation | May 23, 2023 | Form, Language Modelling | Code Available | 2 | 5 |
| AbdomenAtlas-8K: Annotating 8,000 CT Volumes for Multi-Organ Segmentation in Three Weeks | May 16, 2023 | 8k, Active Learning | Code Available | 2 | 5 |
| SEAGULL: No-reference Image Quality Assessment for Regions of Interest via Vision-Language Instruction Tuning | Nov 15, 2024 | Image Quality Assessment, Language Modeling | Code Available | 2 | 5 |
| Towards Metrical Reconstruction of Human Faces | Apr 13, 2022 | 2k, 3D Face Reconstruction | Code Available | 2 | 5 |
| Enhancing Reasoning Capabilities of LLMs via Principled Synthetic Logic Corpus | Nov 19, 2024 | Formal Logic, Logical Reasoning | Code Available | 2 | 5 |
| TorchAudio: Building Blocks for Audio and Speech Processing | Oct 28, 2021 | BIG-bench Machine Learning, GPU | Code Available | 2 | 5 |
| Deep Learning Accelerated Quantum Transport Simulations in Nanoelectronics: From Break Junctions to Field-Effect Transistors | Nov 13, 2024 | Computational Efficiency | Code Available | 2 | 5 |
| Enhancing Remote Sensing Vision-Language Models for Zero-Shot Scene Classification | Sep 1, 2024 | Scene Classification, Transductive Zero-Shot Classification | Code Available | 2 | 5 |
| Software package for simulations using the coarse-grained CALVADOS model | Apr 14, 2025 | | Code Available | 2 | 5 |
| Multi-CPR: A Multi Domain Chinese Dataset for Passage Retrieval | Mar 7, 2022 | Information Retrieval, Passage Retrieval | Code Available | 2 | 5 |
| FourierGNN: Rethinking Multivariate Time Series Forecasting from a Pure Graph Perspective | Nov 10, 2023 | Graph Neural Network, Multivariate Time Series Forecasting | Code Available | 2 | 5 |
| Interactive4D: Interactive 4D LiDAR Segmentation | Oct 10, 2024 | Interactive Segmentation, Segmentation | Code Available | 2 | 5 |
| Prototypical Networks for Few-shot Learning | Mar 15, 2017 | Category-Agnostic Pose Estimation, Few-Shot Image Classification | Code Available | 2 | 5 |
| BEVStereo: Enhancing Depth Estimation in Multi-view 3D Object Detection with Dynamic Temporal Stereo | Sep 21, 2022 | 3D Object Detection, Depth Estimation | Code Available | 2 | 5 |
| Language is All a Graph Needs | Aug 14, 2023 | All, Graph Learning | Code Available | 2 | 5 |