| Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning | Aug 15, 2024 | Segmentation, Video Segmentation | Code Available | 2 | 5 |
| Adversarial attacks and defenses in explainable artificial intelligence: A survey | Jun 6, 2023 | Decision Making, Explainable artificial intelligence | Code Available | 2 | 5 |
| XMem++: Production-level Video Segmentation From Few Annotated Frames | Jul 29, 2023 | Segmentation, Semantic Segmentation | Code Available | 2 | 5 |
| DeFoG: Discrete Flow Matching for Graph Generation | Oct 5, 2024 | Denoising, Graph Generation | Code Available | 2 | 5 |
| Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models | Jul 7, 2024 | class-incremental learning, Class Incremental Learning | Code Available | 2 | 5 |
| WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks | Apr 22, 2025 | Benchmarking | Code Available | 2 | 5 |
| FreeReg: Image-to-Point Cloud Registration Leveraging Pretrained Diffusion Models and Monocular Depth Estimators | Oct 5, 2023 | Image to Point Cloud Registration, Metric Learning | Code Available | 2 | 5 |
| Spikformer: When Spiking Neural Network Meets Transformer | Sep 29, 2022 | image-classification, Image Classification | Code Available | 2 | 5 |
| Robust and Reliable Early-Stage Website Fingerprinting Attacks via Spatial-Temporal Distribution Analysis | Jul 1, 2024 | Contrastive Learning, Data Augmentation | Code Available | 2 | 5 |
| Vista3D: Unravel the 3D Darkside of a Single Image | Sep 18, 2024 | 3D Generation, Diversity | Code Available | 2 | 5 |
| FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting | Jan 30, 2022 | Time Series, Time Series Analysis | Code Available | 2 | 5 |
| Beyond Local Sharpness: Communication-Efficient Global Sharpness-aware Minimization for Federated Learning | Dec 4, 2024 | Federated Learning | Code Available | 2 | 5 |
| Going Denser with Open-Vocabulary Part Segmentation | May 18, 2023 | Object, object-detection | Code Available | 2 | 5 |
| Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels | Mar 18, 2025 | GPU, Language Modeling | Code Available | 2 | 5 |
| Proactive Model Adaptation Against Concept Drift for Online Time Series Forecasting | Dec 11, 2024 | Time Series, Time Series Forecasting | Code Available | 2 | 5 |
| Deep Height Decoupling for Precise Vision-based 3D Occupancy Prediction | Sep 12, 2024 | 3D geometry | Code Available | 2 | 5 |
| GLACE: Global Local Accelerated Coordinate Encoding | Jun 6, 2024 | Camera Pose Estimation, Pose Estimation | Code Available | 2 | 5 |
| ECG-Image-Database: A Dataset of ECG Images with Real-World Imaging and Scanning Artifacts; A Foundation for Computerized ECG Image Digitization and Analysis | Sep 25, 2024 | ECG Digitization, Time Series | Code Available | 2 | 5 |
| Model Editing as a Robust and Denoised variant of DPO: A Case Study on Toxicity | May 22, 2024 | Language Modelling, Model Editing | Code Available | 2 | 5 |
| Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for Large Language Models | Jun 5, 2024 | Diversity, Language Modeling | Code Available | 2 | 5 |
| Graph Neural Networks for Multimodal Single-Cell Data Integration | Mar 3, 2022 | Data Integration, Graph Neural Network | Code Available | 2 | 5 |
| Lina-Speech: Gated Linear Attention is a Fast and Parameter-Efficient Learner for text-to-speech synthesis | Oct 30, 2024 | Speech Synthesis, text-to-speech | Code Available | 2 | 5 |
| Z*: Zero-shot Style Transfer via Attention Reweighting | Jan 1, 2024 | Denoising, Image Denoising | Code Available | 2 | 5 |
| MIS-FM: 3D Medical Image Segmentation using Foundation Models Pretrained on a Large-Scale Unannotated Dataset | Jun 29, 2023 | Image Segmentation, Medical Image Segmentation | Code Available | 2 | 5 |
| Generative Flows on Synthetic Pathway for Drug Design | Oct 6, 2024 | Drug Design, Drug Discovery | Code Available | 2 | 5 |
| Visual Prompt Multi-Modal Tracking | Mar 20, 2023 | Object Tracking, Prompt Learning | Code Available | 2 | 5 |
| RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval | Sep 16, 2024 | CPU, GPU | Code Available | 2 | 5 |
| Tree of Attacks: Jailbreaking Black-Box LLMs Automatically | Dec 4, 2023 | Navigate | Code Available | 2 | 5 |
| Attention Mechanisms in Computer Vision: A Survey | Nov 15, 2021 | image-classification, Image Classification | Code Available | 2 | 5 |
| SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis | Jul 4, 2023 | Image Generation | Code Available | 2 | 5 |
| DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models | Jul 5, 2023 | Object | Code Available | 2 | 5 |
| Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System | Mar 11, 2024 | GPU, Language Modeling | Code Available | 2 | 5 |
| Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models | Apr 16, 2024 | image-classification, Image Classification | Code Available | 2 | 5 |
| PersonaRAG: Enhancing Retrieval-Augmented Generation Systems with User-Centric Agents | Jul 12, 2024 | Information Retrieval, Question Answering | Code Available | 2 | 5 |
| Alleviating Textual Reliance in Medical Language-guided Segmentation via Prototype-driven Semantic Approximation | Jul 15, 2025 | Image Segmentation, Segmentation | Code Available | 2 | 5 |
| Multi-Representation Adaptation Network for Cross-domain Image Classification | Jan 4, 2022 | Classification, Domain Adaptation | Code Available | 2 | 5 |
| Anomaly Detection via Reverse Distillation from One-Class Embedding | Jan 26, 2022 | Anomaly Classification | Code Available | 2 | 5 |
| Hybrid-Segmentor: A Hybrid Approach to Automated Fine-Grained Crack Segmentation in Civil Infrastructure | Sep 4, 2024 | Crack Segmentation, Decoder | Code Available | 2 | 5 |
| LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning | Jun 12, 2024 | text-to-speech, Text to Speech | Code Available | 2 | 5 |
| Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives | Mar 15, 2024 | Motion Synthesis | Code Available | 2 | 5 |
| Generalizable Human Gaussians for Sparse View Synthesis | Jul 17, 2024 | NeRF, Neural Rendering | Code Available | 2 | 5 |
| Protein structure generation via folding diffusion | Sep 30, 2022 | Denoising, Protein Structure Prediction | Code Available | 2 | 5 |
| Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network | Aug 10, 2024 | geo-localization, Image Retrieval | Code Available | 2 | 5 |
| Learning without Exact Guidance: Updating Large-scale High-resolution Land Cover Maps from Low-resolution Historical Labels | Mar 5, 2024 | Pseudo Label, Semantic Segmentation | Code Available | 2 | 5 |
| ARoFace: Alignment Robustness to Improve Low-Quality Face Recognition | Jul 20, 2024 | Data Augmentation, Face Alignment | Code Available | 2 | 5 |
| HumanSD: A Native Skeleton-Guided Diffusion Model for Human Image Generation | Apr 9, 2023 | Denoising, Image Generation | Code Available | 2 | 5 |
| Liquid Structural State-Space Models | Sep 26, 2022 | Heart rate estimation, Long-range modeling | Code Available | 2 | 5 |
| STAR: Skeleton-aware Text-based 4D Avatar Generation with In-Network Motion Retargeting | Jun 7, 2024 | motion retargeting | Code Available | 2 | 5 |
| SpeechCraft: A Fine-grained Expressive Speech Dataset with Natural Language Description | Aug 24, 2024 | Descriptive, Speech Synthesis | Code Available | 2 | 5 |
| One Transformer Can Understand Both 2D & 3D Molecular Data | Oct 4, 2022 | Graph Regression, molecular representation | Code Available | 2 | 5 |