| Transformer tricks: Removing weights for skipless transformers | Apr 18, 2024 | | Code Available | 2 |
| Hyper-3DG: Text-to-3D Gaussian Generation via Hypergraph | Mar 14, 2024 | 3D Generation, 3DGS | Code Available | 2 |
| Listen, Think, and Understand | May 18, 2023 | Language Modelling, Large Language Model | Code Available | 2 |
| Pre-training Differentially Private Models with Limited Public Data | Feb 28, 2024 | TAG | Code Available | 2 |
| Minutes to Seconds: Speeded-up DDPM-based Image Inpainting with Coarse-to-Fine Sampling | Jul 8, 2024 | Denoising, Image Inpainting | Code Available | 2 |
| BARS: Towards Open Benchmarking for Recommender Systems | May 19, 2022 | Benchmarking, Click-Through Rate Prediction | Code Available | 2 |
| RAP: Retrieval-Augmented Personalization for Multimodal Large Language Models | Oct 17, 2024 | Image Captioning, Question Answering | Code Available | 2 |
| Optimal Invariant Bases for Atomistic Machine Learning | Mar 30, 2025 | | Code Available | 2 |
| Correlation-aware Coarse-to-fine MLPs for Deformable Medical Image Registration | May 31, 2024 | Deformable Medical Image Registration, Image Registration | Code Available | 2 |
| UNETR++: Delving into Efficient and Accurate 3D Medical Image Segmentation | Dec 8, 2022 | Image Segmentation, Medical Image Segmentation | Code Available | 2 |
| StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding | Nov 6, 2024 | Image Comprehension, Streaming video understanding | Code Available | 2 |
| Hamba: Single-view 3D Hand Reconstruction with Graph-guided Bi-Scanning Mamba | Jul 12, 2024 | 3D Hand Pose Estimation, Mamba | Code Available | 2 |
| Light and Optimal Schrödinger Bridge Matching | Feb 5, 2024 | | Code Available | 2 |
| Fuzz4All: Universal Fuzzing with Large Language Models | Aug 9, 2023 | | Code Available | 2 |
| VadCLIP: Adapting Vision-Language Models for Weakly Supervised Video Anomaly Detection | Aug 22, 2023 | Anomaly Detection, Binary Classification | Code Available | 2 |
| ProLLM: Protein Chain-of-Thoughts Enhanced LLM for Protein-Protein Interaction Prediction | Mar 30, 2024 | | Code Available | 2 |
| How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States | Jun 9, 2024 | Safety Alignment | Code Available | 2 |
| Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design | Oct 17, 2024 | Protein Design, Reinforcement Learning (RL) | Code Available | 2 |
| Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels | Dec 28, 2023 | Aesthetics Quality Assessment, Image Quality Assessment | Code Available | 2 |
| TESTAM: A Time-Enhanced Spatio-Temporal Attention Model with Mixture of Experts | Mar 5, 2024 | Graph Attention, Graph Embedding | Code Available | 2 |
| The Chosen One: Consistent Characters in Text-to-Image Diffusion Models | Nov 16, 2023 | Consistent Character Generation, Image Generation | Code Available | 2 |
| SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding | Aug 21, 2023 | Entity Typing, Event Extraction | Code Available | 2 |
| MovieChat: From Dense Token to Sparse Memory for Long Video Understanding | Jul 31, 2023 | Multiple-choice, Question Answering | Code Available | 2 |
| Retrieval-Augmented Dynamic Prompt Tuning for Incomplete Multimodal Learning | Jan 2, 2025 | Imputation, Retrieval | Code Available | 2 |
| AlignBench: Benchmarking Chinese Alignment of Large Language Models | Nov 30, 2023 | Benchmarking | Code Available | 2 |
| SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing | Dec 20, 2023 | Attribute, Cross-Modal Retrieval | Code Available | 2 |
| EMBER2024 -- A Benchmark Dataset for Holistic Evaluation of Malware Classifiers | Jun 5, 2025 | Malware Analysis, Malware Classification | Code Available | 2 |
| Why are Visually-Grounded Language Models Bad at Image Classification? | May 28, 2024 | Classification, image-classification | Code Available | 2 |
| ICASSP 2023 Acoustic Echo Cancellation Challenge | Sep 22, 2023 | Acoustic echo cancellation, Speech Enhancement | Code Available | 2 |
| Full-Duplex-Bench: A Benchmark to Evaluate Full-duplex Spoken Dialogue Models on Turn-taking Capabilities | Mar 6, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| HESSO: Towards Automatic Efficient and User Friendly Any Neural Network Training and Pruning | Sep 11, 2024 | Large Language Model | Code Available | 2 |
| ZERO-IG: Zero-Shot Illumination-Guided Joint Denoising and Adaptive Enhancement for Low-Light Images | Jan 1, 2024 | Denoising | Code Available | 2 |
| Memory, Benchmark & Robots: A Benchmark for Solving Complex Tasks with Reinforcement Learning | Feb 14, 2025 | Reinforcement Learning (RL), Skills Assessment | Code Available | 2 |
| Context-Guided Spatial Feature Reconstruction for Efficient Semantic Segmentation | May 10, 2024 | Semantic Segmentation | Code Available | 2 |
| Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions | Sep 13, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| Bridging Past and Future: End-to-End Autonomous Driving with Historical Prediction and Planning | Mar 18, 2025 | Autonomous Driving, Motion Planning | Code Available | 2 |
| ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-agent Zero-shot Coordination | Oct 8, 2023 | Diversity, Multi-agent Reinforcement Learning | Code Available | 2 |
| BrainMorph: A Foundational Keypoint Model for Robust and Flexible Brain MRI Registration | May 22, 2024 | | Code Available | 2 |
| Multi-Frame, Lightweight & Efficient Vision-Language Models for Question Answering in Autonomous Driving | Mar 28, 2024 | Autonomous Driving, Language Modeling | Code Available | 2 |
| Pose for Everything: Towards Category-Agnostic Pose Estimation | Jul 21, 2022 | 2D Pose Estimation, Category-Agnostic Pose Estimation | Code Available | 2 |
| Neural Optimal Transport | Jan 28, 2022 | Image-to-Image Translation, Translation | Code Available | 2 |
| Fast Context-Based Low-Light Image Enhancement via Neural Implicit Representations | Jul 17, 2024 | Image Enhancement, Low-Light Image Enhancement | Code Available | 2 |
| Singer Identity Representation Learning using Self-Supervised Techniques | Jan 10, 2024 | Domain Generalization, Representation Learning | Code Available | 2 |
| What does a platypus look like? Generating customized prompts for zero-shot image classification | Sep 7, 2022 | Descriptive, image-classification | Code Available | 2 |
| Towards Zero-shot Point Cloud Anomaly Detection: A Multi-View Projection Framework | Sep 20, 2024 | Anomaly Detection, Specificity | Code Available | 2 |
| Skeleton-free Pose Transfer for Stylized 3D Characters | Jul 28, 2022 | Pose Transfer | Code Available | 2 |
| Ambiguous Medical Image Segmentation using Diffusion Models | Apr 10, 2023 | Diagnostic, Diversity | Code Available | 2 |
| VQA^2: Visual Question Answering for Video Quality Assessment | Nov 6, 2024 | Question Answering, Video Quality Assessment | Code Available | 2 |
| PosterLlama: Bridging Design Ability of Langauge Model to Contents-Aware Layout Generation | Apr 1, 2024 | Layout Design, Layout Generation | Code Available | 2 |
| Class-Incremental Learning: A Survey | Feb 7, 2023 | class-incremental learning, Class Incremental Learning | Code Available | 2 |