| Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models | Jul 17, 2024 | Benchmarking, Red Teaming | Code Available | 2 | 5 |
| Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 | Mar 31, 2025 | Logical Reasoning, Multiple-choice | Code Available | 2 | 5 |
| Scattertext: a Browser-Based Tool for Visualizing how Corpora Differ | Jul 1, 2017 | | Code Available | 2 | 5 |
| ScaleKD: Strong Vision Transformers Could Be Excellent Teachers | Nov 11, 2024 | image-classification, Image Classification | Code Available | 2 | 5 |
| CheXpert Plus: Augmenting a Large Chest X-ray Dataset with Text Radiology Reports, Patient Demographics and Additional Image Formats | May 29, 2024 | De-identification, Fairness | Code Available | 2 | 5 |
| NeRF-RPN: A general framework for object detection in NeRFs | Nov 21, 2022 | NeRF, object-detection | Code Available | 2 | 5 |
| HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models | Oct 23, 2023 | Diagnostic, Hallucination | Code Available | 2 | 5 |
| Automatic Differentiation-based Full Waveform Inversion with Flexible Workflows | Nov 30, 2024 | Dynamic Time Warping | Code Available | 2 | 5 |
| AirMorph: Topology-Preserving Deep Learning for Pulmonary Airway Analysis | Dec 15, 2024 | Anatomy, Deep Learning | Code Available | 2 | 5 |
| Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey | Feb 14, 2024 | Survey | Code Available | 2 | 5 |
| SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding | May 22, 2025 | Motion Estimation, Question Answering | Code Available | 2 | 5 |
| An Empirical Study of Qwen3 Quantization | May 4, 2025 | Natural Language Understanding, Quantization | Code Available | 2 | 5 |
| One-shot Entropy Minimization | May 26, 2025 | reinforcement-learning, Reinforcement Learning | Code Available | 2 | 5 |
| KTPFormer: Kinematics and Trajectory Prior Knowledge-Enhanced Transformer for 3D Human Pose Estimation | Mar 31, 2024 | 3D Human Pose Estimation, Monocular 3D Human Pose Estimation | Code Available | 2 | 5 |
| BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks | May 26, 2023 | Image Captioning, Medical Visual Question Answering | Code Available | 2 | 5 |
| Safe Delta: Consistently Preserving Safety when Fine-Tuning LLMs on Diverse Datasets | May 17, 2025 | | Code Available | 2 | 5 |
| VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling | Jun 6, 2024 | Diversity, Music Generation | Code Available | 2 | 5 |
| ColorizeDiffusion v2: Enhancing Reference-based Sketch Colorization Through Separating Utilities | Apr 9, 2025 | Colorization, Sketch Colorization | Code Available | 2 | 5 |
| ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL | May 30, 2025 | Image Generation, Language Modeling | Code Available | 2 | 5 |
| nnWNet: Rethinking the Use of Transformers in Biomedical Image Segmentation and Calling for a Unified Evaluation Benchmark | Jan 1, 2025 | Benchmarking, Image Segmentation | Code Available | 2 | 5 |
| X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distillation | Mar 8, 2025 | GPU, Image Generation | Code Available | 2 | 5 |
| A Transformer-Based Siamese Network for Change Detection | Jan 4, 2022 | Change Detection, Decoder | Code Available | 2 | 5 |
| Focal Modulation Networks | Mar 22, 2022 | image-classification, Image Classification | Code Available | 2 | 5 |
| An Embodied Generalist Agent in 3D World | Nov 18, 2023 | 3D dense captioning, 3D Question Answering (3D-QA) | Code Available | 2 | 5 |
| Kick Back & Relax++: Scaling Beyond Ground-Truth Depth with SlowTV & CribsTV | Mar 3, 2024 | Depth Estimation, Monocular Depth Estimation | Code Available | 2 | 5 |