| HairCLIPv2: Unifying Hair Editing via Proxy Feature Blending | Oct 16, 2023 | Attribute | Code Available | 2 |
| Turning a CLIP Model into a Scene Text Detector | Feb 28, 2023 | Domain Adaptation, Scene Text Detection | Code Available | 2 |
| MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation | Mar 1, 2023 | Audio-Visual Speech Recognition, Robust Speech Recognition | Code Available | 2 |
| Moving Object Segmentation in Point Cloud Data using Hidden Markov Models | Oct 24, 2024 | Semantic Segmentation | Code Available | 2 |
| ChangeViT: Unleashing Plain Vision Transformers for Change Detection | Jun 18, 2024 | Change Detection | Code Available | 2 |
| Domain Adaptive and Generalizable Network Architectures and Training Strategies for Semantic Image Segmentation | Apr 26, 2023 | Domain Adaptation, Domain Generalization | Code Available | 2 |
| TroL: Traversal of Layers for Large Language and Vision Models | Jun 18, 2024 | Visual Question Answering | Code Available | 2 |
| One-for-More: Continual Diffusion Model for Anomaly Detection | Feb 27, 2025 | Anomaly Detection, continual anomaly detection | Code Available | 2 |
| GeneOH Diffusion: Towards Generalizable Hand-Object Interaction Denoising via Denoising Diffusion | Feb 22, 2024 | Denoising | Code Available | 2 |
| EmoBench: Evaluating the Emotional Intelligence of Large Language Models | Feb 19, 2024 | Emotional Intelligence, Emotion Recognition | Code Available | 2 |
| Swin UNETR: Swin Transformers for Semantic Segmentation of Brain Tumors in MRI Images | Jan 4, 2022 | 3D Semantic Segmentation, Brain Tumor Segmentation | Code Available | 2 |
| Text2HOI: Text-guided 3D Motion Generation for Hand-Object Interaction | Mar 31, 2024 | Motion Generation, Object | Code Available | 2 |
| DexTrack: Towards Generalizable Neural Tracking Control for Dexterous Manipulation from Human References | Feb 13, 2025 | Human-Object Interaction Detection, Imitation Learning | Code Available | 2 |
| ICDAR 2021 Competition on Scientific Literature Parsing | Jun 8, 2021 | document understanding, object-detection | Code Available | 2 |
| Scale-invariant Learning by Physics Inversion | Sep 30, 2021 | BIG-bench Machine Learning, parameter estimation | Code Available | 2 |
| Probabilistic Warp Consistency for Weakly-Supervised Semantic Correspondences | Mar 8, 2022 | Triplet, Weakly-supervised Learning | Code Available | 2 |
| Differentiable Voxelization and Mesh Morphing | Jul 15, 2024 | GPU | Code Available | 2 |
| MacroHFT: Memory Augmented Context-aware Reinforcement Learning On High Frequency Trading | Jun 20, 2024 | Algorithmic Trading, Decision Making | Code Available | 2 |
| Visual Text Processing: A Comprehensive Review and Unified Evaluation | Apr 30, 2025 | Image Manipulation, Image Reconstruction | Code Available | 2 |
| QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehension | Mar 11, 2025 | AutoML, Decoder | Code Available | 2 |
| Transformers learn in-context by gradient descent | Dec 15, 2022 | In-Context Learning, Meta-Learning | Code Available | 2 |
| Meent: Differentiable Electromagnetic Simulator for Machine Learning | Jun 11, 2024 | | Code Available | 2 |
| Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding | Nov 14, 2023 | Image-based Generative Performance Benchmarking, Language Modeling | Code Available | 2 |
| Video Super-Resolution Transformer with Masked Inter&Intra-Frame Attention | Jan 12, 2024 | Super-Resolution, Video Super-Resolution | Code Available | 2 |
| AdaptCLIP: Adapting CLIP for Universal Visual Anomaly Detection | May 15, 2025 | Anomaly Detection | Code Available | 2 |
| 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks | Apr 18, 2019 | 3D Semantic Segmentation, 4D Spatio Temporal Semantic Segmentation | Code Available | 2 |
| Plug and Play Language Models: A Simple Approach to Controlled Text Generation | Dec 4, 2019 | Attribute, Language Modelling | Code Available | 2 |
| Third Time's the Charm? Image and Video Editing with StyleGAN3 | Jan 31, 2022 | Disentanglement, Image Generation | Code Available | 2 |
| PPI++: Efficient Prediction-Powered Inference | Nov 2, 2023 | Prediction | Code Available | 2 |
| Adapting Segment Anything Model for Change Detection in HR Remote Sensing Images | Sep 4, 2023 | Change Detection, Interactive Segmentation | Code Available | 2 |
| Black-Box Tuning for Language-Model-as-a-Service | Jan 10, 2022 | In-Context Learning, Language Modeling | Code Available | 2 |
| Managing FAIR Knowledge Graphs as Polyglot Data End Points: A Benchmark based on the rdf2pg Framework and Plant Biology Data | May 23, 2025 | Knowledge Graphs, Management | Code Available | 2 |
| 3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination | Jun 7, 2024 | Hallucination | Code Available | 2 |
| Spatio-Temporal Self-Supervised Learning for Traffic Flow Prediction | Dec 7, 2022 | Attribute, Prediction | Code Available | 2 |
| Contrastive Decoding: Open-ended Text Generation as Optimization | Oct 27, 2022 | Language Modeling, Language Modelling | Code Available | 2 |
| DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting | Nov 19, 2022 | Decoder, Scene Text Detection | Code Available | 2 |
| MER 2023: Multi-label Learning, Modality Robustness, and Semi-Supervised Learning | Apr 18, 2023 | Emotion Recognition, Multi-Label Learning | Code Available | 2 |
| DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation | Jun 24, 2024 | Benchmarking, Image Generation | Code Available | 2 |
| FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation | Oct 5, 2023 | Hallucination, World Knowledge | Code Available | 2 |
| EpiLearn: A Python Library for Machine Learning in Epidemic Modeling | Jun 10, 2024 | | Code Available | 2 |
| 3D Point Cloud Compression with Recurrent Neural Network and Image Compression Methods | Feb 18, 2024 | Data Compression, Image Compression | Code Available | 2 |
| It Takes Two to Tango: Directly Optimizing for Constrained Synthesizability in Generative Molecular Design | Oct 15, 2024 | Drug Discovery, reinforcement-learning | Code Available | 2 |
| FlagVNE: A Flexible and Generalizable Reinforcement Learning Framework for Network Resource Allocation | Apr 19, 2024 | Decoder, Network Embedding | Code Available | 2 |
| NIKI: Neural Inverse Kinematics with Invertible Neural Networks for 3D Human Pose and Shape Estimation | May 15, 2023 | 3D human pose and shape estimation, 3D Human Pose Estimation | Code Available | 2 |
| Large-scale and Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding | Jan 24, 2025 | Anatomy, Contrastive Learning | Code Available | 2 |
| FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models | Jun 24, 2024 | Video Generation | Code Available | 2 |
| DV-3DLane: End-to-end Multi-modal 3D Lane Detection with Dual-view Representation | Jun 23, 2024 | 3D Lane Detection, Autonomous Driving | Code Available | 2 |
| MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning | Jun 25, 2024 | Object, Object Recognition | Code Available | 2 |
| Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models | Jun 25, 2024 | Diversity, Math | Code Available | 2 |
| Scattered Mixture-of-Experts Implementation | Mar 13, 2024 | Mixture-of-Experts | Code Available | 2 |