| BiFormer: Vision Transformer with Bi-Level Routing Attention | Mar 15, 2023 | Computational Efficiency, GPU | Code Available | 2 |
| MakeItTalk: Speaker-Aware Talking-Head Animation | Apr 27, 2020 | Talking Face Generation, Talking Head Generation | Code Available | 2 |
| Lite DETR : An Interleaved Multi-Scale Encoder for Efficient DETR | Mar 13, 2023 | object-detection, Object Detection | Code Available | 2 |
| Learning Optical Flow and Scene Flow with Bidirectional Camera-LiDAR Fusion | Mar 21, 2023 | Optical Flow Estimation, Scene Flow Estimation | Code Available | 2 |
| CtrlA: Adaptive Retrieval-Augmented Generation via Inherent Control | May 29, 2024 | RAG, Response Generation | Code Available | 2 |
| Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges | Apr 24, 2024 | Drug Design, Inductive Bias | Code Available | 2 |
| Scalable Zero-shot Entity Linking with Dense Entity Retrieval | Nov 10, 2019 | Entity Embeddings, Entity Linking | Code Available | 2 |
| MonoDETR: Depth-guided Transformer for Monocular 3D Object Detection | Mar 24, 2022 | 3D Object Detection, 3D Object Detection From Monocular Images | Code Available | 2 |
| RS-Agent: Automating Remote Sensing Tasks through Intelligent Agent | Jun 11, 2024 | AI Agent, Descriptive | Code Available | 2 |
| RhythmFormer: Extracting Patterned rPPG Signals based on Periodic Sparse Attention | Feb 20, 2024 | | Code Available | 2 |
| Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future | Sep 27, 2023 | Navigate | Code Available | 2 |
| Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching | Jun 3, 2024 | Denoising | Code Available | 2 |
| MiVOLO: Multi-input Transformer for Age and Gender Estimation | Jul 10, 2023 | Age And Gender Classification, Age and Gender Estimation | Code Available | 2 |
| Graph Neural Networks in Supply Chain Analytics and Optimization: Concepts, Perspectives, Dataset and Benchmarks | Nov 13, 2024 | Anomaly Detection, Demand Forecasting | Code Available | 2 |
| Exact: Exploring Space-Time Perceptive Clues for Weakly Supervised Satellite Image Time Series Semantic Segmentation | Dec 5, 2024 | Semantic Segmentation, Time Series | Code Available | 2 |
| RetinaFace: Single-Shot Multi-Level Face Localisation in the Wild | Jun 1, 2020 | 3D Face Reconstruction, Face Alignment | Code Available | 2 |
| RainMamba: Enhanced Locality Learning with State Space Models for Video Deraining | Jul 31, 2024 | Optical Flow Estimation, Rain Removal | Code Available | 2 |
| Mergenetic: a Simple Evolutionary Model Merging Library | May 16, 2025 | Evolutionary Algorithms, model | Code Available | 2 |
| Augmented Object Intelligence with XR-Objects | Apr 20, 2024 | Object, Semantic Segmentation | Code Available | 2 |
| TART: A plug-and-play Transformer module for task-agnostic reasoning | Sep 21, 2023 | | Code Available | 2 |
| Hulk: A Universal Knowledge Translator for Human-Centric Tasks | Dec 4, 2023 | 3D Human Pose Estimation, Action Recognition | Code Available | 2 |
| Face Swap via Diffusion Model | Mar 2, 2024 | Face Alignment, Face Detection | Code Available | 2 |
| Segment anything model 2: an application to 2D and 3D medical images | Aug 1, 2024 | Computed Tomography (CT), Segmentation | Code Available | 2 |
| Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following | Sep 1, 2023 | 3D Generation, 3D Question Answering (3D-QA) | Code Available | 2 |
| A Comparative Study on Reasoning Patterns of OpenAI's o1 Model | Oct 17, 2024 | Math | Code Available | 2 |
| RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars | May 22, 2023 | 2k, Image Matting | Code Available | 2 |
| Query-Dependent Video Representation for Moment Retrieval and Highlight Detection | Mar 24, 2023 | Highlight Detection, Moment Retrieval | Code Available | 2 |
| SMILEtrack: SiMIlarity LEarning for Occlusion-Aware Multiple Object Tracking | Nov 16, 2022 | Multi-Object Tracking, Multiple Object Tracking | Code Available | 2 |
| SceneRF: Self-Supervised Monocular 3D Scene Reconstruction with Radiance Fields | Dec 5, 2022 | 3D Reconstruction, 3D Scene Reconstruction | Code Available | 2 |
| Augraphy: A Data Augmentation Library for Document Images | Aug 30, 2022 | Data Augmentation, Denoising | Code Available | 2 |
| EE-Tuning: An Economical yet Scalable Solution for Tuning Early-Exit Large Language Models | Feb 1, 2024 | | Code Available | 2 |
| Neural Implicit Representation for Building Digital Twins of Unknown Articulated Objects | Apr 1, 2024 | Articulated Object modelling | Code Available | 2 |
| Drag Your Noise: Interactive Point-based Editing via Diffusion Semantic Propagation | Apr 1, 2024 | Denoising | Code Available | 2 |
| MotionChain: Conversational Motion Controllers via Multimodal Prompts | Apr 2, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Behavior Trees Enable Structured Programming of Language Model Agents | Apr 11, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Counterfactual Learning on Graphs: A Survey | Apr 3, 2023 | counterfactual, Fairness | Code Available | 2 |
| HILCodec: High-Fidelity and Lightweight Neural Audio Codec | May 8, 2024 | | Code Available | 2 |
| Full Page Handwriting Recognition via Image to Sequence Extraction | Mar 11, 2021 | Handwriting Recognition, Handwritten Text Recognition | Code Available | 2 |
| F-LMM: Grounding Frozen Large Multimodal Models | Jun 9, 2024 | General Knowledge, Instruction Following | Code Available | 2 |
| DreamDiffusion: Generating High-Quality Images from Brain EEG Signals | Jun 29, 2023 | EEG, Electroencephalogram (EEG) | Code Available | 2 |
| Improved Multi-Task Brain Tumour Segmentation with Synthetic Data Augmentation | Nov 7, 2024 | Data Augmentation, Synthetic Data Generation | Code Available | 2 |
| moolib: A Platform for Distributed RL | Jan 26, 2022 | reinforcement-learning, Reinforcement Learning | Code Available | 2 |
| COCO-O: A Benchmark for Object Detectors under Natural Distribution Shifts | Jul 24, 2023 | Autonomous Driving, Object | Code Available | 2 |
| Think-on-Graph 2.0: Deep and Faithful Large Language Model Reasoning with Knowledge-guided Retrieval Augmented Generation | Jul 15, 2024 | Information Retrieval, Knowledge Graphs | Code Available | 2 |
| DiffLoc: Diffusion Model for Outdoor LiDAR Localization | Jan 1, 2024 | Denoising, model | Code Available | 2 |
| PropertyGPT: LLM-driven Formal Verification of Smart Contracts through Retrieval-Augmented Property Generation | May 4, 2024 | In-Context Learning, Retrieval | Code Available | 2 |
| SSL: A Self-similarity Loss for Improving Generative Image Super-resolution | Aug 11, 2024 | Hallucination, Image Super-Resolution | Code Available | 2 |
| SOLO: A Single Transformer for Scalable Vision-Language Modeling | Jul 8, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| UNetMamba: An Efficient UNet-Like Mamba for Semantic Segmentation of High-Resolution Remote Sensing Images | Aug 21, 2024 | Mamba, Segmentation | Code Available | 2 |
| LLaSM: Large Language and Speech Model | Aug 30, 2023 | Instruction Following, Language Modeling | Code Available | 2 |