| SelfRecon: Self Reconstruction Your Digital Avatar from Monocular Video | Jan 30, 2022 | 3D Human Reconstruction, Neural Rendering | Code Available | 2 | 5 |
| LHRS-Bot-Nova: Improved Multimodal Large Language Model for Remote Sensing Vision-Language Interpretation | Nov 14, 2024 | Earth Observation, Instruction Following | Code Available | 2 | 5 |
| FedPara: Low-Rank Hadamard Product for Communication-Efficient Federated Learning | Aug 13, 2021 | Federated Learning | Code Available | 2 | 5 |
| LiteWebAgent: The Open-Source Suite for VLM-Based Web-Agent Applications | Mar 4, 2025 | Action Generation | Code Available | 2 | 5 |
| Unveiling COVID-19 from Chest X-ray with deep learning: a hurdles race with small data | Apr 11, 2020 | Small Data Image Classification, Transfer Learning | Code Available | 2 | 5 |
| DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification | Jul 4, 2024 | Descriptive, Diversity | Code Available | 2 | 5 |
| CoIR: A Comprehensive Benchmark for Code Information Retrieval Models | Jul 3, 2024 | Benchmarking, Code Search | Code Available | 2 | 5 |
| ManiSkill: Generalizable Manipulation Skill Benchmark with Large-Scale Demonstrations | Jul 30, 2021 | | Code Available | 2 | 5 |
| BioCLIP: A Vision Foundation Model for the Tree of Life | Nov 30, 2023 | | Code Available | 2 | 5 |
| VMambaMorph: a Multi-Modality Deformable Image Registration Framework based on Visual State Space Model with Cross-Scan Module | Apr 7, 2024 | Image Registration | Code Available | 2 | 5 |
| Fusing finetuned models for better pretraining | Apr 6, 2022 | | Code Available | 2 | 5 |
| Flow Matching in Latent Space | Jul 17, 2023 | Computational Efficiency, Image Generation | Code Available | 2 | 5 |
| Evaluating Explainability for Graph Neural Networks | Aug 19, 2022 | | Code Available | 2 | 5 |
| Efficient Quality Diversity Optimization of 3D Buildings through 2D Pre-optimization | Mar 28, 2023 | Diversity | Code Available | 2 | 5 |
| Certified Human Trajectory Prediction | Mar 20, 2024 | Autonomous Vehicles, Prediction | Code Available | 2 | 5 |
| No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance | Apr 4, 2024 | Benchmarking, Image Generation | Code Available | 2 | 5 |
| LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations | Oct 3, 2024 | | Code Available | 2 | 5 |
| Unveiling Deep Shadows: A Survey and Benchmark on Image and Video Shadow Detection, Removal, and Generation in the Deep Learning Era | Sep 3, 2024 | Scene Understanding, Shadow Detection | Code Available | 2 | 5 |
| MambaHSI: Spatial-Spectral Mamba for Hyperspectral Image Classification | Jan 9, 2025 | Classification, Hyperspectral Image Classification | Code Available | 2 | 5 |
| Provable Robust Watermarking for AI-Generated Text | Jun 30, 2023 | Language Modelling | Code Available | 2 | 5 |
| Large Language Models are Efficient Learners of Noise-Robust Speech Recognition | Jan 19, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 | 5 |
| LeMeViT: Efficient Vision Transformer with Learnable Meta Tokens for Remote Sensing Image Interpretation | May 16, 2024 | | Code Available | 2 | 5 |
| ToolGen: Unified Tool Retrieval and Calling via Generation | Oct 4, 2024 | Retrieval, Text Generation | Code Available | 2 | 5 |
| MoCha-Stereo: Motif Channel Attention Network for Stereo Matching | Apr 10, 2024 | Disparity Estimation, Stereo Depth Estimation | Code Available | 2 | 5 |
| Equivariant Energy-Guided SDE for Inverse Molecular Design | Sep 30, 2022 | 3D Molecule Generation, Drug Discovery | Code Available | 2 | 5 |
| LinVT: Empower Your Image-level Large Language Model to Understand Videos | Dec 6, 2024 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| BIRB: A Generalization Benchmark for Information Retrieval in Bioacoustics | Dec 12, 2023 | Information Retrieval, Representation Learning | Code Available | 2 | 5 |
| Blue noise for diffusion models | Feb 7, 2024 | Denoising | Code Available | 2 | 5 |
| Recurrent Memory Transformer | Jul 14, 2022 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| Artificial Kuramoto Oscillatory Neurons | Oct 17, 2024 | Adversarial Robustness, Object Discovery | Code Available | 2 | 5 |
| AvatarGen: A 3D Generative Model for Animatable Human Avatars | Nov 26, 2022 | Human Animation | Code Available | 2 | 5 |
| SocialJax: An Evaluation Suite for Multi-agent Reinforcement Learning in Sequential Social Dilemmas | Mar 18, 2025 | Multi-agent Reinforcement Learning, reinforcement-learning | Code Available | 2 | 5 |
| Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation | Nov 23, 2022 | Depth Estimation, Monocular Depth Estimation | Code Available | 2 | 5 |
| SoftCoT++: Test-Time Scaling with Soft Chain-of-Thought Reasoning | May 16, 2025 | Contrastive Learning | Code Available | 2 | 5 |
| Self-Supervised Multimodal Learning: A Survey | Mar 31, 2023 | Machine Translation, Self-Supervised Learning | Code Available | 2 | 5 |
| Unified Multimodal Discrete Diffusion | Mar 26, 2025 | Image Captioning, Image Generation | Code Available | 2 | 5 |
| RepairAgent: An Autonomous, LLM-Based Agent for Program Repair | Mar 25, 2024 | Language Modelling, Large Language Model | Code Available | 2 | 5 |
| RANSAC Back to SOTA: A Two-stage Consensus Filtering for Real-time 3D Registration | Oct 21, 2024 | Point Cloud Registration | Code Available | 2 | 5 |
| Accurate 3D Body Shape Regression using Metric and Semantic Attributes | Jun 14, 2022 | 3D Human Reconstruction, 3D Human Shape Estimation | Code Available | 2 | 5 |
| Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models | Feb 19, 2024 | | Code Available | 2 | 5 |
| ExpertPrompting: Instructing Large Language Models to be Distinguished Experts | May 24, 2023 | In-Context Learning, Instruction Following | Code Available | 2 | 5 |
| Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More | Feb 17, 2025 | | Code Available | 2 | 5 |
| RetroMAE v2: Duplex Masked Auto-Encoder For Pre-Training Retrieval-Oriented Language Models | Nov 16, 2022 | Dimensionality Reduction, Information Retrieval | Code Available | 2 | 5 |
| AesExpert: Towards Multi-modality Foundation Model for Image Aesthetics Perception | Apr 15, 2024 | | Code Available | 2 | 5 |
| Multimodal Analogical Reasoning over Knowledge Graphs | Oct 1, 2022 | Graph Embedding, Knowledge Graph Embedding | Code Available | 2 | 5 |
| RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations | Feb 27, 2024 | Attribute, Language Modeling | Code Available | 2 | 5 |
| SEED-Bench: Benchmarking Multimodal LLMs with Generative Comprehension | Jul 30, 2023 | Benchmarking, Multiple-choice | Code Available | 2 | 5 |
| PipeFusion: Patch-level Pipeline Parallelism for Diffusion Transformers Inference | May 23, 2024 | | Code Available | 2 | 5 |
| Occupancy-MAE: Self-supervised Pre-training Large-scale LiDAR Point Clouds with Masked Occupancy Autoencoders | Jun 20, 2022 | 3D Object Detection, 3D Semantic Segmentation | Code Available | 2 | 5 |
| ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT | Apr 27, 2020 | Document Ranking, Information Retrieval | Code Available | 2 | 5 |