| DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation | Mar 24, 2025 | 3D Semantic Segmentation, LIDAR Semantic Segmentation | Code Available | 2 |
| Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning | May 31, 2023 | Decision Making, General Knowledge | Code Available | 2 |
| LLM-PySC2: Starcraft II learning environment for Large Language Models | Nov 8, 2024 | Decision Making, Language Modelling | Code Available | 2 |
| Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment | Dec 26, 2024 | | Code Available | 2 |
| Emotion-driven Piano Music Generation via Two-stage Disentanglement and Functional Representation | Jul 30, 2024 | Disentanglement, Music Generation | Code Available | 2 |
| DL3DV-10K: A Large-Scale Scene Dataset for Deep Learning-based 3D Vision | Dec 26, 2023 | Deep Learning, NeRF | Code Available | 2 |
| Gaussian Shell Maps for Efficient 3D Human Generation | Nov 29, 2023 | | Code Available | 2 |
| Holodeck: Language Guided Generation of 3D Embodied AI Environments | Dec 14, 2023 | Common Sense Reasoning, Language Modelling | Code Available | 2 |
| CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts | May 9, 2024 | Image Captioning, Instruction Following | Code Available | 2 |
| Parameter-Efficient Fine-Tuning for Foundation Models | Jan 23, 2025 | parameter-efficient fine-tuning, Survey | Code Available | 2 |
| LLM4Ranking: An Easy-to-use Framework of Utilizing Large Language Models for Document Reranking | Apr 10, 2025 | Reranking, Retrieval-augmented Generation | Code Available | 2 |
| End-to-end Learnable Clustering for Intent Learning in Recommendation | Jan 11, 2024 | Clustering, Contrastive Learning | Code Available | 2 |
| Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding | Nov 28, 2023 | Hallucination, Object | Code Available | 2 |
| Self-Supervised Learning of Time Series Representation via Diffusion Process and Imputation-Interpolation-Forecasting Mask | May 9, 2024 | Anomaly Detection, Imputation | Code Available | 2 |
| Process Reward Models That Think | Apr 23, 2025 | Math | Code Available | 2 |
| Demonstration of Robust and Efficient Quantum Property Learning with Shallow Shadows | Feb 27, 2024 | Bayesian Inference | Code Available | 2 |
| Efficient, Multimodal, and Derivative-Free Bayesian Inference With Fisher-Rao Gradient Flows | Jun 25, 2024 | Bayesian Inference | Code Available | 2 |
| Comprehending and Ordering Semantics for Image Captioning | Jun 14, 2022 | Cross-Modal Retrieval, Image Captioning | Code Available | 2 |
| Guide to k-mer approaches for genomics across the tree of life | Apr 1, 2024 | Diversity | Code Available | 2 |
| RNA-FrameFlow: Flow Matching for de novo 3D RNA Backbone Design | Jun 19, 2024 | Diversity | Code Available | 2 |
| DINO-Foresight: Looking into the Future with DINO | Dec 16, 2024 | Autonomous Driving, Scene Understanding | Code Available | 2 |
| Healthsheet: Development of a Transparency Artifact for Health Datasets | Feb 26, 2022 | Diagnostic | Code Available | 2 |
| Feature Fusion Based on Mutual-Cross-Attention Mechanism for EEG Emotion Recognition | Jun 20, 2024 | Diagnostic, EEG | Code Available | 2 |
| The Equalization Losses: Gradient-Driven Training for Long-tailed Object Recognition | Oct 11, 2022 | image-classification, Image Classification | Code Available | 2 |
| Deep Bidirectional Language-Knowledge Graph Pretraining | Oct 17, 2022 | Common Sense Reasoning, Knowledge Graphs | Code Available | 2 |
| Model-Based Imitation Learning for Urban Driving | Oct 14, 2022 | 3D geometry, Autonomous Driving | Code Available | 2 |
| EEG2Rep: Enhancing Self-supervised EEG Representation Through Informative Masked Inputs | Feb 17, 2024 | EEG, EEG Signal Classification | Code Available | 2 |
| Foundations and Recent Trends in Multimodal Mobile Agents: A Survey | Nov 4, 2024 | multimodal interaction, Survey | Code Available | 2 |
| Training Deep AutoEncoders for Collaborative Filtering | Aug 5, 2017 | Collaborative Filtering, Recommendation Systems | Code Available | 2 |
| Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation | Nov 22, 2022 | Image Generation, Image-to-Image Translation | Code Available | 2 |
| S^2IP-LLM: Semantic Space Informed Prompt Learning with LLM for Time Series Forecasting | Mar 9, 2024 | Prompt Learning, Time Series | Code Available | 2 |
| Mind the Domain Gap: a Systematic Analysis on Bioacoustic Sound Event Detection | Mar 27, 2024 | Data Augmentation, Domain Adaptation | Code Available | 2 |
| T-Mamba: Frequency-Enhanced Gated Long-Range Dependency for Tooth 3D CBCT Segmentation | Apr 1, 2024 | Image Segmentation, Mamba | Code Available | 2 |
| StyleTalk: One-shot Talking Head Generation with Controllable Speaking Styles | Jan 3, 2023 | Decoder, Face Generation | Code Available | 2 |
| When Spiking neural networks meet temporal attention image decoding and adaptive spiking neuron | Jun 5, 2024 | | Code Available | 2 |
| Benchmarking Benchmark Leakage in Large Language Models | Apr 29, 2024 | Benchmarking, Mathematical Reasoning | Code Available | 2 |
| K-Radar: 4D Radar Object Detection for Autonomous Driving in Various Weather Conditions | Jun 16, 2022 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| Kinetics: Rethinking Test-Time Scaling Laws | Jun 5, 2025 | | Code Available | 2 |
| Spatial-Semantic Collaborative Cropping for User Generated Content | Jan 16, 2024 | Image Cropping | Code Available | 2 |
| Automatic and Universal Prompt Injection Attacks against Large Language Models | Mar 7, 2024 | | Code Available | 2 |
| The GigaMIDI Dataset with Features for Expressive Music Performance Detection | Feb 24, 2025 | Information Retrieval, Music Information Retrieval | Code Available | 2 |
| Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration | Feb 17, 2025 | | Code Available | 2 |
| ITINERA: Integrating Spatial Optimization with Large Language Models for Open-domain Urban Itinerary Planning | Feb 11, 2024 | LLM real-life tasks, Open-Domain Question Answering | Code Available | 2 |
| OccFormer: Dual-path Transformer for Vision-based 3D Semantic Occupancy Prediction | Apr 11, 2023 | 3D Semantic Occupancy Prediction, 3D Semantic Scene Completion | Code Available | 2 |
| Atlas: Few-shot Learning with Retrieval Augmented Language Models | Aug 5, 2022 | Fact Checking, Few-Shot Learning | Code Available | 2 |
| PokerKit: A Comprehensive Python Library for Fine-Grained Multi-Variant Poker Game Simulations | Aug 8, 2023 | | Code Available | 2 |
| BWT construction and search at the terabase scale | Sep 1, 2024 | Data Compression | Code Available | 2 |
| DyFo: A Training-Free Dynamic Focus Visual Search for Enhancing LMMs in Fine-Grained Visual Understanding | Apr 21, 2025 | Hallucination | Code Available | 2 |
| TorchGeo: Deep Learning With Geospatial Data | Nov 17, 2021 | Deep Learning, Transfer Learning | Code Available | 2 |
| Transductive Active Learning: Theory and Applications | Feb 13, 2024 | Active Learning, Bayesian Optimization | Code Available | 2 |