| DepthMaster: Taming Diffusion Models for Monocular Depth Estimation | Jan 5, 2025 | Denoising, Depth Estimation | Code Available | 2 |
| Efficient Autoregressive Audio Modeling via Next-Scale Prediction | Aug 16, 2024 | Audio Generation, FAD | Code Available | 2 |
| Accelerating DETR Convergence via Semantic-Aligned Matching | Mar 14, 2022 | Object, object-detection | Code Available | 2 |
| SimKGC: Simple Contrastive Knowledge Graph Completion with Pre-trained Language Models | Mar 4, 2022 | Contrastive Learning, Graph Embedding | Code Available | 2 |
| Make-A-Shape: a Ten-Million-scale 3D Shape Model | Jan 20, 2024 | | Code Available | 2 |
| On Learning Multi-Modal Forgery Representation for Diffusion Generated Video Detection | Oct 31, 2024 | Video Forensics | Code Available | 2 |
| The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG) | Feb 23, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Adversarial Detection and Correction by Matching Prediction Distributions | Feb 21, 2020 | Prediction | Code Available | 2 |
| ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation | Jul 2, 2024 | Prediction, Text to 3D | Code Available | 2 |
| ReQFlow: Rectified Quaternion Flow for Efficient and High-Quality Protein Backbone Generation | Feb 20, 2025 | 3D Molecule Generation, Protein Design | Code Available | 2 |
| Selective-Stereo: Adaptive Frequency Information Selection for Stereo Matching | Mar 1, 2024 | Stereo Matching | Code Available | 2 |
| Zero Shot Health Trajectory Prediction Using Transformer | Jul 30, 2024 | ICU Admission, ICU Mortality | Code Available | 2 |
| Helix-mRNA: A Hybrid Foundation Model For Full Sequence mRNA Therapeutics | Feb 19, 2025 | | Code Available | 2 |
| X-Drive: Cross-modality consistent multi-sensor data synthesis for driving scenarios | Nov 2, 2024 | Denoising | Code Available | 2 |
| MHG-GNN: Combination of Molecular Hypergraph Grammar with Graph Neural Network | Sep 28, 2023 | Graph Neural Network, Prediction | Code Available | 2 |
| DAVE -- A Detect-and-Verify Paradigm for Low-Shot Counting | Apr 25, 2024 | Exemplar-Free Counting, Few-shot Object Counting and Detection | Code Available | 2 |
| Learning-Based Defect Recognitions for Autonomous UAV Inspections | Feb 13, 2023 | Crack Segmentation, Segmentation | Code Available | 2 |
| InstantSwap: Fast Customized Concept Swapping across Sharp Shape Differences | Dec 2, 2024 | | Code Available | 2 |
| LaRE^2: Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection | Mar 26, 2024 | Image Generation | Code Available | 2 |
| Deep Learning-Based Point Cloud Registration: A Comprehensive Survey and Taxonomy | Apr 22, 2024 | Autonomous Driving, Deep Learning | Code Available | 2 |
| Geometry Aware Operator Transformer as an Efficient and Accurate Neural Surrogate for PDEs on Arbitrary Domains | May 24, 2025 | Computational Efficiency, Operator learning | Code Available | 2 |
| Poison-splat: Computation Cost Attack on 3D Gaussian Splatting | Oct 10, 2024 | 3DGS | Code Available | 2 |
| Enhancing Taiwanese Hokkien Dual Translation by Exploring and Standardizing of Four Writing Systems | Mar 18, 2024 | Machine Translation, Translation | Code Available | 2 |
| A Survey on Diffusion Models for Recommender Systems | Sep 8, 2024 | Data Augmentation, Recommendation Systems | Code Available | 2 |
| Revisiting End-to-End Learning with Slide-level Supervision in Computational Pathology | Jun 3, 2025 | Multiple Instance Learning, Prognosis | Code Available | 2 |
| CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations | Apr 10, 2024 | Dialogue Generation, text-to-speech | Code Available | 2 |
| Talk3D: High-Fidelity Talking Portrait Synthesis via Personalized 3D Generative Prior | Mar 29, 2024 | NeRF | Code Available | 2 |
| Panda: A pretrained forecast model for universal representation of chaotic dynamics | May 19, 2025 | Time Series | Code Available | 2 |
| Embedded FPGA Developments in 130nm and 28nm CMOS for Machine Learning in Particle Detector Readout | Apr 26, 2024 | | Code Available | 2 |
| InPars-v2: Large Language Models as Efficient Dataset Generators for Information Retrieval | Jan 4, 2023 | Information Retrieval, Retrieval | Code Available | 2 |
| Point Transformer V2: Grouped Vector Attention and Partition-based Pooling | Oct 11, 2022 | 3D Point Cloud Classification, 3D Semantic Segmentation | Code Available | 2 |
| Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks | Jan 30, 2024 | | Code Available | 2 |
| Generalized Contrastive Learning for Multi-Modal Retrieval and Ranking | Apr 12, 2024 | Contrastive Learning, Retrieval | Code Available | 2 |
| Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4 | Dec 26, 2023 | All | Code Available | 2 |
| Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion Models | Jun 1, 2023 | Image Generation, Story Visualization | Code Available | 2 |
| LLM-RG4: Flexible and Factual Radiology Report Generation across Diverse Input Contexts | Dec 16, 2024 | General Knowledge, Instruction Following | Code Available | 2 |
| Deconstructing equivariant representations in molecular systems | Oct 10, 2024 | Property Prediction | Code Available | 2 |
| GPT Understands, Too | Mar 18, 2021 | Knowledge Probing, Language Modeling | Code Available | 2 |
| TransTab: Learning Transferable Tabular Transformers Across Tables | May 19, 2022 | Incremental Learning, Transfer Learning | Code Available | 2 |
| VMRNN: Integrating Vision Mamba and LSTM for Efficient and Accurate Spatiotemporal Forecasting | Mar 25, 2024 | Mamba | Code Available | 2 |
| Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models | Jun 5, 2024 | Few-Shot Learning, Language Modeling | Code Available | 2 |
| MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models | Oct 15, 2024 | | Code Available | 2 |
| GPD-1: Generative Pre-training for Driving | Dec 11, 2024 | Autonomous Driving, Decision Making | Code Available | 2 |
| Attacking and Defending Machine Learning Applications of Public Cloud | Jul 27, 2020 | Adversarial Attack, BIG-bench Machine Learning | Code Available | 2 |
| ClinicalGPT-R1: Pushing reasoning capability of generalist disease diagnosis with large language model | Apr 13, 2025 | Diagnostic, Language Modeling | Code Available | 2 |
| DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model | Oct 22, 2024 | Decoder, Instance Segmentation | Code Available | 2 |
| PyKEEN 1.0: A Python Library for Training and Evaluating Knowledge Graph Embeddings | Jul 28, 2020 | Graph Embedding, Knowledge Graph Embedding | Code Available | 2 |
| Dense Connector for MLLMs | May 22, 2024 | Video Understanding | Code Available | 2 |
| Functional-Group-Based Diffusion for Pocket-Specific Molecule Generation and Elaboration | May 30, 2023 | Drug Design | Code Available | 2 |
| Rethinking Bradley-Terry Models in Preference-Based Reward Modeling: Foundations, Theory, and Alternatives | Nov 7, 2024 | Large Language Model | Code Available | 2 |