| Self-Supervised Learning for Real-World Super-Resolution from Dual and Multiple Zoomed Observations | May 3, 2024 | Optical Flow Estimation, Reference-based Super-Resolution | Code Available | 2 |
| MFTCoder: Boosting Code LLMs with Multitask Fine-Tuning | Nov 4, 2023 | Multi-Task Learning | Code Available | 2 |
| When is Tree Search Useful for LLM Planning? It Depends on the Discriminator | Feb 16, 2024 | Mathematical Reasoning, Re-Ranking | Code Available | 2 |
| ScreenAI: A Vision-Language Model for UI and Infographics Understanding | Feb 7, 2024 | Chart Question Answering, Language Modeling | Code Available | 2 |
| Learning to Prompt for Vision-Language Models | Sep 2, 2021 | Domain Generalization, Few-shot Age Estimation | Code Available | 2 |
| EmoFace: Audio-driven Emotional 3D Face Animation | Jul 17, 2024 | 3D Face Animation | Code Available | 2 |
| OmniBench: Towards The Future of Universal Omni-Language Models | Sep 23, 2024 | Instruction Following | Code Available | 2 |
| ADATIME: A Benchmarking Suite for Domain Adaptation on Time Series Data | Mar 15, 2022 | Benchmarking, Domain Adaptation | Code Available | 2 |
| ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction | Jul 9, 2024 | Image Generation, Text to Image Generation | Code Available | 2 |
| InteractRank: Personalized Web-Scale Search Pre-Ranking with Cross Interaction Features | Apr 9, 2025 | Computational Efficiency | Code Available | 2 |
| Specializing Smaller Language Models towards Multi-Step Reasoning | Jan 30, 2023 | Math, Model Selection | Code Available | 2 |
| Stitchable Neural Networks | Feb 13, 2023 | Image Classification | Code Available | 2 |
| Respecting causality is all you need for training physics-informed neural networks | Mar 14, 2022 | All, Attribute | Code Available | 2 |
| Towards Interpretable Mental Health Analysis with Large Language Models | Apr 6, 2023 | Causal Emotion Entailment, Emotion Recognition | Code Available | 2 |
| Cross-Modality Safety Alignment | Jun 21, 2024 | Safety Alignment | Code Available | 2 |
| FiLo: Zero-Shot Anomaly Detection by Fine-Grained Description and High-Quality Localization | Apr 21, 2024 | Anomaly Detection, Position | Code Available | 2 |
| HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference | Apr 8, 2025 | CPU, GPU | Code Available | 2 |
| Target conversation extraction: Source separation using turn-taking dynamics | Jul 15, 2024 | | Code Available | 2 |
| Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts | Mar 14, 2024 | Denoising, Mixture-of-Experts | Code Available | 2 |
| GPT-InvestAR: Enhancing Stock Investment Strategies through Annual Report Analysis with Large Language Models | Sep 6, 2023 | | Code Available | 2 |
| BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions | Aug 19, 2023 | MME, Optical Character Recognition (OCR) | Code Available | 2 |
| A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future | Jul 18, 2023 | Knowledge Distillation, object-detection | Code Available | 2 |
| normflows: A PyTorch Package for Normalizing Flows | Jan 26, 2023 | Image Generation, Variational Inference | Code Available | 2 |
| WidthFormer: Toward Efficient Transformer-based BEV View Transformation | Jan 8, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| Evidential Detection and Tracking Collaboration: New Problem, Benchmark and Algorithm for Robust Anti-UAV System | Jun 27, 2023 | | Code Available | 2 |
| Deep Incubation: Training Large Models by Divide-and-Conquering | Dec 8, 2022 | Image Segmentation, object-detection | Code Available | 2 |
| Med-R1: Reinforcement Learning for Generalizable Medical Reasoning in Vision-Language Models | Mar 18, 2025 | Anatomy, Attribute | Code Available | 2 |
| Seal-Tools: Self-Instruct Tool Learning Dataset for Agent Tuning and Detailed Benchmark | May 14, 2024 | | Code Available | 2 |
| MARLIN: Masked Autoencoder for facial video Representation LearnINg | Nov 12, 2022 | Action Classification, Attribute | Code Available | 2 |
| GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization | Sep 27, 2023 | Contrastive Learning, geo-localization | Code Available | 2 |
| Large Language Models for Anomaly and Out-of-Distribution Detection: A Survey | Sep 3, 2024 | Out-of-Distribution Detection | Code Available | 2 |
| StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams | Jun 10, 2025 | 3DGS, 3D Reconstruction | Code Available | 2 |
| eVAE: Evolutionary Variational Autoencoder | Jan 1, 2023 | Disentanglement, Image Generation | Code Available | 2 |
| Long-term Traffic Simulation with Interleaved Autoregressive Motion and Scenario Generation | Jun 20, 2025 | Scene Generation | Code Available | 2 |
| EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision | Nov 3, 2023 | Optical Flow Estimation, Semantic Segmentation | Code Available | 2 |
| Omni-Video: Democratizing Unified Video Understanding and Generation | Jul 8, 2025 | Video Generation, Video Understanding | Code Available | 2 |
| From Perfect to Noisy World Simulation: Customizable Embodied Multi-modal Perturbations for SLAM Robustness Benchmarking | Jun 24, 2024 | Benchmarking, NeRF | Code Available | 2 |
| Unwrapping The Black Box of Deep ReLU Networks: Interpretability, Diagnostics, and Simplification | Nov 8, 2020 | | Code Available | 2 |
| VRL3: A Data-Driven Framework for Visual Deep Reinforcement Learning | Feb 17, 2022 | Deep Reinforcement Learning, Offline RL | Code Available | 2 |
| A Data-scalable Transformer for Medical Image Segmentation: Architecture, Model Efficiency, and Benchmark | Feb 28, 2022 | Image Segmentation, Inductive Bias | Code Available | 2 |
| Neural interval-censored survival regression with feature selection | Jun 14, 2022 | feature selection, regression | Code Available | 2 |
| DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation | May 12, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models | Nov 28, 2022 | Denoising, Language Modeling | Code Available | 2 |
| Executing your Commands via Motion Diffusion in Latent Space | Dec 8, 2022 | Motion Generation, Motion Synthesis | Code Available | 2 |
| NMS Strikes Back | Dec 12, 2022 | Attribute, object-detection | Code Available | 2 |
| DiffFace: Diffusion-based Face Swapping with Facial Guidance | Dec 27, 2022 | Face Swapping | Code Available | 2 |
| Leveraging Reasoning Model Answers to Enhance Non-Reasoning Model Capability | Apr 13, 2025 | model | Code Available | 2 |
| Efficient Speech Enhancement via Embeddings from Pre-trained Generative Audioencoders | Jun 13, 2025 | Speech Enhancement | Code Available | 2 |
| Watermarking Autoregressive Image Generation | Jun 19, 2025 | Image Generation, Language Modeling | Code Available | 2 |
| Investigating Affective Use and Emotional Well-being on ChatGPT | Apr 4, 2025 | Privacy Preserving | Code Available | 2 |