| FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance | May 9, 2023 | | Code Available | 2 |
| ChemDFM: A Large Language Foundation Model for Chemistry | Jan 26, 2024 | Formmodel | Code Available | 2 |
| Mamba-SEUNet: Mamba UNet for Monaural Speech Enhancement | Dec 21, 2024 | Mamba | Code Available | 2 |
| ETTA: Elucidating the Design Space of Text-to-Audio Models | Dec 26, 2024 | AudioCaps, Audio captioning | Code Available | 2 |
| FastSpeech: Fast, Robust and Controllable Text-to-Speech | May 22, 2019 | Decoder, text-to-speech | Code Available | 2 |
| Pruning Filters for Efficient ConvNets | Aug 31, 2016 | Image Classification, Network Pruning | Code Available | 2 |
| DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation | Jul 4, 2023 | 3D Shape Generation, Denoising | Code Available | 2 |
| MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data | Jun 26, 2024 | Benchmarking, Math | Code Available | 2 |
| PathGen-1.6M: 1.6 Million Pathology Image-text Pairs Generation through Multi-agent Collaboration | Jun 28, 2024 | image-classification, Image Classification | Code Available | 2 |
| Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model | Nov 10, 2023 | Diversity, NeRF | Code Available | 2 |
| MaskTerial: A Foundation Model for Automated 2D Material Flake Detection | Dec 12, 2024 | Instance Segmentation, Semantic Segmentation | Code Available | 2 |
| Sensitive Data Detection with High-Throughput Neural Network Models for Financial Institutions | Dec 17, 2020 | named-entity-recognition, Named Entity Recognition | Code Available | 2 |
| LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior | Oct 28, 2024 | Video Generation, Video Reconstruction | Code Available | 2 |
| Streaming Keyword Spotting Boosted by Cross-layer Discrimination Consistency | Dec 17, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| AlignXIE: Improving Multilingual Information Extraction by Cross-Lingual Alignment | Nov 7, 2024 | Code Generation | Code Available | 2 |
| Counterfactual Phenotyping with Censored Time-to-Events | Feb 22, 2022 | counterfactual, Counterfactual Reasoning | Code Available | 2 |
| FedRLHF: A Convergence-Guaranteed Federated Framework for Privacy-Preserving and Personalized RLHF | Dec 20, 2024 | Privacy Preserving, reinforcement-learning | Code Available | 2 |
| Semantic Editing Increment Benefits Zero-Shot Composed Image Retrieval | Oct 28, 2024 | Image Retrieval, Image to text | Code Available | 2 |
| Concat-ID: Towards Universal Identity-Preserving Video Synthesis | Mar 18, 2025 | Human-Domain Subject-to-Video, Video Generation | Code Available | 2 |
| MemSeg: A semi-supervised method for image surface defect detection using differences and commonalities | May 2, 2022 | Anomaly Detection, Defect Detection | Code Available | 2 |
| Event-Based Motion Magnification | Feb 19, 2024 | Benchmarking, Motion Detection | Code Available | 2 |
| EfficientAD: Accurate Visual Anomaly Detection at Millisecond-Level Latencies | Mar 25, 2023 | Anomaly Detection, Computational Efficiency | Code Available | 2 |
| Motion Inversion for Video Customization | Mar 29, 2024 | Video Generation | Code Available | 2 |
| Patchwork++: Fast and Robust Ground Segmentation Solving Partial Under-Segmentation Using 3D Point Cloud | Jul 25, 2022 | Object Recognition, Segmentation | Code Available | 2 |
| Multi-Document Grounded Multi-Turn Synthetic Dialog Generation | Sep 17, 2024 | | Code Available | 2 |
| Panacea: Panoramic and Controllable Video Generation for Autonomous Driving | Nov 28, 2023 | Autonomous Driving, Video Generation | Code Available | 2 |
| YAKE! Keyword extraction from single documents using multiple local features | Mar 1, 2018 | Keyword Extraction, Natural Language Understanding | Code Available | 2 |
| Derivative-Free Diffusion Manifold-Constrained Gradient for Unified XAI | Nov 22, 2024 | counterfactual, Counterfactual Explanation | Code Available | 2 |
| MetaFed: Federated Learning among Federations with Cyclic Knowledge Distillation for Personalized Healthcare | Jun 17, 2022 | Federated Learning, Knowledge Distillation | Code Available | 2 |
| Prompt-CAM: A Simpler Interpretable Transformer for Fine-Grained Analysis | Jan 16, 2025 | Explainable Artificial Intelligence (XAI), Explainable Models | Code Available | 2 |
| cuSLINK: Single-linkage Agglomerative Clustering on the GPU | Jun 28, 2023 | Clustering, GPU | Code Available | 2 |
| Consistency Diffusion Bridge Models | Oct 30, 2024 | Denoising, Image-to-Image Translation | Code Available | 2 |
| A Multimodal Knowledge-enhanced Whole-slide Pathology Foundation Model | Jul 22, 2024 | Diagnostic, whole slide images | Code Available | 2 |
| Pillar R-CNN for Point Cloud 3D Object Detection | Feb 26, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| Compiler Optimization via LLM Reasoning for Efficient Model Serving | Jun 2, 2025 | Compiler Optimization, Large Language Model | Code Available | 2 |
| EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation | Mar 20, 2025 | Optical Flow Estimation, Video Frame Interpolation | Code Available | 2 |
| Garment3DGen: 3D Garment Stylization and Texture Generation | Mar 27, 2024 | Image to 3D, Texture Synthesis | Code Available | 2 |
| TrackDiffusion: Tracklet-Conditioned Video Generation via Diffusion Models | Dec 1, 2023 | Image Classification, Multi-Object Tracking | Code Available | 2 |
| iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform | Mar 4, 2022 | Speech Synthesis, text-to-speech | Code Available | 2 |
| Online Video Understanding: OVBench and VideoChat-Online | Dec 31, 2024 | Autonomous Driving, Question Answering | Code Available | 2 |
| Evaluating Large-Vocabulary Object Detectors: The Devil is in the Details | Feb 1, 2021 | Benchmarking, object-detection | Code Available | 2 |
| Task-wise Sampling Convolutions for Arbitrary-Oriented Object Detection in Aerial Images | Sep 6, 2022 | object-detection, Object Detection | Code Available | 2 |
| On the Continuity of Rotation Representations in Neural Networks | Dec 17, 2018 | | Code Available | 2 |
| PAniC-3D: Stylized Single-view 3D Reconstruction from Portraits of Anime Characters | Mar 25, 2023 | 3D Architecture, 3D Reconstruction | Code Available | 2 |
| Evaluation of Bio-Inspired Models under Different Learning Settings For Energy Efficiency in Network Traffic Prediction | Dec 23, 2024 | Privacy Preserving, Traffic Prediction | Code Available | 2 |
| Retrieving Semantics from the Deep: an RAG Solution for Gesture Synthesis | Dec 9, 2024 | Gesture Generation, RAG | Code Available | 2 |
| Chimp: Efficient Lossless Floating Point Compression for Time Series Databases | Jul 1, 2022 | Astronomy, Time Series | Code Available | 2 |
| Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions | Oct 28, 2023 | State Space Models | Code Available | 2 |
| VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design | Jul 31, 2023 | Computational Efficiency, text-to-speech | Code Available | 2 |
| BrainMVP: Multi-modal Vision Pre-training for Brain Image Analysis using Multi-parametric MRI | Oct 14, 2024 | Contrastive Learning | Code Available | 2 |