| TrackDiffusion: Tracklet-Conditioned Video Generation via Diffusion Models | Dec 1, 2023 | Image Classification, Multi-Object Tracking | Code Available | 2 | 5 |
| iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform | Mar 4, 2022 | Speech Synthesis, text-to-speech | Code Available | 2 | 5 |
| Online Video Understanding: OVBench and VideoChat-Online | Dec 31, 2024 | Autonomous Driving, Question Answering | Code Available | 2 | 5 |
| Evaluating Large-Vocabulary Object Detectors: The Devil is in the Details | Feb 1, 2021 | Benchmarking, object-detection | Code Available | 2 | 5 |
| Task-wise Sampling Convolutions for Arbitrary-Oriented Object Detection in Aerial Images | Sep 6, 2022 | object-detection, Object Detection | Code Available | 2 | 5 |
| On the Continuity of Rotation Representations in Neural Networks | Dec 17, 2018 | | Code Available | 2 | 5 |
| PAniC-3D: Stylized Single-view 3D Reconstruction from Portraits of Anime Characters | Mar 25, 2023 | 3D Architecture, 3D Reconstruction | Code Available | 2 | 5 |
| Evaluation of Bio-Inspired Models under Different Learning Settings For Energy Efficiency in Network Traffic Prediction | Dec 23, 2024 | Privacy Preserving, Traffic Prediction | Code Available | 2 | 5 |
| Retrieving Semantics from the Deep: an RAG Solution for Gesture Synthesis | Dec 9, 2024 | Gesture Generation, RAG | Code Available | 2 | 5 |
| Chimp: Efficient Lossless Floating Point Compression for Time Series Databases | Jul 1, 2022 | Astronomy, Time Series | Code Available | 2 | 5 |
| Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions | Oct 28, 2023 | State Space Models | Code Available | 2 | 5 |
| VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design | Jul 31, 2023 | Computational Efficiency, text-to-speech | Code Available | 2 | 5 |
| BrainMVP: Multi-modal Vision Pre-training for Brain Image Analysis using Multi-parametric MRI | Oct 14, 2024 | Contrastive Learning | Code Available | 2 | 5 |
| DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory | Aug 16, 2023 | Trajectory Modeling, Video Generation | Code Available | 2 | 5 |
| MOMENT: A Family of Open Time-series Foundation Models | Feb 6, 2024 | Time Series, Time Series Analysis | Code Available | 2 | 5 |
| OpenShape: Scaling Up 3D Shape Representation Towards Open-World Understanding | May 18, 2023 | 3D Classification, 3D Shape Representation | Code Available | 2 | 5 |
| ConFIG: Towards Conflict-free Training of Physics Informed Neural Networks | Aug 20, 2024 | | Code Available | 2 | 5 |
| Boosting Global-Local Feature Matching via Anomaly Synthesis for Multi-Class Point Cloud Anomaly Detection | Feb 21, 2025 | 3D Anomaly Detection, 3D Anomaly Detection and Segmentation | Code Available | 2 | 5 |
| CODA: Repurposing Continuous VAEs for Discrete Tokenization | Mar 22, 2025 | | Code Available | 2 | 5 |
| SURGEON: Memory-Adaptive Fully Test-Time Adaptation via Dynamic Activation Sparsity | Mar 26, 2025 | Test-time Adaptation | Code Available | 2 | 5 |
| Optimisation & Generalisation in Networks of Neurons | Oct 18, 2022 | | Code Available | 2 | 5 |
| Leveraging medical Twitter to build a visual–language foundation model for pathology AI | Apr 1, 2023 | Transfer Learning | Code Available | 2 | 5 |
| VideoAgent: Long-form Video Understanding with Large Language Model as Agent | Mar 15, 2024 | EgoSchema, Form | Code Available | 2 | 5 |
| Instant Volumetric Head Avatars | Nov 22, 2022 | Face Model, GPU | Code Available | 2 | 5 |
| Efficient and Effective SPARQL Autocompletion on Very Large Knowledge Graphs | Oct 17, 2022 | Knowledge Graphs | Code Available | 2 | 5 |
| Proximal Policy Optimization Algorithms | Jul 20, 2017 | Continuous Control, Dota 2 | Code Available | 2 | 5 |
| CG-SLAM: Efficient Dense RGB-D SLAM in a Consistent Uncertainty-aware 3D Gaussian Field | Mar 24, 2024 | NeRF, Novel View Synthesis | Code Available | 2 | 5 |
| Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy | Mar 25, 2025 | Denoising, Robot Manipulation | Code Available | 2 | 5 |
| Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis | Apr 26, 2023 | Speech Synthesis, text-to-speech | Code Available | 2 | 5 |
| Directly Fine-Tuning Diffusion Models on Differentiable Rewards | Sep 29, 2023 | | Code Available | 2 | 5 |
| A Dataset and Explorer for 3D Signed Distance Functions | Apr 27, 2022 | GPU | Code Available | 2 | 5 |
| U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation | Jan 9, 2024 | Cell Segmentation, Image Segmentation | Code Available | 2 | 5 |
| Semantic Photo Manipulation with a Generative Image Prior | May 15, 2020 | | Code Available | 2 | 5 |
| VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition | Sep 9, 2020 | CPU, speech-recognition | Code Available | 2 | 5 |
| Part-aware Shape Generation with Latent 3D Diffusion of Neural Voxel Fields | May 2, 2024 | Decoder | Code Available | 2 | 5 |
| FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence | Jan 21, 2020 | Image Classification, Pseudo Label | Code Available | 2 | 5 |
| The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks | Feb 12, 2025 | | Code Available | 2 | 5 |
| Freeing Hybrid Distributed AI Training Configuration | Aug 20, 2021 | | Code Available | 2 | 5 |
| Omnipose: a high-precision, morphology-independent solution for bacterial cell segmentation | Nov 5, 2021 | Cell Segmentation, Vocal Bursts Intensity Prediction | Code Available | 2 | 5 |
| Towards A Generalizable Pathology Foundation Model via Unified Knowledge Distillation | Jul 26, 2024 | Knowledge Distillation, Question Answering | Code Available | 2 | 5 |
| CyberMetric: A Benchmark Dataset based on Retrieval-Augmented Generation for Evaluating LLMs in Cybersecurity Knowledge | Feb 12, 2024 | General Knowledge, Multiple-choice | Code Available | 2 | 5 |
| Enhancing Video Super-Resolution via Implicit Resampling-based Alignment | Apr 29, 2023 | Super-Resolution, Video Super-Resolution | Code Available | 2 | 5 |
| Algorithm Evolution Using Large Language Model | Nov 26, 2023 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| Efficient Neural Network Analysis with Sum-of-Infeasibilities | Mar 19, 2022 | Adversarial Attack, Efficient Neural Network | Code Available | 2 | 5 |
| SyncTweedies: A General Generative Framework Based on Synchronized Diffusions | Mar 21, 2024 | Denoising | Code Available | 2 | 5 |
| Visual Programming: Compositional visual reasoning without training | Nov 18, 2022 | In-Context Learning, Question Answering | Code Available | 2 | 5 |
| CC-3DT: Panoramic 3D Object Tracking via Cross-Camera Fusion | Dec 2, 2022 | 3D Object Tracking, Autonomous Vehicles | Code Available | 2 | 5 |
| Interactive Differentiable Simulation | May 26, 2019 | Model Predictive Control, parameter estimation | Code Available | 2 | 5 |
| Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour | Jun 8, 2017 | Stochastic Optimization | Code Available | 2 | 5 |
| Single-View View Synthesis in the Wild with Learned Adaptive Multiplane Images | May 24, 2022 | 3D geometry, Depth Estimation | Code Available | 2 | 5 |