| Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models | Jan 1, 2024 | Image Generation, Text to Image Generation | Code Available | 3 | 5 |
| HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation | Jul 24, 2024 | Benchmarking, Human Animation | Code Available | 3 | 5 |
| Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision | May 4, 2023 | Diversity, In-Context Learning | Code Available | 3 | 5 |
| PromptDresser: Improving the Quality and Controllability of Virtual Try-On via Generative Textual Prompt and Prompt-aware Mask | Dec 22, 2024 | In-Context Learning, Virtual Try-on | Code Available | 3 | 5 |
| Interpretable Differencing of Machine Learning Models | Jun 10, 2023 | Classification | Code Available | 3 | 5 |
| Enhancing End-to-End Autonomous Driving with Latent World Model | Jun 12, 2024 | Autonomous Driving, NavSim | Code Available | 3 | 5 |
| GNM: A General Navigation Model to Drive Any Robot | Oct 7, 2022 | | Code Available | 3 | 5 |
| Cut and Learn for Unsupervised Object Detection and Instance Segmentation | Jan 26, 2023 | Instance Segmentation, object-detection | Code Available | 3 | 5 |
| FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models | Jun 7, 2024 | Federated Learning | Code Available | 3 | 5 |
| Differentiable Voxel-based X-ray Rendering Improves Sparse-View 3D CBCT Reconstruction | Nov 28, 2024 | 3D Reconstruction, Diagnostic | Code Available | 3 | 5 |
| COCOLA: Coherence-Oriented Contrastive Learning of Musical Audio Representations | Apr 25, 2024 | Contrastive Learning, Music Generation | Code Available | 3 | 5 |
| From human experts to machines: An LLM supported approach to ontology and knowledge graph construction | Mar 13, 2024 | graph construction, Knowledge Graphs | Code Available | 3 | 5 |
| vAttention: Dynamic Memory Management for Serving LLMs without PagedAttention | May 7, 2024 | GPU, Management | Code Available | 3 | 5 |
| VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning | Mar 17, 2025 | Grounded Video Question Answering, Question Answering | Code Available | 3 | 5 |
| AP-LDM: Attentive and Progressive Latent Diffusion Model for Training-Free High-Resolution Image Generation | Oct 8, 2024 | Denoising, Image Generation | Code Available | 3 | 5 |
| Advancing Speech Language Models by Scaling Supervised Fine-Tuning with Over 60,000 Hours of Synthetic Speech Dialogue Data | Dec 2, 2024 | Language Modeling, Language Modelling | Code Available | 3 | 5 |
| Deep Learning Alternatives of the Kolmogorov Superposition Theorem | Oct 2, 2024 | Deep Learning, Kolmogorov-Arnold Networks | Code Available | 3 | 5 |
| Transolver: A Fast Transformer Solver for PDEs on General Geometries | Feb 4, 2024 | | Code Available | 3 | 5 |
| FNSPID: A Comprehensive Financial News Dataset in Time Series | Feb 9, 2024 | Financial Analysis, Time Series | Code Available | 3 | 5 |
| An Improved RaftStereo Trained with A Mixed Dataset for the Robust Vision Challenge 2022 | Oct 23, 2022 | Stereo Matching | Code Available | 3 | 5 |
| In-Context Learning for Extreme Multi-Label Classification | Jan 22, 2024 | Classification, Extreme Multi-Label Classification | Code Available | 3 | 5 |
| SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities | May 18, 2023 | Language Modeling, Language Modelling | Code Available | 3 | 5 |
| M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models | Mar 31, 2024 | Image-text Retrieval, Language Modeling | Code Available | 3 | 5 |
| ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders | Jan 2, 2023 | Object Detection, Representation Learning | Code Available | 3 | 5 |
| SoccerNet Game State Reconstruction: End-to-End Athlete Tracking and Identification on a Minimap | Apr 17, 2024 | Camera Calibration, Game State Reconstruction | Code Available | 3 | 5 |