| RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation | Aug 5, 2024 | | Code Available | 4 |
| CraftsMan3D: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner | May 23, 2024 | 3D Generation, 3D geometry | Code Available | 4 |
| UniTok: A Unified Tokenizer for Visual Generation and Understanding | Feb 27, 2025 | Quantization | Code Available | 4 |
| LangCell: Language-Cell Pre-training for Cell Identity Understanding | May 9, 2024 | | Code Available | 4 |
| RAPIDFlow: Recurrent Adaptable Pyramids with Iterative Decoding for Efficient Optical Flow Estimation | May 1, 2024 | Optical Flow Estimation | Code Available | 4 |
| Kwai Keye-VL Technical Report | Jul 2, 2025 | Instruction Following, Reinforcement Learning (RL) | Code Available | 4 |
| Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs | Jun 14, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| Towards One-shot Federated Learning: Advances, Challenges, and Future Directions | May 5, 2025 | Federated Learning, Survey | Code Available | 4 |
| s3: You Don't Need That Much Data to Train a Search Agent via RL | May 20, 2025 | RAG, Reinforcement Learning (RL) | Code Available | 4 |
| lmgame-Bench: How Good are LLMs at Playing Games? | May 21, 2025 | Language Modeling, Language Modelling | Code Available | 4 |
| OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation | May 26, 2025 | Human-Domain Subject-to-Video, Open-Domain Subject-to-Video | Code Available | 4 |
| DemoFusion: Democratising High-Resolution Image Generation With No $ | Nov 24, 2023 | Image Generation | Code Available | 4 |
| Look Once to Hear: Target Speech Hearing with Noisy Examples | May 10, 2024 | CPU, Speech Extraction | Code Available | 4 |
| The All-Seeing Project V2: Towards General Relation Comprehension of the Open World | Feb 29, 2024 | All, Hallucination | Code Available | 4 |
| APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay | Apr 4, 2025 | | Code Available | 4 |
| Eureka: Human-Level Reward Design via Coding Large Language Models | Oct 19, 2023 | Decision Making, In-Context Learning | Code Available | 4 |
| High Fidelity Neural Audio Compression | Oct 24, 2022 | Audio Compression, Audio Signal Processing | Code Available | 4 |
| MIGC++: Advanced Multi-Instance Generation Controller for Image Synthesis | Jul 2, 2024 | Attribute, Image Generation | Code Available | 4 |
| Qiskit Machine Learning: an open-source library for quantum machine learning tasks at scale on quantum hardware and classical simulators | May 23, 2025 | Quantum Machine Learning | Code Available | 4 |
| StudioGAN: A Taxonomy and Benchmark of GANs for Image Synthesis | Jun 19, 2022 | Generative Adversarial Network, Image Generation | Code Available | 4 |
| CoTracker: It is Better to Track Together | Jul 14, 2023 | GPU, motion prediction | Code Available | 4 |
| PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts | Feb 2, 2022 | | Code Available | 4 |
| Leveraging Speculative Sampling and KV-Cache Optimizations Together for Generative AI using OpenVINO | Nov 8, 2023 | Quantization, Text Generation | Code Available | 4 |
| Context-Aware Drift Detection | Mar 16, 2022 | Drift Detection | Code Available | 4 |
| A Lightweight Instrument-Agnostic Model for Polyphonic Note Transcription and Multipitch Estimation | Mar 18, 2022 | Music Transcription | Code Available | 4 |
| Detectron2 Object Detection & Manipulating Images using Cartoonization | Aug 1, 2021 | Autonomous Vehicles, Data Visualization | Code Available | 4 |
| GCoNet+: A Stronger Group Collaborative Co-Salient Object Detector | May 30, 2022 | Co-Salient Object Detection, Object | Code Available | 4 |
| RLlib: Abstractions for Distributed Reinforcement Learning | Dec 26, 2017 | reinforcement-learning, Reinforcement Learning | Code Available | 4 |
| Vision + Language Applications: A Survey | May 24, 2023 | Image Generation, Survey | Code Available | 4 |
| A Survey on Large Language Model-Based Game Agents | Apr 2, 2024 | Decision Making, Language Modeling | Code Available | 4 |
| Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization | Aug 21, 2022 | Abstractive Text Summarization, Decoder | Code Available | 4 |
| R^3LIVE++: A Robust, Real-time, Radiance reconstruction package with a tightly-coupled LiDAR-Inertial-Visual state Estimator | Sep 8, 2022 | Self-Driving Cars, Simultaneous Localization and Mapping | Code Available | 4 |
| Recent Advances in RecBole: Extensions with more Practical Considerations | Nov 28, 2022 | | Code Available | 4 |
| Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation | Dec 22, 2022 | Style Transfer, Text-to-Video Generation | Code Available | 4 |
| Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models | Sep 25, 2024 | Image Captioning | Code Available | 4 |
| AudioLDM: Text-to-Audio Generation with Latent Diffusion Models | Jan 29, 2023 | AudioCaps, Audio Generation | Code Available | 4 |
| LORE: Lagrangian-Optimized Robust Embeddings for Visual Encoders | May 24, 2025 | Adversarial Robustness, Out-of-Distribution Generalization | Code Available | 4 |
| Transcoders Beat Sparse Autoencoders for Interpretability | Jan 31, 2025 | | Code Available | 4 |
| Generalized Neighborhood Attention: Multi-dimensional Sparse Attention at the Speed of Light | Apr 23, 2025 | | Code Available | 4 |
| Memory-aided Contrastive Consensus Learning for Co-salient Object Detection | Feb 28, 2023 | Co-Salient Object Detection, object-detection | Code Available | 4 |
| ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge | Mar 24, 2023 | Information Retrieval, Language Modeling | Code Available | 4 |
| A Survey on Large Language Models for Recommendation | May 31, 2023 | Recommendation Systems | Code Available | 4 |
| Segment Anything in Medical Images | Apr 24, 2023 | Diagnostic, Image Segmentation | Code Available | 4 |
| mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality | Apr 27, 2023 | Visual Question Answering (VQA), Zero-Shot Video Question Answer | Code Available | 4 |
| The Ideal Continual Learner: An Agent That Never Forgets | Apr 29, 2023 | Continual Learning, Generalization Bounds | Code Available | 4 |
| OK-Robot: What Really Matters in Integrating Open-Knowledge Models for Robotics | Jan 22, 2024 | object-detection, Object Detection | Code Available | 4 |
| The Segment Anything Model (SAM) for Remote Sensing Applications: From Zero to One Shot | Jun 29, 2023 | Image Segmentation, Semantic Segmentation | Code Available | 4 |
| Turning Whisper into Real-Time Transcription System | Jul 27, 2023 | speech-recognition, Speech Recognition | Code Available | 4 |
| EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models | Mar 18, 2024 | | Code Available | 4 |
| Neural general circulation models optimized to predict satellite-based precipitation observations | Dec 16, 2024 | | Code Available | 4 |