| Evaluation Report on MCP Servers | Apr 15, 2025 | Large Language Model | Code Available | 3 | 5 |
| ChartGalaxy: A Dataset for Infographic Chart Understanding and Generation | May 24, 2025 | Benchmarking, Chart Understanding | Code Available | 3 | 5 |
| UniMLVG: Unified Framework for Multi-view Long Video Generation with Comprehensive Control Capabilities for Autonomous Driving | Dec 6, 2024 | Autonomous Driving, Diversity | Code Available | 3 | 5 |
| RadioDiff: An Effective Generative Diffusion Model for Sampling-Free Dynamic Radio Map Construction | Aug 16, 2024 | | Code Available | 3 | 5 |
| Privacy-Preserving Tree-Based Inference with TFHE | Feb 13, 2023 | Privacy Preserving | Code Available | 3 | 5 |
| Retrieval Head Mechanistically Explains Long-Context Factuality | Apr 24, 2024 | Continual Pretraining, Hallucination | Code Available | 3 | 5 |
| Addressing Representation Collapse in Vector Quantized Models with One Linear Layer | Nov 4, 2024 | Quantization, Representation Learning | Code Available | 3 | 5 |
| Simple and Fast Distillation of Diffusion Models | Sep 29, 2024 | GPU, Image Generation | Code Available | 3 | 5 |
| The Mamba in the Llama: Distilling and Accelerating Hybrid Models | Aug 27, 2024 | GPU, Language Modeling | Code Available | 3 | 5 |
| Gaussian-SLAM: Photo-realistic Dense SLAM with Gaussian Splatting | Dec 6, 2023 | Simultaneous Localization and Mapping | Code Available | 3 | 5 |
| UNETR: Transformers for 3D Medical Image Segmentation | Mar 18, 2021 | 3D Medical Imaging Segmentation, Decoder | Code Available | 3 | 5 |
| A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models | Jun 20, 2024 | Video Editing | Code Available | 3 | 5 |
| Nimbus: Secure and Efficient Two-Party Inference for Transformers | Nov 24, 2024 | | Code Available | 3 | 5 |
| uniGradICON: A Foundation Model for Medical Image Registration | Mar 9, 2024 | Image Registration, Medical Image Registration | Code Available | 3 | 5 |
| Neural Speech Synthesis on a Shoestring: Improving the Efficiency of LPCNet | Feb 22, 2022 | Speech Synthesis | Code Available | 3 | 5 |
| ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions | Mar 12, 2024 | Prediction | Code Available | 3 | 5 |
| Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach | May 24, 2024 | Clustering, Self-Supervised Learning | Code Available | 3 | 5 |
| Taiwan LLM: Bridging the Linguistic Divide with a Culturally Aligned Language Model | Nov 29, 2023 | Diversity, Language Modeling | Code Available | 3 | 5 |
| MedSegDiff: Medical Image Segmentation with Diffusion Probabilistic Model | Nov 1, 2022 | Anomaly Detection, Brain Tumor Segmentation | Code Available | 3 | 5 |
| HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis | Nov 21, 2023 | Speech Synthesis, Super-Resolution | Code Available | 3 | 5 |
| SceneCraft: Layout-Guided 3D Scene Generation | Oct 11, 2024 | 3D Generation, Image Generation | Code Available | 3 | 5 |
| Flash-VStream: Efficient Real-Time Understanding for Long Video Streams | Jun 30, 2025 | cross-modal alignment, EgoSchema | Code Available | 3 | 5 |
| Local All-Pair Correspondence for Point Tracking | Jul 22, 2024 | All, Point Tracking | Code Available | 3 | 5 |
| The Hidden Dimensions of LLM Alignment: A Multi-Dimensional Safety Analysis | Feb 13, 2025 | Safety Alignment | Code Available | 3 | 5 |
| Unfolding the Headline: Iterative Self-Questioning for News Retrieval and Timeline Summarization | Jan 1, 2025 | News Retrieval, Retrieval | Code Available | 3 | 5 |
| Deformable 3D Gaussians for High-Fidelity Monocular Dynamic Scene Reconstruction | Sep 22, 2023 | Dynamic Reconstruction, Neural Rendering | Code Available | 3 | 5 |
| MARLlib: A Scalable and Efficient Multi-agent Reinforcement Learning Library | Oct 11, 2022 | Multi-agent Reinforcement Learning, reinforcement-learning | Code Available | 3 | 5 |
| UNetFormer: A Unified Vision Transformer Model and Pre-Training Framework for 3D Medical Image Segmentation | Apr 1, 2022 | Brain Tumor Segmentation, Image Segmentation | Code Available | 3 | 5 |
| GraphNeuralNetworks.jl: Deep Learning on Graphs with Julia | Dec 9, 2024 | Deep Learning, GPU | Code Available | 3 | 5 |
| ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL | Feb 29, 2024 | Language Modeling, Language Modelling | Code Available | 3 | 5 |
| A Simple Framework for Open-Vocabulary Segmentation and Detection | Mar 14, 2023 | Instance Segmentation, Panoptic Segmentation | Code Available | 3 | 5 |
| LinFusion: 1 GPU, 1 Minute, 16K Image | Sep 3, 2024 | 16k, Causal Inference | Code Available | 3 | 5 |
| CHESS: Contextual Harnessing for Efficient SQL Synthesis | May 27, 2024 | Large Language Model, Privacy Preserving | Code Available | 3 | 5 |
| Flexible and Scalable Deep Learning with MMLSpark | Apr 11, 2018 | Deep Learning, Distributed Computing | Code Available | 3 | 5 |
| A Comprehensive Survey of Small Language Models in the Era of Large Language Models: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness | Nov 4, 2024 | Question Answering, Text Generation | Code Available | 3 | 5 |
| Why Transformers Need Adam: A Hessian Perspective | Feb 26, 2024 | | Code Available | 3 | 5 |
| LiftFeat: 3D Geometry-Aware Local Feature Matching | May 6, 2025 | 3D geometry, Depth Estimation | Code Available | 3 | 5 |
| An Empirical Study on Prompt Compression for Large Language Models | Apr 24, 2025 | Articles, Math | Code Available | 3 | 5 |
| This Time is Different: An Observability Perspective on Time Series Foundation Models | May 20, 2025 | Decoder, Multivariate Time Series Forecasting | Code Available | 3 | 5 |
| Image and Video Tokenization with Binary Spherical Quantization | Jun 11, 2024 | Decoder, Image Generation | Code Available | 3 | 5 |
| VoiceStar: Robust Zero-Shot Autoregressive TTS with Duration Control and Extrapolation | May 26, 2025 | Decoder, Language Modeling | Code Available | 3 | 5 |
| Distilling LLM Agent into Small Models with Retrieval and Code Tools | May 23, 2025 | Action Generation, Domain Generalization | Code Available | 3 | 5 |
| Highly Compressed Tokenizer Can Generate Without Training | Jun 9, 2025 | Image Generation, Quantization | Code Available | 3 | 5 |
| When to use Graphs in RAG: A Comprehensive Analysis for Graph Retrieval-Augmented Generation | Jun 6, 2025 | RAG, Retrieval | Code Available | 3 | 5 |
| Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens | Jun 20, 2025 | Image Generation, Multimodal Reasoning | Code Available | 3 | 5 |
| Discrete Diffusion in Large Language and Multimodal Models: A Survey | Jun 16, 2025 | Denoising | Code Available | 3 | 5 |
| Efficient and Generalizable Speaker Diarization via Structured Pruning of Self-Supervised Models | Jun 23, 2025 | Domain Adaptation, GPU | Code Available | 3 | 5 |
| FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language | Jun 26, 2025 | All | Code Available | 3 | 5 |
| No time to train! Training-Free Reference-Based Instance Segmentation | Jul 3, 2025 | Cross-Domain Few-Shot Object Detection, Few-Shot Object Detection | Code Available | 3 | 5 |
| BRIGHT: A globally distributed multimodal building damage assessment dataset with very-high-resolution for all-weather disaster response | Jan 10, 2025 | All, Building change detection for remote sensing images | Code Available | 3 | 5 |