| Towards a Comprehensive, Efficient and Promptable Anatomic Structure Segmentation Model using 3D Whole-body CT Scans | Mar 22, 2024 | GPU, Image Segmentation | Code Available | 1 |
| DaCapo: Accelerating Continuous Learning in Autonomous Systems for Video Analytics | Mar 21, 2024 | GPU | Code Available | 1 |
| On Pretraining Data Diversity for Self-Supervised Learning | Mar 20, 2024 | Diversity, GPU | Code Available | 1 |
| Binary Noise for Binary Tasks: Masked Bernoulli Diffusion for Unsupervised Anomaly Detection | Mar 18, 2024 | Anomaly Detection, Denoising | Code Available | 1 |
| JORA: JAX Tensor-Parallel LoRA Library for Retrieval Augmented Fine-Tuning | Mar 17, 2024 | GPU, Management | Code Available | 1 |
| Optimistic Verifiable Training by Controlling Hardware Nondeterminism | Mar 14, 2024 | Data Poisoning, GPU | Code Available | 1 |
| FastSAM3D: An Efficient Segment Anything Model for 3D Volumetric Medical Images | Mar 14, 2024 | 3D Medical Imaging Segmentation, GPU | Code Available | 1 |
| SM4Depth: Seamless Monocular Metric Depth Estimation across Multiple Cameras and Scenes by One Model | Mar 13, 2024 | Depth Estimation, GPU | Code Available | 1 |
| Augmenting Efficient Real-time Surgical Instrument Segmentation in Video with Point Tracking and Segment Anything | Mar 12, 2024 | GPU, Point Tracking | Code Available | 1 |
| SSM Meets Video Diffusion Models: Efficient Long-Term Video Generation with Structured State Spaces | Mar 12, 2024 | GPU, Image Generation | Code Available | 1 |
| LookupFFN: Making Transformers Compute-lite for CPU inference | Mar 12, 2024 | CPU, GPU | Code Available | 1 |
| UniSparse: An Intermediate Language for General Sparse Format Customization | Mar 9, 2024 | Attribute, Code Generation | Code Available | 1 |
| LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization | Mar 2, 2024 | GPU, Quantization | Code Available | 1 |
| Efficient Lifelong Model Evaluation in an Era of Rapid Progress | Feb 29, 2024 | Benchmarking, GPU | Code Available | 1 |
| DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation | Feb 27, 2024 | GPU, parameter-efficient fine-tuning | Code Available | 1 |
| Multimodal Learned Sparse Retrieval with Probabilistic Expansion Control | Feb 27, 2024 | GPU, Image Retrieval | Code Available | 1 |
| PyGim: An Efficient Graph Neural Network Library for Real Processing-In-Memory Architectures | Feb 26, 2024 | CPU, GPU | Code Available | 1 |
| Mechanistic Neural Networks for Scientific Machine Learning | Feb 20, 2024 | Equation Discovery, GPU | Code Available | 1 |
| BESA: Pruning Large Language Models with Blockwise Parameter-Efficient Sparsity Allocation | Feb 18, 2024 | GPU, Question Answering | Code Available | 1 |
| Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment | Feb 15, 2024 | GPU, Reinforcement Learning (RL) | Code Available | 1 |
| Anchor-based Large Language Models | Feb 12, 2024 | Computational Efficiency, Decoder | Code Available | 1 |
| TASER: Temporal Adaptive Sampling for Fast and Accurate Dynamic Graph Representation Learning | Feb 8, 2024 | Denoising, Fraud Detection | Code Available | 1 |
| Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes | Feb 8, 2024 | GPU | Code Available | 1 |
| Improving Token-Based World Models with Parallel Observation Prediction | Feb 8, 2024 | GPU, Prediction | Code Available | 1 |
| ApiQ: Finetuning of 2-Bit Quantized Large Language Model | Feb 7, 2024 | GPU, Language Modeling | Code Available | 1 |
| A Lightweight Inception Boosted U-Net Neural Network for Routability Prediction | Feb 7, 2024 | Avg, CPU | Code Available | 1 |
| Pruner: A Speculative Exploration Mechanism to Accelerate Tensor Program Tuning | Feb 4, 2024 | GPU, Transfer Learning | Code Available | 1 |
| Structure-Aware E(3)-Invariant Molecular Conformer Aggregation Networks | Feb 3, 2024 | GPU, Molecular Property Prediction | Code Available | 1 |
| InferCept: Efficient Intercept Support for Augmented Large Language Model Inference | Feb 2, 2024 | GPU, Language Modeling | Code Available | 1 |
| HiFT: A Hierarchical Full Parameter Fine-Tuning Strategy | Jan 26, 2024 | GPU, parameter-efficient fine-tuning | Code Available | 1 |
| InverseMatrixVT3D: An Efficient Projection Matrix-Based Approach for 3D Occupancy Prediction | Jan 23, 2024 | 3D Semantic Occupancy Prediction, Autonomous Driving | Code Available | 1 |
| immrax: A Parallelizable and Differentiable Toolbox for Interval Analysis and Mixed Monotone Reachability in JAX | Jan 21, 2024 | Computational Efficiency, GPU | Code Available | 1 |
| Dynamic DNNs and Runtime Management for Efficient Inference on Mobile/Embedded Devices | Jan 17, 2024 | Dynamic neural networks, GPU | Code Available | 1 |
| Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference | Jan 16, 2024 | GPU, Mixture-of-Experts | Code Available | 1 |
| CAVIAR: Co-simulation of 6G Communications, 3D Scenarios and AI for Digital Twins | Jan 6, 2024 | Autonomous Vehicles, Benchmarking | Code Available | 1 |
| TinyPredNet: A Lightweight Framework for Satellite Image Sequence Prediction | Jan 1, 2024 | Decoder, GPU | Code Available | 1 |
| Resource-Efficient Transformer Pruning for Finetuning of Large Models | Jan 1, 2024 | GPU, Natural Language Understanding | Code Available | 1 |
| City-on-Web: Real-time Neural Rendering of Large-scale Scenes on the Web | Dec 27, 2023 | 3D Scene Reconstruction, GPU | Code Available | 1 |
| ZO-AdaMU Optimizer: Adapting Perturbation by the Momentum and Uncertainty in Zeroth-order Optimization | Dec 23, 2023 | GPU | Code Available | 1 |
| Regulating Intermediate 3D Features for Vision-Centric Autonomous Driving | Dec 19, 2023 | Autonomous Driving, GPU | Code Available | 1 |
| Enhancing predictive capabilities in fusion burning plasmas through surrogate-based optimization in core transport solvers | Dec 19, 2023 | GPU, Prediction | Code Available | 1 |
| Lightning-Fast Image Inversion and Editing for Text-to-Image Diffusion Models | Dec 19, 2023 | GPU | Code Available | 1 |
| Opara: Exploiting Operator Parallelism for Expediting DNN Inference on GPUs | Dec 16, 2023 | GPU, Scheduling | Code Available | 1 |
| Binary Code Summarization: Benchmarking ChatGPT/GPT-4 and Other Large Language Models | Dec 15, 2023 | Benchmarking, Code Summarization | Code Available | 1 |
| Data-Efficient Multimodal Fusion on a Single GPU | Dec 15, 2023 | GPU, Image Retrieval | Code Available | 1 |
| MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training | Dec 14, 2023 | GPU | Code Available | 1 |
| Memory-Efficient Reversible Spiking Neural Networks | Dec 13, 2023 | GPU | Code Available | 1 |
| EZ-CLIP: Efficient Zeroshot Video Action Recognition | Dec 13, 2023 | Action Recognition, GPU | Code Available | 1 |
| DTL: Disentangled Transfer Learning for Visual Recognition | Dec 13, 2023 | GPU, Transfer Learning | Code Available | 1 |
| Modality Plug-and-Play: Elastic Modality Adaptation in Multimodal LLMs for Embodied AI | Dec 13, 2023 | Diversity, GPU | Code Available | 1 |