| Fine-Tuning and Deploying Large Language Models Over Edges: Issues and Approaches | Aug 20, 2024 | GPU, Model Compression | Unverified | 0 |
| Near, far: Patch-ordering enhances vision foundation models' scene understanding | Aug 20, 2024 | GPU, Scene Understanding | Unverified | 0 |
| LLM-Barber: Block-Aware Rebuilder for Sparsity Mask in One-Shot for Large Language Models | Aug 20, 2024 | GPU | Code Available | 0 |
| Accelerating Goal-Conditioned RL Algorithms and Research | Aug 20, 2024 | GPU, reinforcement-learning | Code Available | 3 |
| Stream-Based Ground Segmentation for Real-Time LiDAR Point Cloud Processing on FPGA | Aug 19, 2024 | CPU, GPU | Unverified | 0 |
| Characteristic Performance Study on Solving Oscillator ODEs via Soft-constrained Physics-informed Neural Network with Small Data | Aug 19, 2024 | CPU, GPU | Code Available | 0 |
| MoDeGPT: Modular Decomposition for Large Language Model Compression | Aug 19, 2024 | GPU, Language Modeling | Unverified | 0 |
| Liquid Fourier Latent Dynamics Networks for fast GPU-based numerical simulations in computational cardiology | Aug 19, 2024 | GPU | Code Available | 0 |
| SSDTrain: An Activation Offloading Framework to SSDs for Faster Large Language Model Training | Aug 19, 2024 | GPU, Language Modeling | Unverified | 0 |
| TeamLoRA: Boosting Low-Rank Adaptation with Expert Collaboration and Competition | Aug 19, 2024 | GPU, Multi-Task Learning | Code Available | 0 |
| Demystifying the Communication Characteristics for Distributed Transformer Models | Aug 19, 2024 | Audio Generation, GPU | Unverified | 0 |
| Threshold Filtering Packing for Supervised Fine-Tuning: Training Related Samples within Packs | Aug 18, 2024 | Diversity, GPU | Unverified | 0 |
| ELASTIC: Efficient Linear Attention for Sequential Interest Compression | Aug 18, 2024 | Computational Efficiency, GPU | Unverified | 0 |
| ABQ-LLM: Arbitrary-Bit Quantized Inference Acceleration for Large Language Models | Aug 16, 2024 | GPU, Model Compression | Code Available | 3 |
| Training Overhead Ratio: A Practical Reliability Metric for Large Language Model Training Systems | Aug 14, 2024 | GPU, Language Modeling | Unverified | 0 |
| Kraken: Inherently Parallel Transformers For Efficient Multi-Device Inference | Aug 14, 2024 | GPU, Language Modeling | Unverified | 0 |
| Review Learning: Advancing All-in-One Ultra-High-Definition Image Restoration Training Method | Aug 13, 2024 | 4k, All | Unverified | 0 |
| Bridging LLMs and KGs without Fine-Tuning: Intermediate Probing Meets Subgraph-Aware Entity Descriptions | Aug 13, 2024 | GPU, Knowledge Graph Completion | Unverified | 0 |
| Breast-NET: a lightweight DCNN model for breast cancer detection and grading using histological samples | Aug 10, 2024 | Breast Cancer Detection, Breast Cancer Histology Image Classification | Code Available | 0 |
| LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale | Aug 10, 2024 | GPU, Language Modelling | Code Available | 3 |
| A Versatile Framework for Attributed Network Clustering via K-Nearest Neighbor Augmentation | Aug 10, 2024 | Attribute, Clustering | Code Available | 0 |
| UniBench: Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling | Aug 9, 2024 | GPU, Language Modeling | Code Available | 3 |
| reCSE: Portable Reshaping Features for Sentence Embedding in Self-supervised Contrastive Learning | Aug 9, 2024 | Contrastive Learning, Data Augmentation | Code Available | 0 |
| Impacts of floating-point non-associativity on reproducibility for HPC and deep learning applications | Aug 9, 2024 | Deep Learning, GPU | Code Available | 0 |
| An Edge AI System Based on FPGA Platform for Railway Fault Detection | Aug 8, 2024 | CPU, Fault Detection | Unverified | 0 |