| Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT | Feb 12, 2024 | Benchmarking, Chunking | Unverified | 0 |
| Context-aware Multi-Model Object Detection for Diversely Heterogeneous Compute Systems | Feb 12, 2024 | GPU, object-detection | Code Available | 0 |
| Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models | Feb 10, 2024 | CPU, GPU | Code Available | 3 |
| Cardiac ultrasound simulation for autonomous ultrasound navigation | Feb 9, 2024 | Diagnostic, GPU | Unverified | 0 |
| On the Efficacy of Eviction Policy for Key-Value Constrained Generative Language Model Inference | Feb 9, 2024 | GPU, Language Modeling | Code Available | 2 |
| TASER: Temporal Adaptive Sampling for Fast and Accurate Dynamic Graph Representation Learning | Feb 8, 2024 | Denoising, Fraud Detection | Code Available | 1 |
| On the Convergence of Zeroth-Order Federated Tuning for Large Language Models | Feb 8, 2024 | Federated Learning, GPU | Unverified | 0 |
| Anatomizing Deep Learning Inference in Web Browsers | Feb 8, 2024 | CPU, Deep Learning | Unverified | 0 |
| Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes | Feb 8, 2024 | GPU | Code Available | 1 |
| Improving Token-Based World Models with Parallel Observation Prediction | Feb 8, 2024 | GPU, Prediction | Code Available | 1 |