GPU

Papers


Showing 1001–1050 of 5629 papers

Title	Date	Tasks	Status	Hype
PostEdit: Posterior Sampling for Efficient Zero-Shot Image Editing	Oct 7, 2024	GPU	Code Available	1
CTC-GMM: CTC guided modality matching for fast and accurate streaming speech translation	Oct 7, 2024	GPU, Machine Translation	Unverified	0
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective	Oct 6, 2024	CPU, GPU	Code Available	1
PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms	Oct 5, 2024	Benchmarking, GPU	Unverified	0
High-Speed Stereo Visual SLAM for Low-Powered Computing Devices	Oct 5, 2024	GPU	Code Available	3
Fast Object Detection with a Machine Learning Edge Device	Oct 5, 2024	Autonomous Navigation, CPU	Unverified	0
SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation	Oct 4, 2024	16k, Code Generation	Code Available	3
Dolphin: A Programmable Framework for Scalable Neurosymbolic Learning	Oct 4, 2024	CPU, Deep Learning	Unverified	0
Compute Or Load KV Cache? Why Not Both?	Oct 4, 2024	GPU	Unverified	0
LoRC: Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy	Oct 4, 2024	GPU, Low-rank compression	Unverified	0
Online Energy Optimization in GPUs: A Multi-Armed Bandit Approach	Oct 3, 2024	energy management, GPU	Code Available	0
LLM-Pilot: Characterize and Optimize Performance of your LLM Inference Services	Oct 3, 2024	Benchmarking, GPU	Code Available	1
Efficient Semantic Segmentation via Lightweight Multiple-Information Interaction Network	Oct 3, 2024	GPU, Real-Time Semantic Segmentation	Unverified	0
Learning from Offline Foundation Features with Tensor Augmentations	Oct 3, 2024	GPU	Unverified	0
Efficient Residual Learning with Mixture-of-Experts for Universal Dexterous Grasping	Oct 3, 2024	GPU, Mixture-of-Experts	Unverified	0
LLMCO2: Advancing Accurate Carbon Footprint Prediction for LLM Inferences	Oct 3, 2024	GPU, Graph Neural Network	Unverified	0
An Efficient Inference Frame for SMLM (Single-Molecule Localization Microscopy)	Oct 3, 2024	Deep Learning, GPU	Code Available	0
Contextual Document Embeddings	Oct 3, 2024	Contrastive Learning, Document Embedding	Unverified	0
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second	Oct 2, 2024	Depth Estimation, GPU	Code Available	9
Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads on Consumer-Grade Devices	Oct 2, 2024	GPU, Language Modeling	Code Available	1
A Little Goes a Long Way: Efficient Long Context Training and Inference with Partial Contexts	Oct 2, 2024	4k, GPU	Unverified	0
FlashMask: Efficient and Rich Mask Extension of FlashAttention	Oct 2, 2024	Computational Efficiency, GPU	Unverified	0
Scalable and Consistent Graph Neural Networks for Distributed Mesh-based Data-driven Modeling	Oct 2, 2024	GPU, Graph Neural Network	Unverified	0
Replacement Learning: Training Vision Tasks with Fewer Learnable Parameters	Oct 2, 2024	GPU	Unverified	0
ConServe: Harvesting GPUs for Low-Latency and High-Throughput Large Language Model Serving	Oct 2, 2024	Benchmarking, Document Summarization	Unverified	0
VectorGraphNET: Graph Attention Networks for Accurate Segmentation of Complex Technical Drawings	Oct 2, 2024	GPU, Graph Attention	Unverified	0
TorchSISSO: A PyTorch-Based Implementation of the Sure Independence Screening and Sparsifying Operator for Efficient and Interpretable Model Discovery	Oct 2, 2024	GPU, Model Discovery	Code Available	1
Lotus: learning-based online thermal and latency variation management for two-stage detectors on edge devices	Oct 1, 2024	CPU, Deep Reinforcement Learning	Code Available	0
ROK Defense M&S in the Age of Hyperscale AI: Concepts, Challenges, and Future Directions	Oct 1, 2024	Decision Making, GPU	Unverified	0
MoS: Unleashing Parameter Efficiency of Low-Rank Adaptation with Mixture of Shards	Oct 1, 2024	GPU, Mixture-of-Experts	Unverified	0
STGformer: Efficient Spatiotemporal Graph Transformer for Traffic Forecasting	Oct 1, 2024	GPU	Code Available	1
LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management	Oct 1, 2024	GPU, Language Modeling	Code Available	3
ManiSkill3: GPU Parallelized Robotics Simulation and Rendering for Generalizable Embodied AI	Oct 1, 2024	GPU, Imitation Learning	Code Available	7
Characterizing and Efficiently Accelerating Multimodal Generation Model Inference	Sep 30, 2024	GPU, multimodal generation	Unverified	0
HEADS-UP: Head-Mounted Egocentric Dataset for Trajectory Prediction in Blind Assistance Systems	Sep 30, 2024	GPU, Prediction	Unverified	0
Simple and Fast Distillation of Diffusion Models	Sep 29, 2024	GPU, Image Generation	Code Available	3
Simulation-based inference with the Python Package sbijax	Sep 28, 2024	Bayesian Inference, CPU	Unverified	0
Analog In-Memory Computing Attention Mechanism for Fast and Energy-Efficient Large Language Models	Sep 28, 2024	GPU	Code Available	1
Gradient-free Decoder Inversion in Latent Diffusion Models	Sep 27, 2024	Decoder, Denoising	Unverified	0
TensorSocket: Shared Data Loading for Deep Learning Training	Sep 27, 2024	Computational Efficiency, CPU	Unverified	0
Scalable Cross-Entropy Loss for Sequential Recommendations with Large Item Catalogs	Sep 27, 2024	GPU, Recommendation Systems	Code Available	1
DRL-STNet: Unsupervised Domain Adaptation for Cross-modality Medical Image Segmentation via Disentangled Representation Learning	Sep 26, 2024	Domain Adaptation, GPU	Unverified	0
Input-Dependent Power Usage in GPUs	Sep 26, 2024	GPU	Code Available	0
Efficient Arbitrary Precision Acceleration for Large Language Models on GPU Tensor Cores	Sep 26, 2024	GPU, Management	Unverified	0
Behaviour4All: in-the-wild Facial Behaviour Analysis Toolkit	Sep 26, 2024	Action Unit Detection, Arousal Estimation	Unverified	0
LightAvatar: Efficient Head Avatar as Dynamic Neural Light Field	Sep 26, 2024	GPU, NeRF	Code Available	1
MALPOLON: A Framework for Deep Species Distribution Modeling	Sep 26, 2024	Benchmarking, GPU	Code Available	1
Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction	Sep 25, 2024	GPU, Token Reduction	Code Available	2
Search for Efficient Large Language Models	Sep 25, 2024	GPU, Model Compression	Code Available	1
Efficient and generalizable nested Fourier-DeepONet for three-dimensional geological carbon sequestration	Sep 25, 2024	GPU	Code Available	0

