Characterizing Software Aging in GPU-Based LLM Serving Systems

Domenico Cotroneo, Bojan Cukic

Thursday at 04:00

arXiv:2606.11916v1 Announce Type: cross

Abstract: This paper proposes an empirical methodology to study software aging in GPU-based LLM serving systems. Traditional aging studies focus on CPU-centric software with relatively regular workloads; LLM serving is different, spanning a Python host and a CUDA device, handling requests whose cost varies...
