Chapter Thirteen

Performance and Troubleshooting

A method for finding the bottleneck instead of guessing at it, and the tools that measure CPU, memory, disk, and network so the slow part names itself.

5 topics

Most production slowdowns are diagnosed by intuition, and intuition is wrong often enough to waste hours restarting the wrong service. The fix is a method: measure the load, find the saturated resource, then drill into the one subsystem that is actually constrained instead of tuning four that are not.

This chapter builds that method, then arms it. It starts with the USE-style approach (check utilization, saturation, and errors for each resource) and the baseline tools every Debian and Ubuntu box already ships with. It then takes each resource in turn (CPU and memory, disk and I/O, the network) and ends with the tracing tools that show you exactly which syscall a stuck process is blocked on. By the end you can walk up to a misbehaving server and identify the bottleneck in minutes, by method rather than by luck.

Troubleshoot by method, not by guessing
Hypothesis: what + where
Measure: against a baseline
Localize: resource, then process
Confirm: fix, then re-measure
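
The "measure against a baseline, then localize" steps can be sketched in a few lines of Python. This is a minimal illustration rather than a tool from this chapter: the function name `localize`, the metric names, the sample numbers, and the 1.5x threshold are all assumptions chosen for the example.

```python
def localize(current, baseline, threshold=1.5):
    """Rank resources by how far they have drifted above their baseline.

    current and baseline map a metric name to a number where higher is worse.
    Metrics with no baseline, or a baseline of zero, are skipped because a
    ratio cannot be computed for them. The 1.5x threshold is an arbitrary
    illustrative default.
    """
    suspects = []
    for name, value in current.items():
        base = baseline.get(name)
        if not base:
            continue  # no usable baseline for this metric
        ratio = value / base
        if ratio >= threshold:
            suspects.append((name, ratio))
    # Worst offender first: that is the resource to drill into.
    return sorted(suspects, key=lambda item: item[1], reverse=True)


# Made-up sample numbers: a quiet-day baseline and a reading taken during a slowdown.
baseline = {"cpu_util": 0.30, "mem_used": 0.50, "disk_await_ms": 4.0}
current = {"cpu_util": 0.35, "mem_used": 0.55, "disk_await_ms": 40.0}

print(localize(current, baseline))
```

With these sample numbers only the disk latency metric crosses the threshold, so the disk is the one subsystem worth drilling into. The comparison only works if the baseline was recorded while the server was healthy.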

Topics in This Chapter