This guide explains what server load really is, how to watch it and keep it under control, and how to spot the signs of server trouble.
Server Load Explanation
The load average tries to measure the number of active processes at any time. As a measure of CPU utilization, the load average is simplistic, poorly defined, but far from useless. High load averages usually mean that the system is being used heavily and the response time is correspondingly slow. What’s high? … Ideally, you’d like a load average under, say, 3, … Ultimately, ‘high’ means high enough so that you don’t need uptime to tell you that the system is overloaded.
Load averages are reported as three numbers: the averages over the past 1, 5, and 15 minutes.
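If you would rather read the three numbers from a script than from a shell command, here is a minimal sketch in Python. It assumes a Unix-like server, where the standard library call os.getloadavg() is available. The helper names trend and describe_load are made up for this example, and comparing the 1-minute figure with the 15-minute figure is only a rough way to tell whether the load is climbing or settling.

```python
import os


def trend(one_min, fifteen_min):
    """Compare the newest and oldest averages to guess the direction of the load."""
    if one_min > fifteen_min:
        return "rising"
    if one_min < fifteen_min:
        return "falling"
    return "steady"


def describe_load():
    """Return the 1-, 5-, and 15-minute load averages plus a rough trend."""
    one, five, fifteen = os.getloadavg()  # Unix-only; raises OSError if unavailable
    return {"1min": one, "5min": five, "15min": fifteen, "trend": trend(one, fifteen)}


if __name__ == "__main__":
    print(describe_load())
```

A 1-minute average well above the 15-minute average suggests the load has only just started to climb, while the reverse suggests a busy period is winding down.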
How to check the server's load?
There are a few different ways to keep an eye on your server's load. The first thing you need to do is log in to your server over SSH.
Method 1 – using the uptime command:
The uptime shell command produces the following output:
9:40am up 9 days, 10:36, 4 users, load average: 0.02, 0.01, 0.00
It shows the current time, how long the system has been running since it was last booted, the number of users logged in, and something called the load average.
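If you want to pull the load average out of an uptime line in a script, something like the following should work. It is only a sketch: the function name parse_uptime_load is made up for this example, and the pattern assumes the output looks like the sample above, with a period as the decimal separator.

```python
import re

# Matches "load average: 0.02, 0.01, 0.00"; also tolerates "load averages:" and
# space-separated values, which some systems print.
LOAD_RE = re.compile(r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)")


def parse_uptime_load(line):
    """Return the (1-minute, 5-minute, 15-minute) load averages from an uptime line."""
    match = LOAD_RE.search(line)
    if match is None:
        raise ValueError("no load average found in: %r" % line)
    return tuple(float(value) for value in match.groups())


if __name__ == "__main__":
    sample = "9:40am up 9 days, 10:36, 4 users, load average: 0.02, 0.01, 0.00"
    print(parse_uptime_load(sample))
```

The same function works on the first line of the w and top output, since they print the load average in the same way.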
Method 2 – using the procinfo command:
On Linux systems, the procinfo command produces the following output:
Linux 2.0.36 ([email protected]) (gcc 188.8.131.52) #1 Wed Jul 25 21:40:16 EST 2001 [pax]
Memory: Total Used Free Shared Buffers Cached
Mem: 95564 90252 5312 31412 33104 26412
Swap: 68508 0 68508
Bootup: Sun Jul 21 15:21:15 2002 Load average: 0.15 0.03 0.01 2/58 8557
The load average appears on the last line of this output, next to the bootup time.
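The five values after "Load average:" follow the same layout as the /proc/loadavg file on Linux: the 1-, 5-, and 15-minute averages, then the number of running processes over the total number of processes, then the most recently assigned process ID. Here is a small sketch that splits such a line into named fields. The names LoadInfo and parse_loadavg are made up for this example.

```python
from typing import NamedTuple


class LoadInfo(NamedTuple):
    one: float        # 1-minute load average
    five: float       # 5-minute load average
    fifteen: float    # 15-minute load average
    running: int      # processes currently runnable
    total: int        # processes that exist on the system
    last_pid: int     # most recently assigned process ID


def parse_loadavg(text):
    """Parse a string laid out like /proc/loadavg, e.g. '0.15 0.03 0.01 2/58 8557'."""
    fields = text.split()
    if len(fields) != 5:
        raise ValueError("expected 5 fields, got %d" % len(fields))
    running, total = fields[3].split("/")
    return LoadInfo(
        float(fields[0]), float(fields[1]), float(fields[2]),
        int(running), int(total), int(fields[4]),
    )


if __name__ == "__main__":
    print(parse_loadavg("0.15 0.03 0.01 2/58 8557"))
```

On a Linux server you could pass it the contents of /proc/loadavg directly, for example parse_loadavg(open("/proc/loadavg").read()).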
Method 3 – using the w command:
The w command produces the following output:
9:40am up 9 days, 10:35, 4 users, load average: 0.02, 0.01, 0.00
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
mir ttyp0 :0.0 Fri10pm 3days 0.09s 0.09s bash
neil ttyp2 12-35-86-1.ea.co 9:40am 0.00s 0.29s 0.15s w
Notice that the first line of the output has the same format as the output of the uptime command.
Method 4 – using the top command – preferred:
The top command is a more recent addition to the UNIX command set that ranks processes according to the amount of CPU time they consume. It produces the following output:
4:09am up 12:48, 1 user, load average: 0.02, 0.27, 0.17
58 processes: 57 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: 0.5% user, 0.9% system, 0.0% nice, 98.5% idle
Mem: 95564K av, 78704K used, 16860K free, 32836K shrd, 40132K buff
Swap: 68508K av, 0K used, 68508K free 14508K cached
PID USER PRI NI SIZE RSS SHARE STAT LIB %CPU %MEM TIME COMMAND
5909 neil 13 0 720 720 552 R 0 1.5 0.7 0:01 top
1 root 0 0 396 396 328 S 0 0.0 0.4 0:02 init
2 root 0 0 0 0 0 SW 0 0.0 0.0 0:00 kflushd
3 root -12 -12 0 0 0 SW< 0 0.0 0.0 0:00 kswapd
We like to use the top command because it also shows server uptime, memory information and the list of processes that you can sort by CPU usage, etc.
Other system monitoring tools – SIM (System Integrity Monitor)
The folks at R-fx Networks have developed this utility, which has a variety of features such as:
– Ability to auto restart system with definable critical load level
– System load monitor with customizable warnings & actions
– Priority change configurable for services, at warning or critical load level
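To give a feel for what a load monitor with warning and critical levels does, here is a rough sketch. This is not how SIM itself is written. The function names, the threshold values, and the actions are all made up for illustration, and a real tool would do a lot more (logging, restarting services, and so on).

```python
def load_level(load, warning, critical):
    """Classify a load average against two thresholds chosen by the administrator."""
    if warning > critical:
        raise ValueError("warning threshold must not exceed critical threshold")
    if load >= critical:
        return "critical"
    if load >= warning:
        return "warning"
    return "ok"


def run_actions(load, warning, critical, actions):
    """Look up the action registered for the current level and call it, if any."""
    level = load_level(load, warning, critical)
    action = actions.get(level)
    if action is not None:
        action(load)
    return level


if __name__ == "__main__":
    # Hypothetical thresholds and actions; pick values that suit your own server.
    actions = {
        "warning": lambda load: print("warning: load is %.2f" % load),
        "critical": lambda load: print("critical: load is %.2f" % load),
    }
    for sample in (0.25, 3.0, 37.0):
        run_actions(sample, warning=2.0, critical=5.0, actions=actions)
```

Keeping the thresholds and actions as parameters is what makes the warnings customizable: the same check can send an email on one server and lower a service's priority on another.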
What is a good load, a bad load, and in between?
I know you’re asking, “So what is a good system load, and what is a bad one?” Anything around 1.0 or below is fine; try to keep your regular load average under 1.0. If you notice your server slowing down, check the load first. We recently hosted a site that was mentioned in the media (TV, news, radio), and the server's load skyrocketed because of the huge wave of traffic. The load went from 0.25 to 37.00 just because the server was getting hammered.
When your regular average starts to creep up to around 2.0, your server is very busy and you should consider getting another machine or upgrading your hardware. By regular average, I mean the load during the day when the system is otherwise idle and isn’t processing all your logs or backing up data.
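Here is one way you might work out a regular average from load readings you have collected yourself. It is a sketch under a few assumptions: the readings are (hour, load) pairs, you know which hours your log processing and backups run, and the function names regular_average and needs_upgrade are made up for this example. The 2.0 cutoff is the rule of thumb from this guide.

```python
def regular_average(samples, busy_hours):
    """Average the load readings, skipping hours reserved for logs and backups.

    samples is a list of (hour, load) pairs; busy_hours is a set of hours (0-23).
    """
    loads = [load for hour, load in samples if hour not in busy_hours]
    if not loads:
        raise ValueError("no readings left outside the busy hours")
    return sum(loads) / len(loads)


def needs_upgrade(samples, busy_hours, threshold=2.0):
    """Return True when the regular average has reached the threshold."""
    return regular_average(samples, busy_hours) >= threshold


if __name__ == "__main__":
    # Made-up readings: the 2am spike comes from a nightly backup and is ignored.
    readings = [(2, 6.0), (10, 0.5), (14, 1.5)]
    print(regular_average(readings, busy_hours={2}))
    print(needs_upgrade(readings, busy_hours={2}))
```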
An overloaded server can lead to many problems and should always be avoided. I hope this guide has given you some more insight into server loads, what to use to monitor them, and what counts as a good or bad load average.