Resource Monitoring

To keep servers and workstations responsive and to prevent them from being overloaded, it is also worth monitoring system usage with several additional measures. Nagios offers several plugins that monitor resource usage and report when the limits set for these checks are exceeded.

System Load

The first thing that should always be monitored is the system load. This value reflects the number of processes and the amount of CPU capacity that they are utilizing. This means that if one process is using about 50% of the CPU capacity, the value will be around 0.5, and if four processes try to utilize the maximum CPU capacity, the value will be around 4.0. The system load is reported as three values: the average load over the last minute, the last 5 minutes, and the last 15 minutes. The syntax of the command is as follows:

check_load [-r] -w wload1,wload5,wload15 -c cload1,cload5,cload15

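Values for the -w and -c options should each be given as three numbers separated by commas, one for each of the 1-minute, 5-minute, and 15-minute averages. If any of the load averages exceeds its warning limit, a warning status is returned; if any exceeds its critical limit, a critical status is returned. The optional -r flag divides the load averages by the number of CPUs, when that number can be determined. Here is a sample command definition that takes the warning and critical load limits as arguments: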

  define command
  {
    command_name  check_load
    command_line  $USER1$/check_load -w $ARG1$ -c $ARG2$
  }
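
As a rough illustration of how this command might be used, the following service definition passes both sets of limits to check_load. It is only a sketch: the host name localhost, the generic-service template, and the threshold values are assumptions chosen for the example and should be adapted to the machine being monitored.

```cfg
# Hypothetical service using the check_load command defined above.
define service {
    use                  generic-service   ; assumed template supplying intervals and contacts
    host_name            localhost         ; hypothetical host name
    service_description  System Load
    ; $ARG1$ = warning limits, $ARG2$ = critical limits (1, 5, and 15 minute averages)
    check_command        check_load!5.0,4.0,3.0!10.0,8.0,6.0
}
```

Arguments are separated by exclamation marks, so the commas inside each set of limits are passed through to the plugin unchanged. With these values, a 1-minute average above 5.0 would produce a warning, and one above 10.0 a critical status.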

Checking Processes

Nagios also offers a way to monitor the total number of processes. Nagios can be configured to monitor all processes, only running ones, those consuming CPU, those consuming memory, or a combination of these criteria. The syntax and options are as follows:

check_procs -w <range> -c <range> [-m metric] [-s state]
            [-p ppid] [-u user] [-r rss] [-z vsz] [-P %cpu]
            [-a argument-array] [-C command] [-t timeout] [-v]

Values for the -w and -c options can be either a single number or a range of the form <min>:<max>. With a single number, a warning or critical status is returned if the value (the number of processes, by default) exceeds that number. With a range, the corresponding status is returned if the value is lower than <min> or higher than <max>. The sample commands below monitor the total number of processes and the number of processes running a specific command. The second command can be used, for example, to check that a specific server is running and has not created too many processes. In that case, the warning and critical ranges should start at 1, so that a count of zero processes is reported as a problem.

  define command
  {
    command_name  check_procs_num
    command_line  $USER1$/check_procs -m PROCS -w $ARG1$ -c $ARG2$
  }
  define command
  {
    command_name  check_procs_cmd
    command_line  $USER1$/check_procs -C $ARG1$ -w $ARG2$ -c $ARG3$
  }
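
The two service definitions below sketch how such commands might be applied. They assume that check_procs_cmd receives the command name as $ARG1$, the warning range as $ARG2$, and the critical range as $ARG3$. The host name localhost, the generic-service template, the sshd process name, and all limits are hypothetical values picked for illustration.

```cfg
# Total number of processes: warning above 250, critical above 400.
define service {
    use                  generic-service   ; assumed template
    host_name            localhost         ; hypothetical host name
    service_description  Total Processes
    check_command        check_procs_num!250!400
}

# Processes named sshd: warning outside 1..10, critical outside 1..20.
# Both ranges start at 1, so zero running processes also triggers an alert.
define service {
    use                  generic-service   ; assumed template
    host_name            localhost         ; hypothetical host name
    service_description  SSH Daemon Processes
    check_command        check_procs_cmd!sshd!1:10!1:20
}
```

The first service uses single numbers, which only set an upper bound. The second uses ranges, which also catch the case where the server process has stopped entirely.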

Monitoring Logged-in Users

It is also possible to use Nagios to monitor the number of users currently logged in to a particular machine. The syntax is very simple, and there are no options other than the warning and critical limits:

check_users -w limit -c limit

A command definition that takes the warning and critical limits as arguments is as follows:

  define command
  {
    command_name  check_users
    command_line  $USER1$/check_users -w $ARG1$ -c $ARG2$
  }
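
A service built on this command could look like the sketch below. As before, the host name localhost, the generic-service template, and the limits of 5 and 10 users are assumptions made for the example, not recommended values.

```cfg
# Hypothetical service: warning above 5 logged-in users, critical above 10.
define service {
    use                  generic-service   ; assumed template
    host_name            localhost         ; hypothetical host name
    service_description  Logged-in Users
    check_command        check_users!5!10
}
```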