
Soft and Hard States

Nagios works by checking whether a particular host or service is working correctly and storing its status. Because the status of a service can be only one of four possible values, it is crucial that it reflects the actual situation. To avoid reacting to random and temporary problems, Nagios uses soft and hard states to describe the current status of a host or service.

Imagine that an administrator is restarting a web server and this operation makes the web pages unavailable for five seconds. Because such restarts are usually done at night to lower the number of users affected, this is an acceptable period of time. However, a problem might arise if Nagios tries to connect to the server during that window and finds that it is down. If it relied only on a single result, Nagios would trigger an alert that the web server is down. The server would be up and running again in a few seconds, but it could take a couple of minutes for Nagios to find that out.

To handle situations in which a service is down for a very short time, or a test has temporarily failed, soft states were introduced. When the result of a check is unknown, or differs from the previous one, Nagios first treats the new result as a soft state and retests the host or service several times to make sure that the change is persistent. The number of checks is specified in the host or service configuration. Once the additional tests have verified that the new state is permanent, it is considered a hard state.

Each host and service definition specifies the number of retries to be performed before a change is assumed to be permanent. This gives more flexibility over how many failures should be treated as an actual problem instead of a temporary one. Setting the number of checks to one causes all changes to be treated as hard immediately. The following is an illustration of soft and hard state changes, assuming that the number of checks to be performed is set to three:

[Figure: soft and hard state changes]
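The transition from soft to hard states can also be traced in a few lines of code. The following Python sketch is a simplified model of the behavior described above, not the logic Nagios itself implements. The names `SoftHardTracker`, `record`, and `trace` are hypothetical. The field `max_check_attempts` borrows the name of the Nagios directive that sets the number of checks. Treating a recovery from a soft problem as soft, and resetting the attempt counter on every OK result, are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SoftHardTracker:
    """Simplified, illustrative model of soft and hard states (hypothetical)."""

    max_check_attempts: int = 3  # number of checks before a change is hard
    state: str = "OK"            # last recorded status
    state_type: str = "HARD"     # "SOFT" or "HARD"
    attempt: int = 1             # current check attempt

    def record(self, result: str) -> Tuple[str, str, int]:
        """Record one check result and return (state, state_type, attempt)."""
        if result == "OK":
            # Assumption: recovering from a soft problem is itself soft;
            # any other OK result is a hard OK.
            was_soft_problem = self.state != "OK" and self.state_type == "SOFT"
            self.state_type = "SOFT" if was_soft_problem else "HARD"
            self.attempt = 1  # simplification: counter reset on every OK
        else:
            if self.state == "OK":
                # First failing result after a working state.
                self.attempt = 1
            elif self.attempt < self.max_check_attempts:
                # Problem still unconfirmed: count one more retry.
                self.attempt += 1
            # The change becomes hard once enough checks have confirmed it.
            confirmed = self.attempt >= self.max_check_attempts
            self.state_type = "HARD" if confirmed else "SOFT"
        self.state = result
        return (self.state, self.state_type, self.attempt)


def trace(results: List[str], max_check_attempts: int = 3) -> List[Tuple[str, str, int]]:
    """Feed a sequence of check results through a fresh tracker."""
    tracker = SoftHardTracker(max_check_attempts=max_check_attempts)
    return [tracker.record(result) for result in results]


if __name__ == "__main__":
    for row in trace(["OK", "CRITICAL", "CRITICAL", "CRITICAL", "OK"]):
        print(row)
```

With three checks configured, the three consecutive CRITICAL results are reported as soft on attempts one and two and as hard on attempt three. A sequence such as OK, CRITICAL, OK never reaches a hard problem state in this model, and with `max_check_attempts=1` the very first failure is hard.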

This feature makes it possible to ignore short outages of a service. It is also very useful for checks that can occasionally fail even when everything is working correctly. Monitoring devices over SNMP is one example: a single check might fail, but the second or third attempt will usually succeed.
