
Soft and Hard States

Nagios works by checking whether a particular host or service is working correctly and storing its status. Because the status of a service can only be one of four possible values, it is crucial that the stored status reflects the actual situation. To avoid reacting to random and temporary problems, Nagios uses soft and hard states to describe the current status of a host or service.

Imagine that an administrator is restarting a web server, and this operation makes the web pages unavailable for five seconds. Such restarts are usually done at night to lower the number of users affected, so this is an acceptable period of time. However, a problem might arise if Nagios tries to connect to the server during the restart and finds that it is down. If it relied on a single result, Nagios would trigger an alert that the web server is down. The server would actually be up and running again in a few seconds, but it could take Nagios a couple of minutes to find that out.

To handle situations where a service is down for a very short time, or a test has temporarily failed, soft states were introduced. When the status of a check is unknown, or differs from the previous one, Nagios treats the new result as a soft state and retests the host or service several times to make sure that the change is persistent. The number of checks is specified in the host or service configuration. Once the additional tests have verified that the new state is permanent, it is considered a hard state.
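As a rough sketch of where this setting lives, a service definition might look like the fragment below. The `max_check_attempts`, `check_interval`, and `retry_interval` directives are standard Nagios object directives; the host name, service description, and the use of a `check_http` command are hypothetical and only serve as an illustration. Other directives that a complete definition needs (usually inherited from a template) are omitted.

```
define service {
    host_name            webserver01   ; hypothetical host name
    service_description  HTTP          ; hypothetical service name
    check_command        check_http    ; assumes a check_http command is defined
    max_check_attempts   3             ; checks needed before a problem becomes hard
    check_interval       5             ; time units between regular checks
    retry_interval       1             ; time units between rechecks of a soft state
}
```

With values like these, a failed check would be retried at the shorter retry interval, so a brief outage is either confirmed or dismissed quickly instead of waiting for the next regular check.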

Each host and service definition specifies the number of retries to be performed before a change is assumed to be permanent. This gives flexibility over how many failures should be treated as an actual problem rather than a temporary one. Setting the number of checks to one causes all changes to be treated as hard immediately. The following is an illustration of soft and hard state changes, assuming that the number of checks to be performed is set to three:

[Figure: Soft and Hard States]
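The same sequence can be traced with a small simulation. This is a minimal sketch of the idea only, not Nagios's actual check logic: the function name `trace_states` is hypothetical, it simply counts consecutive non-OK results, and its treatment of recoveries (an OK result that arrives before a problem is confirmed is labeled soft) is an assumption of the sketch.

```python
def trace_states(results, max_check_attempts=3):
    """Return a (result, state_type, attempt) tuple for each check result.

    Simplified model: a problem is SOFT until max_check_attempts
    consecutive non-OK results have been seen, then it becomes HARD.
    """
    trace = []
    failures = 0  # consecutive non-OK results seen so far
    for result in results:
        if result == "OK":
            # Recovering from an unconfirmed (soft) problem is itself soft.
            state_type = "SOFT" if 0 < failures < max_check_attempts else "HARD"
            failures = 0
            attempt = 1
        else:
            # Cap the counter so a long outage stays at the final attempt.
            failures = min(failures + 1, max_check_attempts)
            attempt = failures
            state_type = "HARD" if failures >= max_check_attempts else "SOFT"
        trace.append((result, state_type, attempt))
    return trace


if __name__ == "__main__":
    # An outage that lasts long enough to be confirmed as a hard state.
    for step in trace_states(["OK", "CRITICAL", "CRITICAL", "CRITICAL", "OK"]):
        print(step)
```

Calling `trace_states` with `max_check_attempts=1` marks the first failure as hard, which corresponds to the case where every change is treated as hard immediately.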

This feature allows short outages of a service to be ignored. It is also very useful for checks that can periodically fail even when everything is working correctly. Monitoring devices over SNMP is one such example: a single check might fail, but the check will eventually succeed on the second or third attempt.
