
  • Monitoring with Opsview
  • Alan Wijntje
  • 1372 words
  • 2021-07-21 18:06:22

Configuring service checks and service groups

Now that we have seen how to add hosts and how to create and assign host templates, we need to look at the service checks that do the actual monitoring.

Creating service checks

To create a service check (or edit or clone an existing one), go to Settings | Advanced | Service Checks, which lists all the available checks.

Let's create a new check by clicking on the add icon, which will open the New Service Check page as shown in the following screenshot:

[Screenshot: the New Service Check page]

After naming and describing the check, the Service Group field is the most important one to look at. You can either select one that is already defined or create a new one. Note that the service group is used only within the host template's monitors tab for ordering and finding the checks easily.

The next important selection is the type of check (similar to Nagios). The following screenshot shows the three basic types of checks from which we can choose:

[Screenshot: the three check types (Active Plugin, Passive, and SNMP Polling)]

The type simply tells Opsview how it is supposed to operate the check by either executing it (Active Plugin), or by waiting for another program to send in its status (Passive).

SNMP Polling is a special type that allows you to create SNMP-based service checks in a fast and easy manner.

Active Plugin

When we select Active Plugin, we need to select a plugin from the Plugin dropdown and possibly add the arguments to be used (in the Arguments field) by the plugin on execution.

The following screenshot shows the Plugin and Arguments section of the New Service Check screen.

[Screenshot: the Plugin and Arguments section of the New Service Check screen]

To understand what a plugin does, simply select the plugin you want from the drop-down menu and click on Show Plugin Help to display the help page.

Most plugins use the arguments entered in the Arguments field to determine which information to retrieve and which thresholds to apply to the results.

As Opsview uses Nagios as its engine, it allows the use of macros to make configurations more generic. One of the most common macros is $HOSTADDRESS$, which represents the host's primary hostname / IP (we discussed this in the Hosts section). Click on Show Macro Help to get a list of available macros.
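To illustrate the idea, here is a minimal sketch of how a macro in the Arguments field could be expanded before a plugin is executed. It is not Opsview's or Nagios's own implementation; the function name expand_macros, the macro table, and the example address are assumptions made for this illustration.

```python
import re

# Hypothetical macro table; in a real system the values come from the host configuration.
MACROS = {"HOSTADDRESS": "192.0.2.10", "HOSTNAME": "db01"}


def expand_macros(arguments, macros):
    """Replace each $NAME$ token with its value; unknown macros are left untouched."""
    def substitute(match):
        name = match.group(1)
        return macros.get(name, match.group(0))

    return re.sub(r"\$([A-Z0-9_]+)\$", substitute, arguments)


# The same argument string can be reused for every host the check is assigned to.
print(expand_macros("-H $HOSTADDRESS$ -p 3306 -w 2 -c 5", MACROS))
```

Because the host-specific value is filled in at execution time, one service check definition can serve many hosts.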

SNMP polling

Even for those familiar with SNMP, using it to retrieve information can be a challenge, because you need to know the exact Object Identifier (OID) to retrieve.

Opsview allows you to easily create SNMP checks by letting you scan an example host and returning all possible OIDs ready for use.

To use this feature, you will first need to configure your host to allow SNMP from the Opsview system. Then you need to configure SNMP in the host configuration. Test the connection using Test SNMP connection.

The following screenshot is an example of a host configured to use SNMP with the community set to public, taken after Test SNMP connection was run successfully:

[Screenshot: a host configured for SNMP with the community set to public]

When creating a new SNMP service, enter the name of your host in the Example Host field and click on SNMP Walk to scan the host.

The following screenshot shows the result of an SNMP Walk on a network device:

[Screenshot: the result of an SNMP Walk on a network device]

Once the SNMP Walk is completed, simply enter the OID you are interested in, add a label, and set some thresholds by using the numeric or string comparisons.
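The threshold step can be pictured with a small sketch. It only shows the general idea of comparing a polled value against a warning and a critical threshold; the function name evaluate and the set of comparison operators are assumptions, not the exact options Opsview offers.

```python
import operator

# Hypothetical subset of comparisons; works for numbers and (with "==") for strings.
COMPARISONS = {">": operator.gt, "<": operator.lt, "==": operator.eq}


def evaluate(value, comparison, warning, critical):
    """Return the state for a polled value, testing the critical threshold first."""
    compare = COMPARISONS[comparison]
    if compare(value, critical):
        return "CRITICAL"
    if compare(value, warning):
        return "WARNING"
    return "OK"


# Numeric comparison: interface utilisation in percent.
print(evaluate(85, ">", 80, 90))
# String comparison: an operational status returned as text.
print(evaluate("down", "==", "testing", "down"))
```

The critical threshold is tested first so that a value exceeding both thresholds is reported with the more severe state.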

Passive

A passive check is not executed by Opsview, but is expected to deliver results to the Nagios engine using, most commonly, the NSCA interface. An example would be a scheduled backup reporting its success or failure to Opsview or an application that sends out warnings and so on.
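As a rough sketch, a backup script could build its result in the tab-separated form that the NSCA client (send_nsca) reads by default: host name, service description, return code, and plugin output. The host and service names below are hypothetical, and the function format_passive_result is an assumption made for this example.

```python
# Standard Nagios return codes.
STATES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def format_passive_result(host, service, state, output):
    """Build one tab-separated passive result line, terminated by a newline."""
    return f"{host}\t{service}\t{STATES[state]}\t{output}\n"


# Hypothetical host and service names; they must match the configured service check.
line = format_passive_result("backup01", "Nightly backup", "OK", "Backup finished in 42 minutes")
print(repr(line))
```

The resulting line would then be handed to the NSCA client, which forwards it to the monitoring server.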

Using dependencies

Dependencies are another important part of the service check configuration. These allow us to build relations between checks running on the same host.

As an example, let's consider a database server running MySQL and two service checks to monitor the server. The first check monitors if the MySQL process is running, while the second check verifies whether TCP port 3306 is accepting connections.

If the MySQL process stops running, then we know our check on TCP port 3306 will also fail (but not vice versa). So, we can consider TCP port 3306 to be dependent on the MySQL process.

The following screenshot shows the dependency for the TCP port 3306.

[Screenshot: the dependency configured for the TCP port 3306 check]

When Opsview detects a dependency failure, instead of marking a service as CRITICAL it will mark the service as UNKNOWN, with the following description: Dependency failure.

The UNKNOWN state is particularly useful for notifications. Filtering out UNKNOWN prevents the system administrator from being flooded with notifications, so that only the most relevant ones are sent. This allows us to quickly scan a host to find the main issue (root cause analysis). All other possible states are discussed in the Configuring notifications section in Chapter 3, Advanced Configuration.

It is highly recommended to use dependencies when you create your checks, as they will help you narrow down issues faster and more efficiently.
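The MySQL example can be expressed as a short sketch. It assumes that any parent check that is not OK counts as a dependency failure, which is a simplification; the function name effective_state is hypothetical and this is not Opsview's internal logic.

```python
def effective_state(own_state, parent_states):
    """Return (state, description) for a check, taking its parent checks into account."""
    # Assumption: a parent in any state other than OK counts as a dependency failure.
    if any(state != "OK" for state in parent_states):
        return "UNKNOWN", "Dependency failure"
    return own_state, ""


# The MySQL process check is CRITICAL, so the TCP port 3306 check is reported as UNKNOWN.
print(effective_state("CRITICAL", ["CRITICAL"]))
# The process is running but the port does not answer: the port check keeps its own state.
print(effective_state("CRITICAL", ["OK"]))
```

With UNKNOWN filtered out of notifications, only the failing MySQL process check would alert the administrator in the first case.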

Adding plugins to the system

Even though Opsview comes with a large number of plugins already installed, you might want to add some new ones. This could be a plugin you created yourself, one you found on a website, or a plugin provided to you by suppliers or vendors. A great source of plugins is the Nagios Exchange (http://exchange.nagios.org).

The following steps explain how to add a new plugin and will make sure the plugin operates correctly in Opsview.

  1. Download the plugin to your Opsview system using Wget or scp.
  2. Make sure the plugin runs. Log in to the Opsview system via SSH and run su - nagios to become the Nagios user before you start testing the plugin.

    Another thing to check is whether the plugin has a help function. This is used by Opsview to fill the help page in the service check screen. It can be checked by running the plugin with the -h option.

    Note

    When testing plugins, always perform this as the Nagios user and from the Nagios user's home directory. This will prevent things from going wrong later on.

  3. Once tested, simply copy the plugin (as the Nagios user, so the plugin will have the correct permissions) to the libexec directory under /usr/local/nagios of the Opsview system.

    Opsview keeps track of the libexec directory, but not the subdirectories. When it detects a change, it will rescan the directory for any new plugins and add them to the database, after which you can select the plugin from the drop-down menu.

    This usually takes less than a minute; running the populate_db.pl command as the Nagios user will also update the system.

Any plugin that was created in accordance with the Nagios developer's guidelines (https://www.nagios-plugins.org/doc/guidelines.html) will work in Opsview.
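For readers who want to write their own plugin, the following is a minimal sketch of what such a plugin could look like. It follows the common conventions (exit codes 0 to 3, a single status line, optional performance data after the | character, and a -h help option supplied by argparse). The load average check, the default thresholds, and all names are assumptions chosen for this example, and os.getloadavg is only available on Unix-like systems.

```python
import argparse
import os

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(load, warning, critical):
    """Return (exit_code, status_line) for a 1-minute load value."""
    if load >= critical:
        code = CRITICAL
    elif load >= warning:
        code = WARNING
    else:
        code = OK
    # Text before '|' is for humans; text after it is performance data.
    line = (f"LOAD {LABELS[code]} - load1 is {load:.2f} "
            f"| load1={load:.2f};{warning};{critical};0")
    return code, line


def main(argv):
    """Parse the arguments, measure the load, print one line, and return the exit code."""
    parser = argparse.ArgumentParser(
        description="Check the 1-minute load average (example plugin).")
    parser.add_argument("-w", "--warning", type=float, default=4.0,
                        help="warning threshold")
    parser.add_argument("-c", "--critical", type=float, default=8.0,
                        help="critical threshold")
    args = parser.parse_args(argv)
    try:
        load = os.getloadavg()[0]
    except (OSError, AttributeError):
        print("LOAD UNKNOWN - load average is not available")
        return UNKNOWN
    code, line = evaluate(load, args.warning, args.critical)
    print(line)
    return code

# In a plugin file, the last line would be: sys.exit(main(sys.argv[1:]))
# (after adding "import sys" at the top).
```

The exit code is what the monitoring engine uses to decide the state; the printed line is what appears in the web interface.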

Handling performance data

Performance data can give you valuable, long-term views of how a service is performing. A simple example is the response time of a web server: if it is too slow, customers will leave our website (and visit a competitor instead), so keeping track of it is paramount.

Any plugin that returns performance data (not all do) will be detected and performance graphs will be automatically made. As shown in the following screenshot, a graph icon will be added to the service check in the monitoring screen to show that performance data is available:

[Screenshot: the graph icon next to a service check in the monitoring screen]

Note

Please note that the graph icon becomes available only after Opsview has received at least one result containing performance data and has been reloaded after receiving it. The reload updates the web interface.
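To make clear what "performance data" means in practice, here is a small sketch that extracts it from a plugin's output line. By convention the data follows a | character as label=value pairs, optionally with a unit and thresholds. The parser below is a simplification written for this example (it ignores quoted labels that contain spaces) and is not the code Opsview itself uses; the sample output line is made up.

```python
import re


def parse_perfdata(plugin_output):
    """Return {label: value} parsed from the text after the '|' separator."""
    _, separator, perfdata = plugin_output.partition("|")
    if not separator:
        # Not all plugins return performance data.
        return {}
    metrics = {}
    for item in perfdata.split():
        label, _, fields = item.partition("=")
        # Take the leading number; a unit such as 's' or 'B' and thresholds may follow.
        match = re.match(r"-?\d+(?:\.\d+)?", fields)
        if match:
            metrics[label.strip("'")] = float(match.group(0))
    return metrics


print(parse_perfdata("HTTP OK - 0.120 second response time | time=0.120s;1;2;0 size=5120B"))
```

Each label found this way corresponds to one series that can be graphed over time.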

Clicking on the icon will take you to a graph like the one shown in the following screenshot, which shows the memory utilization of the Opsview host.

[Screenshot: a graph of the memory utilization of the Opsview host]

The image is dynamic, so you can select a time period from the timeline at the bottom, or you can zoom in to a specific time period in the graph.

You can add additional sources to be incorporated into the graph. The graph resets to its default when you leave the graph page, so have a look at the various options and the export function.
