%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%
%% This default.mgp is "TrueType fonts" oriented.
%% First, you should create "~/.mgprc" whose contents are:
%% tfdir "/path/to/truetype/fonts"
%%
%% To visualize English, install "times.ttf", "arial.ttf", and "cour.ttf"
%% into the "tfdir" directory above:
%% http://microsoft.com/typography/fontpack/default.htm
%%
%% To visualize Japanese, install "MSMINCHO.ttf" and
%% "watanabenabe-mincho.ttf" into the "tfdir" directory above:
%% http://www.mew.org/mgp/xtt-fonts_0.19981020-3.tar.gz
%%
%deffont "standard" xfont "helvetica-medium-r", tfont "standard.ttf", tmfont "hoso6.ttf"
%deffont "thick" xfont "helvetica-bold-r", tfont "thick.ttf", tmfont "hoso6.ttf"
%deffont "typewriter" xfont "courier-medium-r", tfont "typewriter.ttf", tmfont "hoso6.ttf"
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%
%% Default settings for each line number.
%%
%default 1 area 90 90, leftfill, size 2, fore "gray20", back "white", font "standard", hgap 0
%default 2 size 7, vgap 10, prefix " ", ccolor "black"
%default 3 size 2, bar "gray70", vgap 10
%default 4 size 5, fore "gray20", vgap 30, prefix " ", font "standard"
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%
%% Default settings that are applied to TAB-indented lines.
%%
%tab 1 size 5, vgap 40, prefix " ", icon box "green" 50
%tab 2 size 4, vgap 40, prefix " ", icon arc "yellow" 50
%tab 3 size 3, vgap 40, prefix " ", icon delta3 "white" 40
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%size 6.5, font "standard", back "white", ccolor "black"





%center, fore "Blue", font "standard", hgap 60, size 6.5
An introduction to Nagios
%bar "skyblue" 6 15 70
%font "standard", hgap 0
%image "logofullsize.png"


%size 5, fore "darkblue"
Andrew Pollock
%size 4.5
me@andrew.net.au

%size 1.5
Nagios and Nagios logo are registered trademarks of Ethan Galstad

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad 0 0 256 0 0 "white" "blue"

Overview

Theory of operation
Configuration files
Plugins

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad 0 0 256 0 0 "white" "blue"

Theory of operation

Central monitoring server
Hard and soft service statuses
Service check scheduling
Dependencies



%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page

Nagios just schedules plugin execution

%CENTER
%IMAGE "plugintheory.png"

It has no real idea of what you're monitoring, just the results of the plugin
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad 0 0 256 0 0 "white" "blue"

State types

The current state of a service or host is determined by the status of its check and the type of state it is in.
Soft states
Check result is non-OK and the check has not yet been rerun the configured number of times (max_check_attempts) needed to enter a hard state
If the check recovers before reaching a hard state, it is a soft recovery
Completely optional (set max_check_attempts to 1)
Hard states
Repeated (max_check_attempts) checks have returned non-OK
If the check recovers, it is a hard recovery
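%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad 0 0 256 0 0 "white" "blue"

State types (sketch)

%size 3
The soft/hard rule on the previous slide can be sketched as a small shell function. This is a simplified illustration, not Nagios's actual implementation: state_types is a hypothetical name, and every non-OK return code is treated the same way.

%size 2
```sh
state_types() {
    # Hypothetical helper, for illustration only.
    # $1 = max_check_attempts; remaining arguments = plugin return codes,
    # oldest first. Prints one state-type label per check result.
    max=$1
    shift
    failures=0
    for rc in "$@"; do
        if [ "$rc" -eq 0 ]; then
            if [ "$failures" -eq 0 ]; then
                echo "OK"
            elif [ "$failures" -lt "$max" ]; then
                echo "SOFT RECOVERY"
            else
                echo "HARD RECOVERY"
            fi
            failures=0
        else
            failures=$((failures + 1))
            if [ "$failures" -lt "$max" ]; then
                echo "SOFT"
            else
                echo "HARD"
            fi
        fi
    done
}
```
%size 3
For example, "state_types 3 0 2 2 0" prints OK, SOFT, SOFT and SOFT RECOVERY, one per line.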

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad 0 0 256 0 0 "white" "blue"

Scheduling

Service checks can be individually scheduled
Frequency of execution
Frequency and quantity of rechecks on failure

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad 0 0 256 0 0 "white" "blue"

Dependencies

Hosts have a concept of parents
If a host goes down, any other hosts that have it as a parent are considered unreachable and therefore in an unknown state.

Configuring the hierarchy properly allows a reasonable logical diagram to be constructed automatically
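%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%bgrad 0 0 256 0 0 "white" "blue"

Dependencies (sketch)

%size 3
A minimal sketch of the parent relationship, assuming a hypothetical router named router1 that sits between the Nagios server and a host named bighost1. The "..." stands for the remaining directives, which are left out here.

%size 2
```cfg
define host{
host_name router1 ; hypothetical upstream device
...
}

define host{
host_name bighost1 ; hypothetical host behind router1
parents router1
...
}
```
%size 3
If router1 goes down, bighost1 is reported as unreachable rather than down.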

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page

Hierarchy example

%CENTER
%IMAGE "physical-network.png"

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

Configuration files

nagios.cfg
Main configuration file
"Includes" additional configuration files to provide logical separation
hosts.cfg
Defines your hosts
services.cfg
Defines the services checked
commands.cfg
Defines what a command actually is
resource.cfg
Defines macros/variables you can use in other config files

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

Configuration files

You can end up with too much logical separation.

There's no reason not to merge hosts.cfg and services.cfg if it makes sense in your environment.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

A walk through nagios.cfg

log_file
Specifies where the main logfile should go
cfg_file
"Includes" additional files as configuration files
cfg_file=/opt/local/nagios/etc/hosts.cfg
cfg_file=/opt/local/nagios/etc/services.cfg
cfg_file=/opt/local/nagios/etc/commands.cfg
This is where you can merge files if it all gets too distributed
cfg_dir
Reads every .cfg file in the named directory as a configuration file, in the same way as cfg_file
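%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

cfg_file and cfg_dir together (sketch)

%size 3
A minimal sketch of how the two directives might be mixed in nagios.cfg. The commands.cfg path is the one used on the previous slide; the hosts and services directory names are hypothetical.

%size 2
```cfg
cfg_file=/opt/local/nagios/etc/commands.cfg
cfg_dir=/opt/local/nagios/etc/hosts
cfg_dir=/opt/local/nagios/etc/services
```
%size 3
With this layout, adding a host means dropping one more .cfg file into the hosts directory rather than editing nagios.cfg.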
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

nagios.cfg (continued)

resource_file
Defines macros/variables
Macros are not expanded when the configuration is viewed via the web interface
Allows you to protect sensitive values like passwords
Can be used multiple times to specify multiple files
host_check_timeout & service_check_timeout
Global setting for the maximum number of seconds a plugin execution can run before it is considered to have timed out
Can be adjusted on a per-service basis

Plenty of other options you're not likely to need to tweak. See the documentation for more information.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

Object configuration files

What is object data?
Services
Hosts
Host groups
Contacts
Contact groups
Commands
Time Periods
Service Escalations
Service Dependencies
Host Escalations
Host Dependencies
Hostgroup Escalations
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

Object configuration files (continued)

Has a template based system that supports inheritance
Makes "carbon-copy" services easy to maintain
Reduces the size and complexity of your configuration files

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

Host Definitions

Used to define a physical device on the network
A host has services
%size 2
define host{
%fore "red"
host_name host_name
alias alias
address address
%fore "black"
parents host_names
check_command command_name
%fore "red"
max_check_attempts #
%fore "black"
checks_enabled [0/1]
event_handler command_name
event_handler_enabled [0/1]
low_flap_threshold #
high_flap_threshold #
flap_detection_enabled [0/1]
process_perf_data [0/1]
retain_status_information [0/1]
retain_nonstatus_information [0/1]
%fore "red"
notification_interval #
notification_period timeperiod_name
notification_options [d,u,r]
%fore "black"
notifications_enabled [0/1]
stalking_options [o,d,u]
}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

Service Definitions

A service runs on a host
Can be a network service or any other metric you want to associate with a host

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

Service Definitions (continued)

%size 2
define service{
%fore "red"
host_name host_name
service_description service_description
%fore "black"
is_volatile [0/1]
%fore "red"
check_command command_name
max_check_attempts #
normal_check_interval #
retry_check_interval #
%fore "black"
active_checks_enabled [0/1]
passive_checks_enabled [0/1]
%fore "red"
check_period timeperiod_name
%fore "black"
parallelize_check [0/1]
obsess_over_service [0/1]
check_freshness [0/1]
freshness_threshold #
event_handler command_name
event_handler_enabled [0/1]
low_flap_threshold #
high_flap_threshold #
flap_detection_enabled [0/1]
process_perf_data [0/1]
retain_status_information [0/1]
retain_nonstatus_information [0/1]
%fore "red"
notification_interval #
notification_period timeperiod_name
notification_options [w,u,c,r]
%fore "black"
notifications_enabled [0/1]
%fore "red"
contact_groups contact_groups
%fore "black"
stalking_options [o,w,u,c]
}
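%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

Service definition (sketch)

%size 3
A filled-in sketch that uses only the directives highlighted in red on the previous slide. Every value is an assumption chosen for illustration: bighost1, the 24x7 time period, the admins contact group and a check_ping command are all presumed to be defined elsewhere.

%size 2
```cfg
define service{
host_name bighost1 ; hypothetical host
service_description PING
check_command check_ping!100.0,20%!500.0,60%
max_check_attempts 3
normal_check_interval 5
retry_check_interval 1
check_period 24x7 ; hypothetical time period
notification_interval 120
notification_period 24x7
notification_options w,u,c,r
contact_groups admins ; hypothetical contact group
}
```
%size 3
The "!" characters in check_command separate the command name from the arguments passed to it.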

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

Command Definitions

Maps a command name within Nagios to an operating system command to execute
Can use macros defined in resource.cfg as well as special context-sensitive variables

%size 2
define command{
%fore "red"
command_name command_name
command_line command_line
%fore "black"
}
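%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

Command definition (sketch)

%size 3
A sketch of a command that wraps the standard check_ping plugin. It assumes resource.cfg sets $USER1$ to the directory holding the plugins; that mapping is an assumption about your installation, not something Nagios guarantees.

%size 2
```cfg
define command{
command_name check_ping ; name that services refer to
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
}
```
%size 3
$HOSTADDRESS$ is filled in from the host being checked, while $ARG1$ and $ARG2$ come from the arguments supplied in a service's check_command.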

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

Templates allow inheritance

They simplify what you need to specify in an inherited object
They allow default values to be specified once, globally, and overridden on a case-by-case basis if required

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

Inheritance example

%size 2
define host{
check_command check-host-alive
notification_options d,u,r
max_check_attempts 5
%fore "red"
name generichosttemplate
register 0
%fore "black"
}

define host{
host_name bighost1
address 192.168.1.3
%fore "red"
use generichosttemplate
%fore "black"
}

define host{
host_name bighost2
address 192.168.1.4
%fore "red"
use generichosttemplate
%fore "black"
}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

Saving time

Same service, multiple hosts:

%size 2
define service{
%fore "red"
host_name HOST1,HOST2,HOST3
service_description SOMESERVICEICOOKEDUP
%fore "black"
...
}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

Service checks - active vs passive

Active checks
Scheduled and executed by Nagios
Passive checks
Run externally to Nagios and submit results back
Can use the freshness threshold to ensure results are being submitted frequently enough

Generally checks are active, but under certain circumstances you have to use passive checks (e.g. Nagios can't reach the service itself, but the service can report its own status to Nagios)
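%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

Submitting a passive result (sketch)

%size 3
Passive results reach Nagios as PROCESS_SERVICE_CHECK_RESULT lines written to its external command file. The function below only formats such a line; passive_result is a hypothetical name, and the host and service names in the usage note are examples.

%size 2
```sh
passive_result() {
    # Hypothetical helper, for illustration only.
    # $1 = UNIX timestamp, $2 = host_name, $3 = service_description,
    # $4 = plugin return code (0-3), $5 = one line of plugin output
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' "$1" "$2" "$3" "$4" "$5"
}
```
%size 3
To submit a result, append the output of something like passive_result "$(date +%s)" bighost1 BACKUP 0 "Backup completed" to the file named by command_file in nagios.cfg.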

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

Plugins

%size 4
Provide the actual monitoring power of Nagios
%size 4
Can do whatever you want
%size 4
Write them in whatever takes your fancy
%size 4
All they have to do is return one line of output (on STDOUT) and a return code ($?)
%size 4
0 = OK
%size 4
1 = WARNING
%size 4
2 = CRITICAL
%size 4
3 (or anything else) = UNKNOWN
%size 4
Nagios comes with a whole stack of standard plugins that will check most of the things you'd want to check
%size 4
Normally run by the nagios user, so run unprivileged
%size 4
Run locally, directly by the nagios process, or remotely, usually via the check_by_ssh plugin

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

Plugin examples

%size 4
\#!/bin/sh

echo "All is quiet on the western front"
exit 0

\#!/usr/bin/perl

print "Everything's busted!\n";
exit(2);
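%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

Plugin examples (threshold sketch)

%size 3
A slightly more realistic sketch that maps a measured value to the four return codes. check_threshold is a hypothetical function, not one of the standard plugins, and how the value is measured is left to the caller.

%size 2
```sh
check_threshold() {
    # Hypothetical helper, for illustration only.
    # $1 = measured integer value, $2 = warning threshold, $3 = critical threshold.
    # Prints one line of output and returns the matching plugin return code.
    case "$1" in
        ''|*[!0-9]*) echo "UNKNOWN - '$1' is not a non-negative integer"; return 3 ;;
    esac
    if [ "$1" -ge "$3" ]; then
        echo "CRITICAL - value is $1 (critical at $3)"; return 2
    elif [ "$1" -ge "$2" ]; then
        echo "WARNING - value is $1 (warning at $2)"; return 1
    fi
    echo "OK - value is $1"; return 0
}
```
%size 3
A real plugin script would finish by calling the function with its own arguments and exiting with the function's return code.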

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%PAGE
%bgrad 0 0 256 0 0 "white" "blue"

Further reading

http://www.nagios.org
http://nagiosplug.sourceforge.net

Questions?