The original problem was https://github.com/monitoring-plugins/monitoring-plugins/pull/1705
where the performance data output of check_swap did not conform to
the parser logic of a monitoring system (which decided to go for
"correct" SI or IEC units).
The PR was accompanied by a change to byte values in the performance
data, which broke the _perfdata_ helper function, since it could not
handle values of this size.
The fix for this was to use _fperfdata_, which could, but would
use float values.
I didn't like that (since all values here are discrete), and this
is my proposal for a fix for the problem.
It introduces some helper functions which now explicitly work
with (u)int64_t, including a special version of the _perfdata_ helper.
In the process of introducing this to check_swap, I stumbled over
several sections of the check_swap code which I found problematic.
Therefore I tried to simplify the code and make it more readable
and less redundant.
I am kinda sorry about this, but sincerely hope my changes can
be helpful.
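To illustrate the idea, here is a minimal sketch of an integer-only
perfdata formatter. The function names are hypothetical and are not the
helpers added by this change; the output follows the usual
'label'=value[UOM];warn;crit;min;max layout. The second function shows
why a detour through double is lossy for large byte counts.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not the real monitoring-plugins API): formats one
 * perfdata item using only 64-bit unsigned integers.
 * Returns the number of characters snprintf wanted to write. */
int perfdata_uint64_sketch(char *buf, size_t size, const char *label,
                           uint64_t value, const char *uom,
                           uint64_t warn, uint64_t crit,
                           uint64_t min, uint64_t max)
{
    return snprintf(buf, size,
                    "'%s'=%" PRIu64 "%s;%" PRIu64 ";%" PRIu64
                    ";%" PRIu64 ";%" PRIu64,
                    label, value, uom, warn, crit, min, max);
}

/* Returns 1 if the value is unchanged after a round trip through double,
 * 0 if precision was lost (possible for values above 2^53). */
int survives_double_roundtrip(uint64_t value)
{
    return (uint64_t)(double)value == value;
}
```

With a value such as 2^53 + 1 the round trip through double changes the
number, while the integer formatter prints it unchanged.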
check_http closes the connection after checking the certificate with -C. This leads to SIGPIPE
errors when the ssl daemon wants to send a response, and the daemon quits, which makes the
subsequent tests fail.
* Set the correct number of tests based on conditionals.
* When running the test as non-root, we would previously check if the
setuid bit is set. This doesn't seem to be needed, so just check if the
binary is executable for the user running the test.
* Use cmp_ok to check if tests succeed rather than counting.
Signed-off-by: Jacob Hansen <jhansen@op5.com>
Set the correct number of tests in SKIP blocks to avoid the error "Bad
plan. You planned 50 tests but ran 55" when run with/without
/usr/bin/faketime and NP_INTERNET_ACCESS=yes/no.
check_curl crashes when a (broken) HTTP server returns an invalid HTTP header with
leading spaces or double colons. This PR adds a fix and a test case for this.
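As an illustration of defensive header handling (not the actual fix in
check_curl), here is a minimal sketch that splits one header line without
crashing on leading spaces or a doubled colon. The function name and its
return convention are assumptions made for this example.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: splits "Name: value" into name and value.
 * Leading whitespace is skipped, the split happens at the FIRST colon,
 * and anything after it (even another colon) is kept as the value.
 * Returns 0 on success, -1 if the line is malformed or does not fit. */
int split_header_line(const char *line, char *name, size_t name_size,
                      char *value, size_t value_size)
{
    while (*line != '\0' && isspace((unsigned char)*line))
        line++;

    const char *colon = strchr(line, ':');
    if (colon == NULL || colon == line)
        return -1; /* no colon at all, or an empty header name */

    size_t name_len = (size_t)(colon - line);
    if (name_len >= name_size)
        return -1;
    memcpy(name, line, name_len);
    name[name_len] = '\0';

    const char *v = colon + 1;
    while (*v != '\0' && isspace((unsigned char)*v))
        v++;

    int needed = snprintf(value, value_size, "%s", v);
    if (needed < 0 || (size_t)needed >= value_size)
        return -1; /* value would have been truncated */
    return 0;
}
```

Malformed lines are rejected with an error code instead of being
dereferenced blindly, so the caller can skip them.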
Signed-off-by: Sven Nierlein <sven@nierlein.de>
As strcpy may overflow the resulting buffer:
flo@p5:~$ /tmp/f/usr/lib/nagios/plugins/check_pgsql -d "$(seq 1 10000)"
*** buffer overflow detected ***: terminated
Aborted
I would propose to change the code like this instead, using snprintf,
which honors the buffer's size and guarantees null termination.
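A minimal sketch of the pattern, assuming a hypothetical helper name
(copy_bounded); it is not the exact code of the patch:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copies src into dst without ever writing past
 * dst_size bytes. dst is always null-terminated when dst_size > 0.
 * Returns 0 if src fit completely, -1 if it was truncated or on error. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    int needed = snprintf(dst, dst_size, "%s", src);
    if (needed < 0 || (size_t)needed >= dst_size)
        return -1;
    return 0;
}
```

Unlike strcpy, an oversized argument ends up truncated and reported to
the caller, who can then print a usage error instead of aborting.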
The certificate used to test expired http checks is too old to be used
with recent ssl libraries and results in:
> SSL routines:SSL_CTX_use_certificate:ee key too small
Unfortunately, the error is only visible when setting $IO::Socket::SSL::DEBUG in
the check_http.t file.
There are different declarations for timeout_interval:
lib/utils_base.c has the definition:
unsigned int timeout_interval = DEFAULT_SOCKET_TIMEOUT;
lib/utils_base.h has the appropriate declaration:
extern unsigned int timeout_interval;
plugins/popen.h has an extra declaration:
extern unsigned int timeout_interval;
This doesn't hurt, but it's a dupe. The one in utils_base.h
should be enough, so remove this one.
plugins/popen.c has a WRONG one:
extern int timeout_interval;
Remove it!
Use #include "utils.h" to get the right one.
This makes the local defines for max/min unnecessary, so
remove them as well.
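A single-file sketch of the intended layout. The comments mark which file
each part would live in; SKETCH_DEFAULT_SOCKET_TIMEOUT (with its value of
10) and remaining_timeout() are assumptions made for illustration only.

```c
/* --- would live in lib/utils_base.h: the one shared declaration --- */
#define SKETCH_DEFAULT_SOCKET_TIMEOUT 10 /* assumed value, illustration only */
extern unsigned int timeout_interval;

/* --- would live in lib/utils_base.c: the single definition --- */
unsigned int timeout_interval = SKETCH_DEFAULT_SOCKET_TIMEOUT;

/* --- would live in plugins/popen.c: no private extern, just use it ---
 * A local "extern int timeout_interval;" here would conflict with the
 * declaration above. Because the header is included, the compiler can
 * reject the mismatch instead of silently accepting a wrong type. */
unsigned int remaining_timeout(unsigned int elapsed)
{
    /* hypothetical consumer of the shared variable */
    return elapsed >= timeout_interval ? 0u : timeout_interval - elapsed;
}
```

Keeping the declaration in one header means a type mismatch becomes a
compile error rather than undefined behavior at link time.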
If _SC_OPEN_MAX is available, maxfd was zero-initialized and never set to the value from sysconf.
This leads to segfaults with "free(): invalid size", introduced by commit 7cafb0e845.
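A minimal sketch of the intended initialization. The helper name and the
fallback value are hypothetical; this is not the exact code of the fix.

```c
#include <unistd.h>

#define SKETCH_FALLBACK_MAXFD 256 /* assumed fallback, illustration only */

/* Hypothetical helper: returns the limit to use when sizing a
 * per-file-descriptor table. The value from sysconf is actually stored,
 * and a positive fallback is used when it is unavailable or indeterminate. */
long open_max_sketch(void)
{
    long maxfd = 0;
#ifdef _SC_OPEN_MAX
    maxfd = sysconf(_SC_OPEN_MAX);
#endif
    if (maxfd <= 0)
        maxfd = SKETCH_FALLBACK_MAXFD;
    return maxfd;
}
```

A table allocated with this size is never zero-length, so indexing it by
file descriptor does not corrupt the heap.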
Signed-off-by: Sven Nierlein <sven@nierlein.de>
The help text says that -H accepts a "unix socket (must be an absolute
path)". Now that actually corresponds to reality.
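A sketch of how such a check could look, assuming a hypothetical helper
(fill_unix_addr) that is not taken from the plugin source: a host
argument is treated as a unix socket only if it is an absolute path that
fits into sun_path.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: fills addr for a unix socket if host is an
 * absolute path. Returns 0 on success, -1 if host is not an absolute
 * path or is too long for sun_path. */
int fill_unix_addr(const char *host, struct sockaddr_un *addr)
{
    if (host == NULL || host[0] != '/')
        return -1;

    size_t len = strlen(host);
    if (len >= sizeof(addr->sun_path))
        return -1;

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path, host, len + 1); /* includes the terminator */
    return 0;
}
```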
Signed-off-by: Robin Sonefors <robin.sonefors@op5.com>
When check_by_ssh runs into a timeout, it simply exits, keeping all child processes running.
Simply adopting the kill loop from runcmd_timeout_alarm_handler() fixes this.
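A rough sketch of such a kill loop, not the actual handler code. The pid
table, the function names, the message text, and the exit code constant
are all hypothetical stand-ins chosen for this example.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_STATE_CRITICAL 2 /* conventional CRITICAL exit code */

/* Hypothetical table of child pids; 0 marks an unused slot. */
static pid_t *tracked_pids = NULL;
static size_t tracked_pids_len = 0;

/* Sends SIGKILL to every tracked child and returns how many signals
 * were delivered successfully. */
size_t kill_tracked_children(const pid_t *pids, size_t len)
{
    size_t killed = 0;
    if (pids == NULL)
        return 0;
    for (size_t i = 0; i < len; i++) {
        if (pids[i] > 0 && kill(pids[i], SIGKILL) == 0)
            killed++;
    }
    return killed;
}

/* Sketch of a SIGALRM handler: report, kill the children, then exit. */
void timeout_alarm_handler_sketch(int signo)
{
    static const char msg[] = "CRITICAL - Plugin timed out\n";
    (void)signo;
    if (write(STDOUT_FILENO, msg, sizeof(msg) - 1) < 0) {
        /* nothing useful can be done about a failed write here */
    }
    kill_tracked_children(tracked_pids, tracked_pids_len);
    _exit(SKETCH_STATE_CRITICAL);
}
```

The handler only uses write, kill and _exit, which are safe to call from
a signal handler.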
Signed-off-by: Sven Nierlein <sven@nierlein.de>
The check_snmp rate tests depend on the exact amount of time spent between the
plugin runs and will fail on busy machines, e.g. the CI servers. Using faketime
mitigates this issue and also removes all the sleeps.
Signed-off-by: Sven Nierlein <sven@nierlein.de>
When SSL is enabled, n is assigned the size of the server's second EHLO
response (I think in bytes), which will usually be significantly higher
than the command passed. As such, no commands are executed and no responses
are checked, which - silently - defeats the desired checks and results in a
success value.
There were 2 variants of calling getTestParameter:
- parameter, description, default value
- parameter, env value, default value, description, scope
The scope was never actually used, and having 2 names for the same value led
to having 2 different entries in the cache file for the same configuration.
This commit removes the variants and simplifies test parameters by only using
the first, 3-parameter variant.