Icinga2 - Monitoring engine
Julian Brost 3302c9b0a8 Fix that NewClientHandler() could hang indefinitely, preventing new connection attempts
There is a race condition: when the `async_write()`/`async_flush()` operation
for the `icinga::Hello` message fails (connection reset by peer, for example)
around the same time the connect timeout fires and calls `cancel()` on the
stream, the following call to `async_shutdown()` may block indefinitely. If
that happens, the endpoint remains in the connecting state and no new
connection attempts are initiated.

This commit fixes the issue by removing the `Defer` containing the
`async_shutdown()`. The purpose of `async_shutdown()` is to signal a clean
termination of the connection to the peer, which really isn't something that
makes sense to do in a `Defer` block that is also executed in case of errors.
For the one situation where doing a clean TLS shutdown makes some sense
(closing anonymous client connections), a call to GracefulShutdown() is added
to that specific code path.
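
The control flow described above can be sketched roughly as follows. This is
not the actual C++ change: `fakeStream`, `handleNewClient`, `errReset` and the
`closeAnonymous` flag are hypothetical names used only for illustration, and
the stream merely records which operations were requested. The point is that
the clean shutdown happens on one explicit success path instead of in a
deferred block that also runs after errors.

```go
package main

import (
	"errors"
	"fmt"
)

// errReset stands in for the "connection reset by peer" failure (hypothetical).
var errReset = errors.New("connection reset by peer")

// fakeStream is a hypothetical stand-in for the TLS stream. It only records
// which operations were requested so the control flow can be inspected.
type fakeStream struct {
	calls []string
}

// sendHello pretends to write the hello message and returns the given error.
func (s *fakeStream) sendHello(err error) error {
	s.calls = append(s.calls, "sendHello")
	return err
}

// gracefulShutdown pretends to perform a clean TLS shutdown.
func (s *fakeStream) gracefulShutdown() {
	s.calls = append(s.calls, "gracefulShutdown")
}

// handleNewClient sketches the flow after the fix: no deferred shutdown, and
// a clean shutdown only when an anonymous client connection is being closed.
func handleNewClient(s *fakeStream, helloErr error, closeAnonymous bool) error {
	if err := s.sendHello(helloErr); err != nil {
		// Error path: return without attempting a clean TLS shutdown.
		return err
	}
	if closeAnonymous {
		// The one path where signalling a clean termination makes sense.
		s.gracefulShutdown()
	}
	return nil
}

func main() {
	failed := &fakeStream{}
	err := handleNewClient(failed, errReset, true)
	fmt.Println(failed.calls, err)

	anon := &fakeStream{}
	err = handleNewClient(anon, nil, true)
	fmt.Println(anon.calls, err)
}
```

With a deferred shutdown, the first call in `main` would also have requested a
shutdown on a stream that had already failed, which is the situation the
commit avoids.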

A large part of the change is just changing the indentation of the code, given
that a now unnecessary `try`/`catch` block is removed.

The following Go code creates a TLS server that can be used to demonstrate the
issue. Note that given that a race condition is involved, this is not reliable
and the sleep duration may need some fine-tuning. For this to work,
`ApiListener.tls_handshake_timeout` needs to be set to a large-enough value
like 60s to disable the timeout for `async_handshake()` itself so that the
overall connect timeout is the one that fires. However, changing the timeout is
not a prerequisite for the problem, it just makes it easier to reproduce. The
error can also happen with the default timeouts if the TCP connect takes long
enough so that the handshake is started late enough that its timeout expires
after the connect timeout.

    package main

    import (
        "crypto/tls"
        "log"
        "net"
        "time"
    )

    func main() {
        cert, err := tls.LoadX509KeyPair("bad-agent.crt", "bad-agent.key")
        if err != nil {
            panic(err)
        }

        listener, err := tls.Listen("tcp", ":1337", &tls.Config{
            Certificates: []tls.Certificate{cert},
        })
        if err != nil {
            panic(err)
        }

        log.Println("Listening on", listener.Addr())

        for {
            conn, err := listener.Accept()
            if err != nil {
                panic(err)
            }

            go handle(conn.(*tls.Conn))
        }
    }

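    // handle stalls the TLS handshake until shortly before the client's connect
    // timeout is expected to fire, then performs it and resets the connection.
    func handle(conn *tls.Conn) {
        addr := conn.RemoteAddr().String()
        log.Println(addr, "new connection")

        // Wake up just before the client's connect timeout (15s in the log below).
        time.Sleep(15*time.Second - 10*time.Millisecond)

        // With linger set to 0, closing the TCP socket sends an RST, not a FIN.
        log.Println(addr, "SetLinger(0)", conn.NetConn().(*net.TCPConn).SetLinger(0))
        log.Println(addr, "Handshake()", conn.Handshake())
        // Close the underlying socket directly, skipping the TLS close_notify.
        log.Println(addr, "conn.NetConn().Close()", conn.NetConn().Close())
    }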

With additional logging in the `catch` block for `boost::system::system_error`
and `Defer shutdownSslConn` (both removed by this commit), this showed the
following. Note that in particular, `async_shutdown()` never returned,
indicating that it hangs in there.

    [2026-04-24 17:32:56 +0200] information/ApiListener: Reconnecting to endpoint 'bad-agent' via host 'host.docker.internal' and port '1337'
    [2026-04-24 17:33:11 +0200] critical/ApiListener: Timeout while reconnecting to endpoint 'bad-agent' via host 'host.docker.internal' and port '1337', cancelling attempt
    [2026-04-24 17:33:11 +0200] information/ApiListener: New client connection for identity 'bad-agent' to [172.17.0.1]:1337
    [2026-04-24 17:33:12 +0200] information/ApiListener: rethrowing for bad-agent: Error: Connection reset by peer [system:104 at /usr/include/boost/asio/detail/reactive_socket_send_op.hpp:137 in function 'do_complete']
    [2026-04-24 17:33:12 +0200] information/ApiListener: doing async_shutdown for bad-agent
2026-05-06 09:54:21 +02:00
.github Merge pull request #10832 from Icinga/fix-openSUSE15.6-GHA 2026-05-05 08:00:14 +00:00
agent Replace all existing copyright headers with SPDX headers 2026-02-04 14:00:05 +01:00
choco Don't upload 32 bit MSIs to Chocolatey 2026-03-12 16:59:58 +01:00
cmake Replace all existing copyright headers with SPDX headers 2026-02-04 14:00:05 +01:00
doc docs: note OTLP metrics package limitation on Debian 11 and Ubuntu 22.04 and Amazon Linux 2 2026-05-05 10:13:16 +02:00
etc OTLP: Set enable_ha to true by default 2026-04-01 12:18:21 +02:00
icinga-app Add common OTel type/lib 2026-04-01 12:18:21 +02:00
icinga-installer Replace all existing copyright headers with SPDX headers 2026-02-04 14:00:05 +01:00
itl Add a check command for NETGEAR monitoring (#10753) 2026-03-31 15:15:34 +02:00
lib Fix that NewClientHandler() could hang indefinitely, preventing new connection attempts 2026-05-06 09:54:21 +02:00
plugins Replace all existing copyright headers with SPDX headers 2026-02-04 14:00:05 +01:00
test Fix PerfdataWriterConnection test-cases on parallel build 2026-04-22 08:36:28 +02:00
third-party Merge pull request #9719 from Icinga/execvp 2026-04-23 14:04:31 +02:00
tools Bump Boost shipped for Windows to v1.91 2026-04-24 11:11:11 +02:00
.gitattributes Exclude debian/ from git-archive and dist tarballs. 2013-11-11 22:03:31 +01:00
.gitignore SELinux: Let safe-reload run in icinga2_t 2020-02-27 08:45:33 +01:00
.mailmap .mailmap: Merge Alvar email addresses 2025-06-17 11:15:16 +02:00
AUTHORS docs: note OTLP metrics package limitation on Debian 11 and Ubuntu 22.04 and Amazon Linux 2 2026-05-05 10:13:16 +02:00
CHANGELOG.md Update changelog for v2.15.3 2026-04-23 11:10:27 +02:00
CMakeLists.txt Merge pull request #9719 from Icinga/execvp 2026-04-23 14:04:31 +02:00
config.h.cmake Add tests for pthread_set_name_np() and pthread_setname_np() 2025-11-27 10:48:54 +01:00
Containerfile Containerfile: install all required Protobuf libs for OTel 2026-04-01 12:18:21 +02:00
CONTRIBUTING.md GHA: complain if PR adds commits from people not yet listed in ./AUTHORS 2023-11-21 12:40:16 +01:00
icinga-spec-version.h.cmake Set versions for all internal libraries 2016-08-25 17:56:18 +02:00
icinga-version.h.cmake Fix Windows .exe version v2.12.0 -> 2.12.0 2020-09-11 15:56:51 +02:00
ICINGA2_VERSION Release v2.16.0 2026-04-21 17:20:46 +02:00
LICENSE.md Rename COPYING -> LICENSE.md & upgrade to GPLv3 2026-02-04 14:01:09 +01:00
mkdocs.yml Docs: Align local mkdocs config 2019-09-17 12:54:43 +02:00
NEWS icinga.com: Update everything else 2018-10-18 09:50:53 +02:00
publiccode.yml publiccode.yml: update to GPL 3 (or later) 2026-04-20 16:31:43 +02:00
README.md Rename COPYING -> LICENSE.md & upgrade to GPLv3 2026-02-04 14:01:09 +01:00

Icinga 2

Table of Contents

  1. About
  2. Installation
  3. Documentation
  4. Support
  5. License
  6. Contributing

About

Icinga is a monitoring system which checks the availability of your network resources, notifies users of outages, and generates performance data for reporting.

Scalable and extensible, Icinga can monitor large, complex environments across multiple locations.

Icinga 2 is the monitoring server and requires Icinga Web 2 on top in your Icinga Stack. The configuration can be easily managed with either the Icinga Director, config management tools or plain text within the Icinga DSL.

Installation

Once Icinga Server and Web are running in your distributed environment, make sure to check out the many Icinga modules for even better monitoring.

Documentation

The documentation is available on icinga.com/docs.

Support

Check the project website for status updates. Join the community channels for questions or ask an Icinga partner for professional support.

License

Icinga 2 and the Icinga 2 documentation are licensed under the terms of the GNU General Public License Version 3 or later. You will find a copy of this license in the LICENSE.md file included in the source package.

In addition, as a special exception, the copyright holders give permission to link the code of portions of this program with the OpenSSL library under certain conditions as described in each individual source file, and distribute linked combinations including the two.

You must obey the GNU General Public License in all respects for all of the code used other than OpenSSL. If you modify file(s) with this exception, you may extend this exception to your version of the file(s), but you are not obligated to do so. If you do not wish to do so, delete this exception statement from your version. If you delete this exception statement from all source files in the program, then also delete it here.

Note

Historically, Icinga 2 has been licensed under the GNU General Public License Version 2 or later. However, due to newly introduced dependencies licensed under the Apache License 2.0 (which is not compatible with GPLv2), we have decided to upgrade the license of Icinga 2 to GPLv3+, effective from version v2.16.0 onwards. All versions prior to v2.16.0 and all existing source code files remain licensed under GPLv2+ (see the license information in those files).

Also, the OpenSSL linking exception is only relevant for OpenSSL 1.x. OpenSSL >= 3.0 is licensed under the Apache License version 2.0, which is compatible with GPLv3 and no longer requires an exception.

Contributing

There are many ways to contribute to Icinga -- whether it be sending patches, testing, reporting bugs, or reviewing and updating the documentation. Every contribution is appreciated!

Please continue reading in the contributing chapter.

If you are a packager, please read the development chapter for more details.

Security Issues

For reporting security issues please visit this page.