During testing, I just encountered a race condition where my Galera
cluster was not yet ready, causing the initial schema check to fail.
```
2024-04-11T08:13:40.401Z INFO icingadb Starting Icinga DB daemon (1.1.1)
2024-04-11T08:13:40.401Z INFO icingadb Connecting to database at 'mysql:3306'
2024-04-11T08:13:40.404Z FATAL icingadb Error 1047 (08S01): WSREP has not yet prepared node for application use
can't check database schema version
github.com/icinga/icingadb/pkg/icingadb.(*DB).CheckSchema
/go/src/github.com/Icinga/icingadb/pkg/icingadb/db.go:115
main.run
/go/src/github.com/Icinga/icingadb/cmd/icingadb/main.go:74
main.main
/go/src/github.com/Icinga/icingadb/cmd/icingadb/main.go:37
runtime.main
/usr/local/go/src/runtime/proc.go:271
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1695
exit status 1
```
This change now also retries the initial cluster check.
References #698.
With the approaching release, the Go version in go.mod can be
bumped to a current Go version.
The semantics of the `go` directive in go.mod changed slightly with
Go 1.21, mainly by enforcing compilation with a compatible
version[0]. As other modules' go.mod files are now also being included,
there are additional entries in go.sum.
[0]: https://go.dev/doc/modules/gomod-ref#go
`WithBackoff()` will now make one final retry if the timeout expires
during the sleep phase between attempts, which can be a long period
depending on the attempts made and the maximum sleep time.
Starting `attempt` at `1` makes it easier to read as a number in log
messages and `if` conditions. Also, before, with `attempt` starting at
`0`, the second attempt would have been made immediately, as our backoff
implementation returns `0` in this case.
Co-Authored-By: Alvar Penning <alvar.penning@icinga.com>
Logging `attempt` is meaningless, as it is not logged on every attempt
but only when the retryable error changes, and it has no context, as
there is no maximum number of attempts.
The retryable function may exit prematurely due to context errors that
shouldn't be retried. Before, we checked the returned error for context
errors, i.e. used `errors.Is()` to compare it to `Canceled` and
`DeadlineExceeded` which also yields `true` for errors that implement
`Is()` accordingly. For example, this applies to some non-exported Go
`net` errors. Now we explicitly check the context error instead.
All of our error callbacks are used to log the error and indicate that
we are retrying. Previously, in the case of context errors or
non-retryable errors, we would have called these too, which would have
resulted in misleading log messages.