diff --git a/doc/15-troubleshooting.md b/doc/15-troubleshooting.md
index a4a7b8910..f7d46f7b8 100644
--- a/doc/15-troubleshooting.md
+++ b/doc/15-troubleshooting.md
@@ -747,7 +747,7 @@ $ curl -k -s -u root:icinga -H 'Accept: application/json' -X DELETE 'https://loc
}
```
-## REST API Troubleshooting: No Objects Found
+### REST API Troubleshooting: No Objects Found
Please note that the `404` status with no objects being found can also originate
from missing or too strict object permissions for the authenticated user.
@@ -761,6 +761,93 @@ In order to analyse and fix the problem, please check the following:
- use an administrative account with full permissions to check whether the objects are actually there.
- verify the permissions on the affected ApiUser object and fix them.
+### Missing Runtime Objects (Hosts, Downtimes, etc.)
+
+Runtime objects consume the internal config packages shared with
+the REST API config packages. Each host, downtime, comment, service, etc. created
+via the REST API is stored in the `_api` package.
+
+This includes downtimes and comments, which where sometimes stored in the wrong
+directory path, because the active-stage file was empty/truncated/unreadable at
+this point.
+
+Wrong:
+
+```
+/var/lib/icinga2/api/packages/_api//conf.d/downtimes/1234-5678-9012-3456.conf
+```
+
+Correct:
+
+```
+/var/lib/icinga2/api/packages/_api/abcd-ef12-3456-7890/conf.d/downtimes/1234-5678-9012-3456.conf
+```
+
+At creation time, the object lives in memory but its storage is broken. Upon restart,
+it is missing and e.g. a missing downtime will re-enable unwanted notifications.
+
+`abcd-ef12-3456-7890` is the active stage name which wasn't correctly
+read by the Icinga daemon. This information is stored in `/var/lib/icinga2/api/packages/_api/active-stage`.
+
+2.11 now limits the direct active-stage file access (this is hidden from the user),
+and caches active stages for packages in-memory.
+
+Bonus on startup/config validation: Icinga now logs a critical message when a deployed
+config package is broken.
+
+```
+icinga2 daemon -C
+
+[2019-04-26 12:58:14 +0200] critical/ApiListener: Cannot detect active stage for package '_api'. Broken config package, check the troubleshooting documentation.
+```
+
+In order to fix the broken config package, and mark a deployed stage as active
+again, carefully do the following steps with creating a backup before:
+
+Navigate into the API package prefix.
+
+```
+cd /var/lib/icinga2/api/packages
+```
+
+Change into the broken package directory and list all directories and files
+ordered by latest changes.
+
+```
+cd _api
+ls -lahtr
+
+drwx------ 4 michi wheel 128B Mar 27 14:39 ..
+-rw-r--r-- 1 michi wheel 25B Mar 27 14:39 include.conf
+-rw-r--r-- 1 michi wheel 405B Mar 27 14:39 active.conf
+drwx------ 7 michi wheel 224B Mar 27 15:01 abcd-ef12-3456-7890
+drwx------ 5 michi wheel 160B Apr 26 12:47 .
+```
+
+As you can see, the `active-stage` file is missing. When it is there, verify that its content
+is set to the stage directory as follows.
+
+If you have more than one stage directory here, pick the latest modified
+directory. Copy the directory name `abcd-ef12-3456-7890` and
+add it into a new file `active-stage`. This can be done like this:
+
+```
+echo "abcd-ef12-3456-7890" > active-stage
+```
+
+Re-run config validation.
+
+```
+icinga2 daemon -C
+```
+
+The validation should not show an error.
+
+> **Note**
+>
+> The internal `_api` config package structure may change in the future. Do not modify
+> things in there manually or with scripts unless guided here or asked by a developer.
+
## Certificate Troubleshooting
diff --git a/doc/16-upgrading-icinga-2.md b/doc/16-upgrading-icinga-2.md
index b309e6932..f3575fdd4 100644
--- a/doc/16-upgrading-icinga-2.md
+++ b/doc/16-upgrading-icinga-2.md
@@ -109,6 +109,28 @@ The deprecated `concurrent_checks` attribute in the [checker feature](09-object-
has no effect anymore if set. Please use the [MaxConcurrentChecks](17-language-reference.md#icinga-constants-global-config)
constant in [constants.conf](04-configuring-icinga-2.md#constants-conf) instead.
+### REST API
+
+#### Config Packages
+
+Deployed configuration packages require an active stage, with many previous
+allowed. This mechanism is used by the Icinga Director as external consumer,
+and Icinga itself for storing runtime created objects inside the `_api`
+package.
+
+This includes downtimes and comments, which where sometimes stored in the wrong
+directory path, because the active-stage file was empty/truncated/unreadable at
+this point.
+
+2.11 makes this mechanism more stable and detects broken config packages.
+
+```
+[2019-04-26 12:58:14 +0200] critical/ApiListener: Cannot detect active stage for package '_api'. Broken config package, check the troubleshooting documentation.
+```
+
+In order to fix this, please follow [this troubleshooting entry](15-troubleshooting.md#troubleshooting-api-missing-runtime-objects).
+
+
## Upgrading to v2.10
### Path Constant Changes
diff --git a/lib/cli/daemoncommand.cpp b/lib/cli/daemoncommand.cpp
index c5b0376a0..adbb1afa5 100644
--- a/lib/cli/daemoncommand.cpp
+++ b/lib/cli/daemoncommand.cpp
@@ -289,7 +289,13 @@ int DaemonCommand::Run(const po::variables_map& vm, const std::vectorSet("version", object->GetVersion());
if (object->GetPackage() == "_api") {
- String file = ConfigObjectUtility::GetObjectConfigPath(object->GetReflectionType(), object->GetName());
+ String file;
+
+ try {
+ file = ConfigObjectUtility::GetObjectConfigPath(object->GetReflectionType(), object->GetName());
+ } catch (const std::exception& ex) {
+ Log(LogNotice, "ApiListener")
+ << "Cannot sync object '" << object->GetName() << "': " << ex.what();
+ return;
+ }
std::ifstream fp(file.CStr(), std::ifstream::binary);
if (!fp)
diff --git a/lib/remote/apilistener.cpp b/lib/remote/apilistener.cpp
index 39d82ef2f..f4903e6a6 100644
--- a/lib/remote/apilistener.cpp
+++ b/lib/remote/apilistener.cpp
@@ -6,6 +6,7 @@
#include "remote/endpoint.hpp"
#include "remote/jsonrpc.hpp"
#include "remote/apifunction.hpp"
+#include "remote/configpackageutility.hpp"
#include "base/convert.hpp"
#include "base/defer.hpp"
#include "base/io-engine.hpp"
@@ -134,6 +135,9 @@ void ApiListener::OnConfigLoaded()
Log(LogWarning, "ApiListener", "Please read the upgrading documentation for v2.8: https://icinga.com/docs/icinga2/latest/doc/16-upgrading-icinga-2/");
}
+ /* Cache API packages and their active stage name. */
+ UpdateActivePackageStagesCache();
+
/* set up SSL context */
std::shared_ptr cert;
try {
@@ -267,6 +271,11 @@ void ApiListener::Start(bool runtimeCreated)
m_CleanupCertificateRequestsTimer->Start();
m_CleanupCertificateRequestsTimer->Reschedule(0);
+ m_ApiPackageIntegrityTimer = new Timer();
+ m_ApiPackageIntegrityTimer->OnTimerExpired.connect(std::bind(&ApiListener::CheckApiPackageIntegrity, this));
+ m_ApiPackageIntegrityTimer->SetInterval(300);
+ m_ApiPackageIntegrityTimer->Start();
+
OnMasterChanged(true);
}
@@ -1537,6 +1546,83 @@ Endpoint::Ptr ApiListener::GetLocalEndpoint() const
return m_LocalEndpoint;
}
+void ApiListener::UpdateActivePackageStagesCache()
+{
+ boost::mutex::scoped_lock lock(m_ActivePackageStagesLock);
+
+ for (auto package : ConfigPackageUtility::GetPackages()) {
+ String activeStage;
+
+ try {
+ activeStage = ConfigPackageUtility::GetActiveStageFromFile(package);
+ } catch (const std::exception& ex) {
+ Log(LogCritical, "ApiListener")
+ << ex.what();
+ continue;
+ }
+
+ Log(LogNotice, "ApiListener")
+ << "Updating cache: Config package '" << package << "' has active stage '" << activeStage << "'.";
+
+ m_ActivePackageStages[package] = activeStage;
+ }
+}
+
+void ApiListener::CheckApiPackageIntegrity()
+{
+ boost::mutex::scoped_lock lock(m_ActivePackageStagesLock);
+
+ for (auto package : ConfigPackageUtility::GetPackages()) {
+ String activeStage;
+ try {
+ activeStage = ConfigPackageUtility::GetActiveStageFromFile(package);
+ } catch (const std::exception& ex) {
+ /* An error means that the stage is broken, try to repair it. */
+ auto it = m_ActivePackageStages.find(package);
+
+ if (it == m_ActivePackageStages.end())
+ continue;
+
+ String activeStageCached = it->second;
+
+ Log(LogInformation, "ApiListener")
+ << "Repairing broken API config package '" << package
+ << "', setting active stage '" << activeStageCached << "'.";
+
+ ConfigPackageUtility::SetActiveStageToFile(package, activeStageCached);
+ }
+ }
+}
+
+void ApiListener::SetActivePackageStage(const String& package, const String& stage)
+{
+ boost::mutex::scoped_lock lock(m_ActivePackageStagesLock);
+ m_ActivePackageStages[package] = stage;
+}
+
+String ApiListener::GetActivePackageStage(const String& package)
+{
+ boost::mutex::scoped_lock lock(m_ActivePackageStagesLock);
+
+ if (m_ActivePackageStages.find(package) == m_ActivePackageStages.end())
+ BOOST_THROW_EXCEPTION(ScriptError("Package " + package + " has no active stage."));
+
+ return m_ActivePackageStages[package];
+}
+
+void ApiListener::RemoveActivePackageStage(const String& package)
+{
+ /* This is the rare occassion when a package has been deleted. */
+ boost::mutex::scoped_lock lock(m_ActivePackageStagesLock);
+
+ auto it = m_ActivePackageStages.find(package);
+
+ if (it == m_ActivePackageStages.end())
+ return;
+
+ m_ActivePackageStages.erase(it);
+}
+
void ApiListener::ValidateTlsProtocolmin(const Lazy& lvalue, const ValidationUtils& utils)
{
ObjectImpl::ValidateTlsProtocolmin(lvalue, utils);
diff --git a/lib/remote/apilistener.hpp b/lib/remote/apilistener.hpp
index e6390d944..5818ec994 100644
--- a/lib/remote/apilistener.hpp
+++ b/lib/remote/apilistener.hpp
@@ -84,6 +84,11 @@ public:
static Value ConfigUpdateObjectAPIHandler(const MessageOrigin::Ptr& origin, const Dictionary::Ptr& params);
static Value ConfigDeleteObjectAPIHandler(const MessageOrigin::Ptr& origin, const Dictionary::Ptr& params);
+ /* API config packages */
+ void SetActivePackageStage(const String& package, const String& stage);
+ String GetActivePackageStage(const String& package);
+ void RemoveActivePackageStage(const String& package);
+
static Value HelloAPIHandler(const MessageOrigin::Ptr& origin, const Dictionary::Ptr& params);
static void UpdateObjectAuthority();
@@ -119,6 +124,8 @@ private:
Timer::Ptr m_ReconnectTimer;
Timer::Ptr m_AuthorityTimer;
Timer::Ptr m_CleanupCertificateRequestsTimer;
+ Timer::Ptr m_ApiPackageIntegrityTimer;
+
Endpoint::Ptr m_LocalEndpoint;
static ApiListener::Ptr m_Instance;
@@ -126,6 +133,7 @@ private:
void ApiTimerHandler();
void ApiReconnectTimerHandler();
void CleanupCertificateRequestsTimerHandler();
+ void CheckApiPackageIntegrity();
bool AddListener(const String& node, const String& service);
void AddConnection(const Endpoint::Ptr& endpoint);
@@ -175,6 +183,12 @@ private:
void SendRuntimeConfigObjects(const JsonRpcConnection::Ptr& aclient);
void SyncClient(const JsonRpcConnection::Ptr& aclient, const Endpoint::Ptr& endpoint, bool needSync);
+
+ /* API Config Packages */
+ mutable boost::mutex m_ActivePackageStagesLock;
+ std::map m_ActivePackageStages;
+
+ void UpdateActivePackageStagesCache();
};
}
diff --git a/lib/remote/configpackageutility.cpp b/lib/remote/configpackageutility.cpp
index fad266b53..b56bbb9fc 100644
--- a/lib/remote/configpackageutility.cpp
+++ b/lib/remote/configpackageutility.cpp
@@ -1,6 +1,7 @@
/* Icinga 2 | (c) 2012 Icinga GmbH | GPLv2+ */
#include "remote/configpackageutility.hpp"
+#include "remote/apilistener.hpp"
#include "base/application.hpp"
#include "base/exception.hpp"
#include "base/utility.hpp"
@@ -34,6 +35,14 @@ void ConfigPackageUtility::DeletePackage(const String& name)
if (!Utility::PathExists(path))
BOOST_THROW_EXCEPTION(std::invalid_argument("Package does not exist."));
+ ApiListener::Ptr listener = ApiListener::GetInstance();
+
+ /* config packages without API make no sense. */
+ if (!listener)
+ BOOST_THROW_EXCEPTION(std::invalid_argument("No ApiListener instance configured."));
+
+ listener->RemoveActivePackageStage(name);
+
Utility::RemoveDirRecursive(path);
Application::RequestRestart();
}
@@ -157,10 +166,7 @@ void ConfigPackageUtility::WriteStageConfig(const String& packageName, const Str
void ConfigPackageUtility::ActivateStage(const String& packageName, const String& stageName)
{
- String activeStagePath = GetPackageDir() + "/" + packageName + "/active-stage";
- std::ofstream fpActiveStage(activeStagePath.CStr(), std::ofstream::out | std::ostream::binary | std::ostream::trunc);
- fpActiveStage << stageName;
- fpActiveStage.close();
+ SetActiveStage(packageName, stageName);
WritePackageConfig(packageName);
}
@@ -242,8 +248,11 @@ std::vector ConfigPackageUtility::GetStages(const String& packageName)
return stages;
}
-String ConfigPackageUtility::GetActiveStage(const String& packageName)
+String ConfigPackageUtility::GetActiveStageFromFile(const String& packageName)
{
+ /* Lock the transaction, reading this only happens on startup or when something really is broken. */
+ boost::mutex::scoped_lock lock(GetStaticMutex());
+
String path = GetPackageDir() + "/" + packageName + "/active-stage";
std::ifstream fp;
@@ -255,11 +264,62 @@ String ConfigPackageUtility::GetActiveStage(const String& packageName)
fp.close();
if (fp.fail())
- return "";
+ BOOST_THROW_EXCEPTION(std::invalid_argument("Cannot detect active stage for package '" + packageName + "'. Broken config package, check the troubleshooting documentation."));
return stage.Trim();
}
+void ConfigPackageUtility::SetActiveStageToFile(const String& packageName, const String& stageName)
+{
+ boost::mutex::scoped_lock lock(GetStaticMutex());
+
+ String activeStagePath = GetPackageDir() + "/" + packageName + "/active-stage";
+
+ std::ofstream fpActiveStage(activeStagePath.CStr(), std::ofstream::out | std::ostream::binary | std::ostream::trunc); //TODO: fstream exceptions
+ fpActiveStage << stageName;
+ fpActiveStage.close();
+}
+
+String ConfigPackageUtility::GetActiveStage(const String& packageName)
+{
+ ApiListener::Ptr listener = ApiListener::GetInstance();
+
+ /* config packages without API make no sense. */
+ if (!listener)
+ BOOST_THROW_EXCEPTION(std::invalid_argument("No ApiListener instance configured."));
+
+ String activeStage;
+
+ /* First use runtime state. */
+ try {
+ activeStage = listener->GetActivePackageStage(packageName);
+ } catch (const std::exception& ex) {
+ /* Fallback to reading the file, happens on restarts. */
+ activeStage = GetActiveStageFromFile(packageName);
+
+ /* When we've read something, correct memory. */
+ if (!activeStage.IsEmpty())
+ listener->SetActivePackageStage(packageName, activeStage);
+ else
+ BOOST_THROW_EXCEPTION(std::invalid_argument("Cannot detect active stage for package '" + packageName + "'. Broken config package, check the troubleshooting documentation."));
+ }
+
+ return activeStage;
+}
+
+void ConfigPackageUtility::SetActiveStage(const String& packageName, const String& stageName)
+{
+ ApiListener::Ptr listener = ApiListener::GetInstance();
+
+ /* config packages without API make no sense. */
+ if (!listener)
+ BOOST_THROW_EXCEPTION(std::invalid_argument("No ApiListener instance configured."));
+
+ listener->SetActivePackageStage(packageName, stageName);
+
+ /* Also update the marker on disk for restarts. */
+ SetActiveStageToFile(packageName, stageName);
+}
std::vector > ConfigPackageUtility::GetFiles(const String& packageName, const String& stageName)
{
diff --git a/lib/remote/configpackageutility.hpp b/lib/remote/configpackageutility.hpp
index db475c8f2..11b2b9977 100644
--- a/lib/remote/configpackageutility.hpp
+++ b/lib/remote/configpackageutility.hpp
@@ -32,7 +32,10 @@ public:
static String CreateStage(const String& packageName, const Dictionary::Ptr& files = nullptr);
static void DeleteStage(const String& packageName, const String& stageName);
static std::vector GetStages(const String& packageName);
+ static String GetActiveStageFromFile(const String& packageName);
static String GetActiveStage(const String& packageName);
+ static void SetActiveStage(const String& packageName, const String& stageName);
+ static void SetActiveStageToFile(const String& packageName, const String& stageName);
static void ActivateStage(const String& packageName, const String& stageName);
static void AsyncTryActivateStage(const String& packageName, const String& stageName, bool reload);