After trying several network monitoring apps, I am currently using LibreNMS. I just installed Nagios Core (and I am open to upgrading to CSP for unlimited monitoring). Before committing to CSP, I want to get feedback to make sure it will do what I need with the least manual configuration possible. I have a small department of three techs, including me. We have 14 locations interconnected via VPN using SonicWalls, Netgear and Grandstream switches, and about 500 Windows workstations and servers. While trying Core, I noticed that all the configuration, including adding devices, is done manually. If I upgrade to CSP, will the configuration be the same, all done manually via the command line?
Hey RudyM88, with CSP you will have a GUI and wizards you can run through to add these hosts and services, so you won't have to touch the CLI.
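For anyone who stays on Core in the meantime, the manual step of writing host definitions can be scripted. The following is only a rough sketch, not an official Nagios tool: the CSV columns (`name`, `alias`, `address`), the sample hosts, and the use of the `generic-host` template from the sample configuration are all assumptions for illustration. It just turns an inventory list into `define host` blocks that you would review and save into a .cfg file yourself.

```python
import csv
import io

# Standard Nagios Core object-definition syntax for a host.
# The template name passed to "use" is an assumption; use whatever
# host template your own configuration actually defines.
HOST_BLOCK = """define host {{
    use        {template}
    host_name  {name}
    alias      {alias}
    address    {address}
}}
"""


def render_hosts(csv_text, template="generic-host"):
    """Return Nagios host definitions for every row of a CSV inventory.

    The CSV is expected to have the (hypothetical) columns
    name, alias, address.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    blocks = []
    for row in reader:
        blocks.append(
            HOST_BLOCK.format(
                template=template,
                name=row["name"].strip(),
                alias=row["alias"].strip(),
                address=row["address"].strip(),
            )
        )
    return "\n".join(blocks)


# Made-up inventory using documentation-range IP addresses.
inventory = """name,alias,address
ws-001,Front desk PC,192.0.2.10
sw-hq-01,HQ core switch,192.0.2.2
"""

if __name__ == "__main__":
    print(render_hosts(inventory))
```

The output still needs service definitions and a configuration check before Nagios is restarted, so this only removes the typing, not the review.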
Hi everyone,
I’ve been having serious issues with our Nagios Core setup and would really appreciate any help.
About a month ago, there was a power outage at the location hosting our core devices. After the power was restored, Nagios started behaving abnormally:
The duration column began showing "???".
Host statuses were incorrect: some hosts would respond to a ping test, but Nagios would still show them as DOWN (and vice versa).
Availability reports were also incorrect.
Oddly, service checks seemed mostly fine.
We tried forcing rescheduled checks for the affected hosts. In some cases that fixed the issue, but many durations and statuses remained wrong. So we started deleting the configuration files for each problematic link and re-importing them via our NagiosQL admin interface. That seemed to help: the duration and host status were correct for the ones we updated.
However, we hadn’t finished this process for all hosts before another power outage occurred last week. After that second outage:
The same issues returned: wrong durations, incorrect host statuses, and very strange values (e.g., 4020d...) in duration fields.
Eventually, the Nagios web UI failed to load, so we had to reboot the core device.
After rebooting, the UI came back up, but the problems persisted.
Some durations now show numbers, but many are still "???" and still incorrect. Forcing rescheduled checks doesn't update anything anymore.
Right now, Nagios has become almost unusable for us.
I was able to access the Linux root console on the server, but I don’t have advanced Linux or MySQL knowledge, so I’m not sure how to proceed. I suspect something may be wrong with the runtime/status files or the retained state.
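One way to check the suspicion about the status files without changing anything is to read the status file and look at the timestamps Nagios has stored per host. The sketch below is only an illustration under a few assumptions: the file uses the usual `hoststatus { key=value ... }` block layout, the path is whatever `status_file` points to in your nagios.cfg, and the helper names and sample data are made up. It flags hosts whose `last_state_change` is missing, zero, or in the future, which would be consistent with broken duration values but does not prove the cause.

```python
import time


def parse_hoststatus(text):
    """Collect the key=value fields of every hoststatus block."""
    hosts, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if current is None:
            # A block starts with a line like "hoststatus {".
            if line.startswith("hoststatus") and line.endswith("{"):
                current = {}
        elif line == "}":
            hosts.append(current)
            current = None
        elif "=" in line:
            key, _, value = line.partition("=")
            current[key] = value
    return hosts


def suspicious_hosts(text, now=None):
    """Return (host_name, reason) pairs for odd last_state_change values."""
    now = time.time() if now is None else now
    flagged = []
    for fields in parse_hoststatus(text):
        name = fields.get("host_name", "?")
        try:
            ts = int(fields.get("last_state_change", ""))
        except ValueError:
            flagged.append((name, "missing or non-numeric"))
            continue
        if ts == 0:
            flagged.append((name, "zero"))
        elif ts > now:
            flagged.append((name, "in the future"))
    return flagged


# Made-up sample in the assumed status file layout.
SAMPLE = """
hoststatus {
\thost_name=router-a
\tlast_state_change=0
\t}
hoststatus {
\thost_name=sw-b
\tlast_state_change=1690000000
\t}
hoststatus {
\thost_name=fw-c
\tlast_state_change=2000000000
\t}
"""

if __name__ == "__main__":
    for host, reason in suspicious_hosts(SAMPLE, now=1_700_000_000):
        print(f"{host}: last_state_change is {reason}")
```

To run it against a real server, replace `SAMPLE` with the contents of your own status file opened read-only. If many hosts are flagged, it is also worth confirming with `date` that the server clock is correct after the outages before touching any Nagios files.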
Hello @osegie.ly.
If this is related to Nagios Core, please post in this forum: https://support.nagios.com/forum/viewforum.php?f=7
By default, Nagios Core does not use a database.
If a database has been added and there was a power outage, then most likely you have data corruption, which is causing the abnormal behavior you are seeing.
To get further assistance, please post in the appropriate forum as mentioned above.
Thank you!
You are very welcome! Thank you for reaching out to us. Let us know if there is anything else we can help with.