One of my joys at work is getting to work with Erlang. If adoption increases, Erlang has quite a few benefits to offer in terms of distributed computing and reliability, but in the short term Erlang has the inevitable weakness of not being PHP or Java. In addition, Erlang applications may rely on Mnesia instead of MySQL or PostgreSQL, and the end result is that a company's existing infrastructure (ops, monitoring, runbooks, etc.) usually isn't effective at supporting Erlang without some modification.
Taking a stab at one aspect of this, I spent some time over the past few days writing monitoring scripts for Erlang process groups, nodes and applications for use with Nagios. The effort is tentatively named `nagios_erlang`, although I'll admit a certain weakness in its charm.
More thorough usage details are in the nagios_erlang README, but generally it provides:
- the ability to check that the host can ping another node,
- the ability to check that a specific application is running on another node,
- the ability to check that the number of processes in a process group satisfies warning and critical constraints (e.g. more than 5 is ok, fewer than 5 is a warning, fewer than 3 is critical).
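The process group check boils down to mapping a member count onto the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The sketch below shows that mapping in Python rather than Erlang, purely for illustration; it is not taken from nagios_erlang. The function names and the group name `workers` are hypothetical, and treating a count exactly equal to the warning threshold as OK is an assumption.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify_group_size(count, warn_below, crit_below):
    """Map a process-group member count to a Nagios status code.

    Assumption: a count equal to warn_below is treated as OK.
    """
    if crit_below > warn_below:
        # Thresholds that contradict each other cannot be evaluated.
        return UNKNOWN
    if count < crit_below:
        return CRITICAL
    if count < warn_below:
        return WARNING
    return OK


def plugin_line(group, count, warn_below, crit_below):
    """Return (exit_code, status_text) in the usual plugin output shape."""
    code = classify_group_size(count, warn_below, crit_below)
    text = "%s - group %s has %d processes" % (LABELS[code], group, count)
    return code, text
```

With thresholds of 5 and 3, `plugin_line("workers", 4, 5, 3)` yields a WARNING status; a real plugin would print the text and exit with the code.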
At the moment they are performing active checks, but it should be straightforward to extend the script to support passive checks as well. (Add a second wrapper to output in NSCA format in `nagios_erlang.erl`, check for a `--passive` parameter, write output to a temporary file, pipe it into NSCA's `send_nsca`; something along those lines.)
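As a rough illustration of the passive route, here is a minimal Python sketch, assuming the Nagios NSCA addon and its `send_nsca` client, which reads tab-delimited service results (host, service description, return code, plugin output) on stdin. The function names, the host and service names, and the config file path are hypothetical, and none of this exists in nagios_erlang today.

```python
import subprocess


def nsca_line(host, service, code, output):
    """Format one service check result for send_nsca's stdin."""
    # Collapse every run of whitespace (tabs and newlines included) to a
    # single space so the plugin output cannot break the tab-delimited record.
    clean = " ".join(output.split())
    return "%s\t%s\t%d\t%s\n" % (host, service, code, clean)


def submit_passive(line, nsca_host, config="/etc/nagios/send_nsca.cfg"):
    """Pipe a formatted line into send_nsca and return its exit status.

    Assumptions: send_nsca is on PATH and the config path (hypothetical
    default here) matches the local installation.
    """
    result = subprocess.run(
        ["send_nsca", "-H", nsca_host, "-c", config],
        input=line,
        text=True,
    )
    return result.returncode
```

Piping the line directly into the client's stdin avoids the temporary file, although writing to a file first would work just as well.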
Full source code is available on GitHub.