Painless monitoring with Scout

I write about monitoring from time to time. We host about 15 applications for our clients, so monitoring our servers is a high priority for us. We’ve experimented with a large number of tools, including monit and god. Both work great as process monitors, but they aren’t built for some of our most important tasks. Enter Scout. In less than a week I’ve been able to port all of our ad-hoc monitors to Scout.

What makes Scout great is its configuration and plugin architecture. When a host runs its monitors, it talks to Scout to download an execution plan. This plan includes a list of plugins to execute. If a plugin has been added, the client will automatically download the required code. This makes it trivial to change monitors via the web and have your clients update automatically.
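
To make that flow concrete, here is a rough sketch of the idea in plain Ruby. It is not Scout's actual client code: the plan format, the PluginCache class, and the fetcher lambda are all hypothetical stand-ins, and the "download" is just a string built in memory.

```ruby
# Hypothetical sketch of an "execution plan" client loop; not Scout's real code.
# The plan is assumed to be an array of hashes naming a plugin and its options.
class PluginCache
  attr_reader :downloads

  # fetcher: any callable that returns plugin source for a name
  # (a stand-in for an HTTP download).
  def initialize(fetcher)
    @fetcher = fetcher
    @sources = {}
    @downloads = []
  end

  # Return cached source, downloading only the first time a plugin is seen.
  def source_for(name)
    @sources.fetch(name) do
      @downloads << name
      @sources[name] = @fetcher.call(name)
    end
  end
end

# Walk every entry in the plan and return the plugin names in plan order.
def run_plan(plan, cache)
  plan.map do |entry|
    cache.source_for(entry[:plugin]) # fetches code only if the plugin is new
    entry[:plugin]
  end
end

fetcher = ->(name) { "# source for #{name}" }
cache = PluginCache.new(fetcher)

first_plan  = [{ plugin: "load_average", options: {} }]
second_plan = first_plan + [{ plugin: "starling_monitor", options: { max_depth: 100 } }]

run_plan(first_plan, cache)
run_plan(second_plan, cache) # only starling_monitor is fetched this time
```

The point of the sketch is that the host never needs to be touched: adding a plugin to the plan on the server side is enough for the client to pick it up on its next run.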

A Scout plugin is just a simple Ruby class that either returns data, sends an alert, or raises an error. It provides a very simple architecture for adding new plugins. This means that, unlike with monit, you can create monitors that are tailored to your environment.

Let’s look at an example. We are heavy users of Starling. We want to be able to monitor certain queues to make sure that our processors are running. Doing this under monit was a hack. I wrote a Ruby program to read the queues and touch a watchdog file when the queue depth is below a threshold. I configured monit to raise an error when the file’s timestamp was older than a set interval. This worked in practice, but it was complicated to set up and required modifying multiple pieces any time a change was required.
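
For reference, the old watchdog approach looked roughly like the sketch below. This is a reconstruction, not the original script: the touch_if_healthy name, the file path, and the hash of queue sizes are assumptions made for illustration (the real script read the sizes from Starling).

```ruby
require "fileutils"
require "tmpdir"

# Touch the watchdog file only when every queue is below the threshold.
# queue_sizes is a plain Hash standing in for what Starling would report.
# Returns true when the file was touched.
def touch_if_healthy(queue_sizes, max_depth, watchdog_path)
  healthy = queue_sizes.values.all? { |depth| depth.to_i < max_depth }
  FileUtils.touch(watchdog_path) if healthy
  healthy
end

watchdog = File.join(Dir.mktmpdir, "starling.watchdog")

touch_if_healthy({ "emails" => 3, "thumbnails" => 12 }, 100, watchdog)  # touches the file
touch_if_healthy({ "emails" => 3, "thumbnails" => 500 }, 100, watchdog) # leaves it alone
```

monit then only has to check the file's timestamp, so the check ends up split between the script and the monit configuration.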

I was able to replace my cobbled-together monitor in just a few minutes with Scout. Here is the full code for my new plugin:

class MissingLibrary < StandardError; end
class StarlingMonitor < Scout::Plugin

  attr_accessor :connection

  def setup_starling
    begin
      require 'starling'
    rescue LoadError
      begin
        require "rubygems"
        require 'starling'
      rescue LoadError
        raise MissingLibrary
      end
    end
    self.connection = Starling.new("#{option(:host)}:#{option(:port)}")
  end

  def build_report
    begin
      setup_starling
      connection.sizeof(:all).each do |queue_name,item_count|
        check_queue(queue_name,item_count) if should_check_queue?(queue_name)
      end
      report(@report)
    rescue MissingLibrary
      error("Could not load all required libraries",
            "I failed to load the starling library. Please make sure it is installed.")
    rescue Exception => e
      error("Got unexpected error: #{e} #{e.class}")
    end
  end

  def should_check_queue?(name)
    option(:queue_re).nil? || /#{option(:queue_re)}/ =~ name
  end
  
  def check_queue(name,depth)
    q_depth = (depth||0).to_i
    @report ||= {}
    @report[name] = q_depth
    if q_depth > option(:max_depth).to_i
      alert("Max Queue Depth for #{name} exceeded","#{q_depth} items is more than the max allowed #{option(:max_depth)}")
    end
  end

end

There’s a bit of code, but it is pretty simple. Scout starts our plugin by calling the build_report method. In build_report, I do a little setup to make sure I can load all of the required libraries. After requiring Starling, I make a connection. The plugin uses the option method to pull configuration information from the environment. Scout provides a simple configuration interface that allows plugins to be configured via the web.
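
The nested begin/rescue in setup_starling is a common pattern for older Ruby setups where RubyGems is not loaded by default. A minimal sketch of the same idea as a reusable helper follows; the require_with_rubygems name is mine and is not part of Scout.

```ruby
class MissingLibrary < StandardError; end

# Try a plain require first, then retry once after loading RubyGems.
# Raises MissingLibrary if the library cannot be found either way.
def require_with_rubygems(name)
  require name
rescue LoadError
  begin
    require "rubygems"
    require name
  rescue LoadError
    raise MissingLibrary, "could not load #{name}"
  end
end

require_with_rubygems("set") # a standard library, so this succeeds

begin
  require_with_rubygems("no_such_library_for_this_example")
rescue MissingLibrary => e
  puts e.message
end
```

Raising a dedicated error class is what lets build_report tell "the library is not installed" apart from every other failure and report a more helpful message.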

After loading up the environment, I walk the list of matching queues and send any necessary alerts. After looking at all queues, I return a report that maps the name of each queue to its depth. Along with providing an alert infrastructure, Scout also stores your reported data and can plot it on a graph.
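
If you want to try logic like this without a Scout account, one option is to run it against a small stand-in for the base class. The sketch below is only that: FakePlugin is a hypothetical stub that mimics the option, report, and alert calls used above by recording them, and the queue sizes are hard-coded instead of coming from Starling.

```ruby
# Hypothetical stand-in for Scout::Plugin; it only records what the plugin does.
class FakePlugin
  attr_reader :reports, :alerts

  def initialize(options)
    @options = options
    @reports = []
    @alerts = []
  end

  def option(name)
    @options[name]
  end

  def report(data)
    @reports << data
  end

  def alert(subject, body)
    @alerts << [subject, body]
  end
end

# Same filter-and-threshold logic as the Starling plugin, fed from a Hash.
class QueueDepthCheck < FakePlugin
  def build_report(queue_sizes)
    depths = {}
    queue_sizes.each do |name, count|
      # Skip queues that do not match the optional queue_re pattern.
      next unless option(:queue_re).nil? || /#{option(:queue_re)}/ =~ name

      depths[name] = (count || 0).to_i
      if depths[name] > option(:max_depth).to_i
        alert("Max Queue Depth for #{name} exceeded",
              "#{depths[name]} items is more than the max allowed #{option(:max_depth)}")
      end
    end
    report(depths)
  end
end

check = QueueDepthCheck.new({ queue_re: "^mail", max_depth: 10 })
check.build_report({ "mail_out" => 25, "mail_in" => 2, "thumbnails" => 900 })
```

Here only mail_out trips an alert. The thumbnails queue is skipped by the queue_re filter even though it is far over the limit, and it does not appear in the report either.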

This simple interface to reporting and alerting is incredibly powerful. What makes it even more powerful is the ease with which you can install plugins. If you want to install my Starling plugin, you can simply enter the URL to the Ruby file from GitHub (http://github.com/mmangino/er-scout-plugins/tree/master%2Fstarling_monitor%2Fstarling_monitor.rb?raw=true).

As I said, I’ve moved all of my ad-hoc monitoring to Scout. I had to create a few plugins to make that work. They’re available on GitHub at http://github.com/mmangino/er-scout-plugins/tree/master. Feel free to fork them and submit enhancements!

So far, using Scout has been a real joy. That’s something I never thought I would say about a monitoring tool.