<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Vladimir Vuksan&#039;s blog &#187; Monitoring</title>
	<atom:link href="http://blog.vuksan.com/category/monitoring/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.vuksan.com</link>
	<description>Documenting the systems and network infrastructure madness</description>
	<lastBuildDate>Tue, 03 Jan 2012 03:50:38 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Use fantomTest to test web pages from multiple locations</title>
		<link>http://blog.vuksan.com/2011/09/27/fantomtest-multiple-locations/</link>
		<comments>http://blog.vuksan.com/2011/09/27/fantomtest-multiple-locations/#comments</comments>
		<pubDate>Tue, 27 Sep 2011 23:39:49 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Monitoring]]></category>

		<guid isPermaLink="false">http://blog.vuksan.com/?p=472</guid>
		<description><![CDATA[In my previous post I introduced Testing your web pages with fantomtest. I have recently added the ability to test the same page from multiple sites within the same interface. You simply install a copy of fantomTest on a remote site, then configure your primary site to access it. For example this is a test of Google [...]]]></description>
			<content:encoded><![CDATA[<p>In my previous post I introduced <a href="http://blog.vuksan.com/2011/08/02/testing-your-web-pages-with-fantomtest/">Testing your web pages with fantomtest</a>. I have recently added the ability to test the same page from multiple sites within the same interface. You simply install a copy of fantomTest on a remote site, then configure your primary site to access it. For example, this is a test of Google from my laptop.</p>
<p><a href="http://blog.vuksan.com/wp-content/uploads/2011/09/fantomtest-goog.png"><img class="alignnone size-full wp-image-473" title="FantomTest Google " src="http://blog.vuksan.com/wp-content/uploads/2011/09/fantomtest-goog.png" alt="" width="959" height="500" /></a></p>
<p>Looks like my network connection is really slow <img src='http://blog.vuksan.com/wp-includes/images/smilies/icon_sad.gif' alt=':-(' class='wp-smiley' /> . Changing the testing site to Croatia, where I have a server, I get</p>
<p><a href="http://blog.vuksan.com/wp-content/uploads/2011/09/fantomtest-hr.png"><img class="alignnone size-full wp-image-474" title="FantomTest Google Croatia" src="http://blog.vuksan.com/wp-content/uploads/2011/09/fantomtest-hr.png" alt="" width="980" height="477" /></a></p>
<p>The result is slightly different since Google redirects me to its localized site; however, it leads me to believe that it is my connection that is slow, not Google.</p>
<p>Any number of "remotes" can be added. Want it? Get it on GitHub:</p>
<p><a href="https://github.com/vvuksan/fantomtest">https://github.com/vvuksan/fantomtest</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.vuksan.com/2011/09/27/fantomtest-multiple-locations/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Testing your web pages with fantomtest</title>
		<link>http://blog.vuksan.com/2011/08/02/testing-your-web-pages-with-fantomtest/</link>
		<comments>http://blog.vuksan.com/2011/08/02/testing-your-web-pages-with-fantomtest/#comments</comments>
		<pubDate>Tue, 02 Aug 2011 13:28:32 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Monitoring]]></category>

		<guid isPermaLink="false">http://blog.vuksan.com/?p=452</guid>
		<description><![CDATA[Coming from a web operations background, my web site/page monitoring had largely focused on metrics such as average request duration, 90th percentile request duration etc. These are all great metrics; however, through Velocity Conferences I have come to appreciate that there is a lot more to web performance than simply knowing how long it [...]]]></description>
			<content:encoded><![CDATA[<p>Coming from a web operations background, my web site/page monitoring had largely focused on metrics such as average request duration, 90th percentile request duration etc. These are all great metrics; however, through <a href="http://velocityconf.com/">Velocity Conferences</a> I have come to appreciate that there is a lot more to web performance than simply knowing how long it takes to load the HTML of a web page. As a result I have been looking for ways to get better metrics by using real browsers instead of Perl/Ruby/Python scripts. For some time I have been playing with Selenium RC to give me an easy way to test and time my web application. Unfortunately I found it heavy and slow. At the last Velocity conference I was fortunate enough to see a demo of <a href="http://phantomjs.org/">PhantomJS</a>. PhantomJS is a semi-headless WebKit browser with JavaScript support. What I really appreciated about it is that it is lightweight, fast and very easy to instrument using JavaScript. In addition it includes a number of useful examples such as netsniff.js, which outputs an HTTP Archive (HAR) of the requests made to load a given web page. From a HAR file you can build, among other things, waterfall charts. There are a number of services you can use to have your site tested for free e.g. <a href="http://webpagetest.org/">webpagetest.org</a>. The limitation is that they can't test your intranet infrastructure, since that is usually behind a firewall, nor can they test remote sites that are connected to your intranet via a VPN.</p>
<p>That is why I'm introducing fantomTest, a simple web application that allows you to generate waterfall graphs using PhantomJS. It will also take a screenshot of the rendered page. Here is what that looks like</p>
<p><a href="http://blog.vuksan.com/wp-content/uploads/2011/08/screenshot2.png"><img class="alignnone size-large wp-image-458" title="Google Waterfall Chart" src="http://blog.vuksan.com/wp-content/uploads/2011/08/screenshot2-1024x322.png" alt="" width="1024" height="322" /></a></p>
<p>What's interesting in this particular case is that Google is not following web performance recommendations, since it uses an HTTP redirect from google.com to www.google.com.</p>
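<p>As a rough illustration of what a HAR file contains, the following Python sketch lists the slowest requests in a HAR document. It assumes the usual HAR layout (<code>log.entries</code>, each entry carrying <code>request.url</code> and a <code>time</code> in milliseconds). The function name and the sample data are made up for this example and are not part of fantomTest or PhantomJS.</p>
<pre>import json

def slowest_requests(har, limit=5):
    """Return (url, milliseconds) pairs for the slowest entries in a HAR dict."""
    entries = har["log"]["entries"]
    timed = [(e["request"]["url"], e["time"]) for e in entries]
    # Sort by elapsed time, slowest first.
    timed.sort(key=lambda pair: pair[1], reverse=True)
    return timed[:limit]

# Made-up sample that follows the HAR layout described above.
sample = json.loads("""
{"log": {"entries": [
  {"request": {"url": "http://google.com/"}, "time": 120},
  {"request": {"url": "http://www.google.com/"}, "time": 310},
  {"request": {"url": "http://www.google.com/logo.png"}, "time": 95}
]}}
""")

for url, ms in slowest_requests(sample, limit=2):
    print(url, ms)</pre>
<p>A waterfall chart is essentially the same data drawn as bars, with each entry's start time used as the offset.</p>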
<p>Anyways to get fantomTest go to</p>
<p><a href="https://github.com/vvuksan/fantomtest">https://github.com/vvuksan/fantomtest</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.vuksan.com/2011/08/02/testing-your-web-pages-with-fantomtest/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Use your trending data for alerting</title>
		<link>http://blog.vuksan.com/2011/04/19/use-your-trending-data-for-alerting/</link>
		<comments>http://blog.vuksan.com/2011/04/19/use-your-trending-data-for-alerting/#comments</comments>
		<pubDate>Tue, 19 Apr 2011 19:59:49 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Monitoring]]></category>
		<category><![CDATA[Nagios]]></category>

		<guid isPermaLink="false">http://blog.vuksan.com/?p=440</guid>
		<description><![CDATA[This post will deal with helping you use the data you already have to do alerting. It is most helpful for people running Nagios or its variants such as Icinga, Netreo etc. It could likely be used with other decoupled alerting systems (not Zabbix or Zenoss though since they do their own trending). Recently I [...]]]></description>
			<content:encoded><![CDATA[<p>This post will deal with helping you use the data you already have to do alerting. It is most helpful for people running Nagios or its variants such as Icinga, Netreo etc. It could likely be used with other decoupled alerting systems (not Zabbix or Zenoss though since they do their own trending).</p>
<p>Recently I came to the realization that lots of sysadmins are unaware that they could easily use the trending data they already capture with systems such as Ganglia, Graphite, Collectd, Munin etc. to do alerting. The standard way of doing health checks of remote nodes in Nagios is to install the <a href="http://exchange.nagios.org/directory/Addons/Monitoring-Agents/NRPE-%252D-Nagios-Remote-Plugin-Executor/details">Nagios Remote Plugin Executor aka. NRPE</a>, which allows you to execute Nagios plugins on remote nodes and pipe the output to the Nagios server. NRPE does the job; however, it has three major disadvantages</p>
<ol>
<li>It is another daemon that needs to run on the remote host possibly introducing security concerns</li>
<li>Depending on the load of the machine it can be slow, thus bogging down the Nagios server</li>
<li>Last and most important, it is commonly used to alert on basic metrics such as disk, load, CPU and swap, which you should be trending anyway.</li>
</ol>
<p>Instead what you ought to be doing is use trending data for alerting. I can think of at least 4 reasons to do so</p>
<ol>
<li>You may already be collecting pertinent data i.e. system load, swap, CPU utilization</li>
<li>If you are alerting on a particular metric you should likely be trending it</li>
<li>It's fast</li>
<li>It allows you to do more sophisticated checks easily e.g. alert me if more than 5 hosts have a load greater than 5</li>
</ol>
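<p>The "more than 5 hosts with a load greater than 5" check can be sketched in a few lines of Python once the metrics are available as a dictionary. This is only an illustration: the data structure, the function names and the sample hosts are assumptions, and the exit codes follow the usual Nagios convention of 0 for OK and 2 for CRITICAL.</p>
<pre>def hosts_over_threshold(metrics, metric, limit):
    """Return the sorted names of hosts whose metric value exceeds limit."""
    return sorted(
        host for host, values in metrics.items()
        if values.get(metric, 0.0) > limit
    )

def check_load(metrics, max_hosts=5, max_load=5.0):
    """Return a Nagios-style (exit_code, message) pair: 0 is OK, 2 is CRITICAL."""
    hot = hosts_over_threshold(metrics, "load_one", max_load)
    if len(hot) > max_hosts:
        return 2, "CRITICAL: %d hosts with load above %s" % (len(hot), max_load)
    return 0, "OK: %d hosts with load above %s" % (len(hot), max_load)

# Hypothetical data, as it might look after parsing the trending system's output.
sample = {"web%d" % i: {"load_one": 6.5} for i in range(6)}
sample["db1"] = {"load_one": 0.4}
print(check_load(sample))</pre>
<p>Because the check runs against data that has already been collected, it does not need to contact any of the hosts.</p>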
<p>Years ago I used Ganglia Web PHP code to write my own generic <a href="http://vuksan.com/linux/nagios_scripts.html#check_ganglia_metrics">Nagios Ganglia plugin</a>. This has served me well. Most recently Michael Conigliaro rewrote the script in Python, making it more versatile and more powerful. You can download it from here</p>
<p><a href="https://github.com/ganglia/ganglia_contrib/tree/master/nagios">https://github.com/ganglia/ganglia_contrib/tree/master/nagios</a></p>
<p>In a nutshell, it downloads the whole metrics tree, i.e. the list of all hosts with their associated metrics, caches it for a configurable amount of time, then uses <a href="http://packages.python.org/NagAconda/plugin.html">NagAconda</a> to support all the threshold reporting as defined in the <a href="http://nagiosplug.sourceforge.net/developer-guidelines.html#THRESHOLDFORMAT">Nagios developer guidelines</a>.</p>
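<p>For readers unfamiliar with the threshold notation from the Nagios developer guidelines, here is a minimal standalone Python sketch of how a range such as <code>10</code>, <code>10:</code>, <code>~:10</code> or <code>@10:20</code> can be evaluated. It is not NagAconda's API and not code from the plugin. The function names are made up for illustration.</p>
<pre>def parse_range(spec):
    """Parse a Nagios threshold such as '10', '10:', '~:10', '10:20' or '@10:20'.

    Returns (start, end, inside) where inside=True means alert when the
    value is within the range rather than outside it.
    """
    inside = spec.startswith("@")
    if inside:
        spec = spec[1:]
    if ":" in spec:
        low, high = spec.split(":", 1)
    else:
        low, high = "0", spec          # a bare number means 0..number
    start = float("-inf") if low == "~" else float(low or 0)
    end = float("inf") if high == "" else float(high)
    return start, end, inside

def should_alert(value, spec):
    """Return True when value violates the threshold described by spec."""
    start, end, inside = parse_range(spec)
    in_range = value >= start and end >= value
    return in_range if inside else not in_range

print(should_alert(11, "10"), should_alert(15, "@10:20"))</pre>
<p>A plugin would typically evaluate the critical range first and the warning range second, then map the result to the matching exit code.</p>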
<p>Another alternative, if you have a very large site, is Ganglios, which was open sourced by the guys at Linden Lab. Their problem is/was that they have thousands of hosts and downloading the whole metrics tree takes ~15 seconds, so they have separated the logic that downloads the metric tree from the logic that does alerting. You can download Ganglios from</p>
<p><a href="https://bitbucket.org/maplebed/ganglios">https://bitbucket.org/maplebed/ganglios</a></p>
<p>This can easily be adapted to work with your trending system of choice.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.vuksan.com/2011/04/19/use-your-trending-data-for-alerting/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>JSON representation for graphs in Ganglia</title>
		<link>http://blog.vuksan.com/2011/02/20/json-representation-for-graphs-in-ganglia/</link>
		<comments>http://blog.vuksan.com/2011/02/20/json-representation-for-graphs-in-ganglia/#comments</comments>
		<pubDate>Mon, 21 Feb 2011 02:38:46 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Monitoring]]></category>

		<guid isPermaLink="false">http://blog.vuksan.com/?p=431</guid>
		<description><![CDATA[Recently thanks to work done by Alex Dean aka. @mostlyalex Ganglia UI supports defining custom graphs using JSON. Prior to this only way to create custom graphs was by writing custom PHP code. This has two major problems ie. lots of people are not comfortable writing or modifying PHP code and second you have to [...]]]></description>
			<content:encoded><![CDATA[<p>Recently, thanks to work done by Alex Dean aka. <a href="https://twitter.com/mostlyalex">@mostlyalex</a>, the Ganglia UI supports defining custom graphs using JSON. Prior to this the only way to create custom graphs was by writing custom PHP code. This has two major problems: first, lots of people are not comfortable writing or modifying PHP code, and second, you have to target a particular graphing engine e.g. rrdtool. As I have written in the past, we are going to be supporting both rrdtool and Graphite for graphing, so having a common way to describe graphs has been one of our goals.</p>
<p>To describe a custom graph you would create a JSON file similar to this one</p>
<pre>{
 "report_name" : "network_report",
 "report_type" : "standard",
 "title" : "Network Report",
 "vertical_label" : "Bytes/sec",
 "series" : [
 { "metric": "bytes_in", "color": "33cc33", "label": "In", "line_width": "2", "type": "line" },
 { "metric": "bytes_out", "color": "5555cc", "label": "Out", "line_width": "2", "type": "line" }
 ]
}</pre>
<p>This will create a line graph with the bytes_in and bytes_out metrics. Since hostname and cluster are not specified, it is assumed that we want metrics for the host we are currently viewing. You could however specify a particular host and metric you want to graph by adding hostname and clustername attributes to a series entry, i.e.</p>
<pre>{
 "report_name" : "our_load_report",
 "report_type" : "standard",
 "title" : "Load Report vs. Database Load",
 "vertical_label" : "Loads",
 "series" : [
 { "metric": "load_one", "color": "3333bb", "label": "Load 1", "line_width": "2", "type": "line" },
 { "hostname": "db1.domain.com", "clustername": "Databases", "metric": "load_one", "color": "44ddbb", "label": "DB1 Load 1", "line_width": "2", "type": "line" }
 ]
}</pre>
<p>To use the reports all you have to do is put them in the $GANGLIA_WEB_ROOT/graph.d directory. Name them something_report.json and they will be available for any host in the cluster. There is one important thing to note: by default the graphing function will first look for PHP definitions of graphs, as those in theory provide more power and flexibility, and only if those are not available will it use the JSON definition.</p>
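<p>Since these files are edited by hand, it can help to check a report before dropping it into graph.d. The Python sketch below catches syntax slips such as a trailing comma, which JSON does not allow, and a few missing fields. The list of required keys is an assumption based on the examples in this post, not a rule taken from the Ganglia code, and the function name is made up.</p>
<pre>import json

# Assumed from the examples in this post, not taken from the Ganglia source.
REQUIRED_KEYS = ("report_name", "report_type", "title", "series")

def check_report(text):
    """Parse a JSON report definition and return a list of problems found."""
    try:
        report = json.loads(text)
    except ValueError as err:          # raised on syntax errors such as a trailing comma
        return ["invalid JSON: %s" % err]
    problems = ["missing key: %s" % key for key in REQUIRED_KEYS if key not in report]
    for index, series in enumerate(report.get("series", [])):
        if "metric" not in series:
            problems.append("series %d has no metric" % index)
    return problems

good = '{"report_name": "r", "report_type": "standard", "title": "T", "series": [{"metric": "load_one"}]}'
print(check_report(good))</pre>
<p>An empty list means no problems were found by these checks. It does not guarantee that the graph will render.</p>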
<h3>Types of graphs</h3>
<p>Currently both line and stacked graphs are supported. Look in graph.d/ directory for additional examples.</p>
<h3>Future</h3>
<p>I am particularly excited about this feature as it allows us to define <a href="http://blog.vuksan.com/2010/06/05/beauty-of-aggregate-line-graphs/">aggregate graphs</a> easily. There is even an alpha implementation of functionality which would allow you to specify a metric and a regex host entry and you would end up with an aggregate graph <img src='http://blog.vuksan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> .</p>
<h3>Download location</h3>
<p>Latest version of the UI can be downloaded either from <a href="http://ganglia.svn.sourceforge.net/svnroot/ganglia/branches/monitor-web-2.0/">Ganglia Monitor Web 2.0 SVN branch</a> or you can get it on <a href="https://github.com/vvuksan/ganglia-misc">Github</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.vuksan.com/2011/02/20/json-representation-for-graphs-in-ganglia/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Misconceptions about RRD storage</title>
		<link>http://blog.vuksan.com/2010/12/14/misconceptions-about-rrd-storage/</link>
		<comments>http://blog.vuksan.com/2010/12/14/misconceptions-about-rrd-storage/#comments</comments>
		<pubDate>Tue, 14 Dec 2010 22:04:57 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Monitoring]]></category>
		<category><![CDATA[Systems Management]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ganglia rrd monitoring]]></category>

		<guid isPermaLink="false">http://blog.vuksan.com/?p=424</guid>
		<description><![CDATA[I want to address the misconceptions about RRD (Round-Robin Database) that seem to crop up often even among seasoned sysadmins. Complaints can be summarized with these two points RRD doesn't offer high resolution i.e. after about an hour it's all averages and I want to know what the metric value was last year at this [...]]]></description>
			<content:encoded><![CDATA[<p>I want to address the misconceptions about RRD (Round-Robin Database) that seem to crop up often even among seasoned sysadmins. Complaints can be summarized with these two points</p>
<ul>
<li>RRD doesn't offer high resolution i.e. after about an hour it's all averages, and I want to know what the metric value was last year at this hour and minute</li>
<li>Data drops off/is destroyed after a year - I want to keep my data forever, disk is cheap etc.</li>
</ul>
<p>Those are valid points; however, neither of them is the fault of RRD. RRD is a circular buffer, so in order to be able to write into it you have to precreate it (otherwise it wouldn't be a circular buffer <img src='http://blog.vuksan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> ). Obviously, the more data points you store, the bigger the RRD file will be. To illustrate the point, <a href="http://ganglia.info/">Ganglia Monitoring</a> uses the following defaults to create RRDs</p>
<p>RRAs "RRA:AVERAGE:0.5:1:244" "RRA:AVERAGE:0.5:24:244" "RRA:AVERAGE:0.5:168:244" "RRA:AVERAGE:0.5:672:244" "RRA:AVERAGE:0.5:5760:374"</p>
<p>This will create multiple circular buffers within the same RRD database file. In order to make sense of this you need to know what the polling interval is, i.e. how often you write into the RRDs. In Ganglia's case the default is 15 seconds so</p>
<ul>
<li>"RRA:AVERAGE:0.5:1:244" says write actual values (:1:) for every polling interval. Save the last 244 of those, so in our case we'll have 61 minutes (244 * 15 seconds) worth of actual data points. Since it's a circular buffer, data older than 61 minutes will be "dropped"</li>
<li>"RRA:AVERAGE:0.5:24:244" says average 24 values (:24:), 24 * 15 seconds = 360 seconds = 6 minutes. 244 of those times 6 minutes is 1464 minutes, i.e. a little over a whole day</li>
<li>You can do the next two <img src='http://blog.vuksan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </li>
<li>The last one, "RRA:AVERAGE:0.5:5760:374", says average a whole day (5760 * 15 seconds = 1440 minutes = 1 day) worth of values and store 374 of those points, i.e. a little more than a year</li>
</ul>
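<p>The arithmetic in the list above can be reproduced for all five RRAs with a short Python sketch. The function name is made up for this example. The only inputs are the RRA strings and the 15 second polling interval mentioned above.</p>
<pre>POLL_INTERVAL = 15  # seconds, the Ganglia default mentioned above

def rra_retention(rra, step=POLL_INTERVAL):
    """Return (seconds_per_row, total_seconds) for an 'RRA:CF:xff:steps:rows' string."""
    _, _, _, steps, rows = rra.split(":")
    seconds_per_row = int(steps) * step
    return seconds_per_row, seconds_per_row * int(rows)

DEFAULTS = [
    "RRA:AVERAGE:0.5:1:244",
    "RRA:AVERAGE:0.5:24:244",
    "RRA:AVERAGE:0.5:168:244",
    "RRA:AVERAGE:0.5:672:244",
    "RRA:AVERAGE:0.5:5760:374",
]

for rra in DEFAULTS:
    per_row, total = rra_retention(rra)
    print("%-26s one row per %6d s, covers %.1f days" % (rra, per_row, total / 86400.0))</pre>
<p>The two RRAs left as an exercise work out to roughly one week and four weeks of history.</p>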
<p>When graphing, RRDtool is smart enough to use the buffer which gives you the most data points. To store all this data the RRD file will use about 12 kBytes. Thus if you want higher resolution you will need to change the definition e.g. you could do this</p>
<p>"RRA:AVERAGE:0.5:1:2137440"</p>
<p>which will give you one year worth of data points at a 15 second interval with no averaging. Trouble is the size of this RRD file is 17 Mbytes. <span style="text-decoration: line-through;">This may not seem as bad but one of the RRD drawbacks is that every time you add data to an RRD the whole file is written over so if you have 1000 metrics you can be potentially writing 17 GBs of data every 15 seconds</span>. This may be a problem depending on how many metrics you are keeping track of. There are alternatives which increase throughput, such as storing RRDs on a RAM disk or using rrdcached. Alternatively you can opt to keep 2 weeks worth of data points with e.g.</p>
<p>"RRA:AVERAGE:0.5:1:81984"</p>
<p>which will result in a size of about 650 kBytes per RRD file. Or you can do something else altogether. The flip side of RRD is that there are no indexes to maintain and no tables that need to be rotated.</p>
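<p>The file sizes quoted above can be approximated by multiplying the number of rows by the size of one stored value. The Python sketch below assumes 8 bytes per value (one double-precision float) and ignores the file header, so treat the results as estimates. The function names are made up for this example.</p>
<pre>BYTES_PER_VALUE = 8  # assumption: one double-precision float per stored value

def rra_rows(rra):
    """Return the row count from an 'RRA:CF:xff:steps:rows' string."""
    return int(rra.split(":")[4])

def estimated_bytes(rras, data_sources=1):
    """Rough data size of an RRD file, ignoring the header."""
    return sum(rra_rows(r) for r in rras) * data_sources * BYTES_PER_VALUE

print(estimated_bytes(["RRA:AVERAGE:0.5:1:2137440"]))   # one year at 15 s
print(estimated_bytes(["RRA:AVERAGE:0.5:1:81984"]))     # two weeks at 15 s</pre>
<p>The two results, about 17.1 million and 656 thousand bytes, line up with the 17 Mbytes and 650 kBytes figures.</p>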
<p><strong>Update:</strong> I was wrong about the whole RRD file needing to be updated. In retrospect it makes sense and I apologize for providing the wrong info. You can read the comment from Tobi Oetiker (creator of rrdtool) in the comments below for more detail. This is actually awesome news since there is very little downside in making larger RRDs.</p>
<p>As far as Ganglia goes, you can modify the defaults in the /etc/ganglia/gmetad.conf file. You can also use gmetad-python, which allows you to write your own plugins and store metric data in RRD format, SQL or any other storage engine of your choice.</p>
<p>More on RRDtool can be found here</p>
<p><a href="http://oss.oetiker.ch/rrdtool/tut/rrdtutorial.en.html">http://oss.oetiker.ch/rrdtool/tut/rrdtutorial.en.html</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.vuksan.com/2010/12/14/misconceptions-about-rrd-storage/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Rethinking Ganglia Web UI</title>
		<link>http://blog.vuksan.com/2010/12/10/rethinking-ganglia-web-ui/</link>
		<comments>http://blog.vuksan.com/2010/12/10/rethinking-ganglia-web-ui/#comments</comments>
		<pubDate>Sat, 11 Dec 2010 01:09:47 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Monitoring]]></category>
		<category><![CDATA[Systems Management]]></category>

		<guid isPermaLink="false">http://blog.vuksan.com/?p=401</guid>
		<description><![CDATA[I have been a long time fan of Ganglia. Ganglia is a scalable distributed monitoring system initially developed for high-performance computing systems such as clusters and Grids. Today Ganglia is being used by some of the largest web properties such as Facebook, Twitter, Etsy, etc. as well as tons of smaller organizations. Some of Ganglia [...]]]></description>
			<content:encoded><![CDATA[<p>I have been a long time fan of Ganglia. Ganglia is a scalable distributed monitoring system initially developed for high-performance computing systems such as clusters and Grids. Today Ganglia is being used by some of the largest web properties such as Facebook, Twitter, Etsy, etc. as well as tons of smaller organizations. Some of Ganglia's benefits are</p>
<ul>
<li>Push based metrics ie. a lightweight agent on hosts that need to be monitored</li>
<li>Lots of basic metrics by default such as load, cpu utilization, memory utilization</li>
<li>Trivial to add new metrics ie. execute the gmetric command with metric value and graph automatically shows up</li>
<li>Decent web interface that allows you to easily drill down when troubleshooting problems</li>
</ul>
<p>I have used other monitoring systems such as Cacti, Zenoss and Zabbix and found them lacking since they were overly complicated and hard to configure and customize. That said, I have also had misgivings about certain parts of the Ganglia UI. Specifically, what I missed were the following features</p>
<ol>
<li>Ability to search hosts and metrics - looking for specific host or metric gets cumbersome even on clusters with 20-30 hosts</li>
<li>Ability to create arbitrary groupings of host metrics on one page ie. a page with web response time for each web server and mySQL lock time would be something you'd have to write custom code for</li>
<li>Easy way to create custom graphs ie. either aggregate line graphs or stacked graphs</li>
<li>Easy way to add custom graphs to either clusters or hosts ie. I have a stacked Apache report showing number of GETs vs. POSTs. It's hard or impossible to show that graph only on webservers but not on mySQL servers.</li>
<li>Mobile (WebKit) optimized experience - minimize zooming/panning etc.</li>
</ol>
<p>A couple of months ago on the #ganglia Freenode IRC channel we were discussing some of the pitfalls of the UI, and the idea of rewriting the Ganglia UI was born. As I have been doing quite a bit of work with jQuery in months past I decided to give it a shot.</p>
<h2>Goals</h2>
<p>My initial goals were</p>
<ol>
<li>Implement basic search functionality ie. one search term that will show matching hosts and metrics</li>
<li>Add a way to add "optional" graphs on a per cluster/per host basis i.e. have a default set of graphs and allow those to be overridden using cluster or host override config files</li>
<li>Add Views ie. ability to group host/metrics</li>
<li>Add Mobile/Webkit View</li>
<li>Store view and optional graphs config information in a format that can be easily manipulated by web UI, config management system or by hand - this is one of the key omissions in most monitoring setups where adding/removing hosts requires either manual intervention or kludgy hacks. As someone who has had to spend hours manually clicking around Zabbix interface whenever we added a new server this had major importance</li>
</ol>
<h2>Implementation</h2>
<p>Initially there was an idea to rebuild the whole interface from scratch which we still may do but I decided that that would be too much work especially since I wasn't absolutely sure whether my intended changes would make sense for most people. Thus I decided to modify the existing UI.</p>
<p>So far these are the features that have been implemented</p>
<h3>Visual aids</h3>
<p>In cluster view, next to each host you'll now see the full hostname in text on top of the graph. The same goes for metric names in host view. Now even if you have hundreds of metrics you can press CTRL-F in your browser and find the metric quickly. There is also a hidden anchor next to each metric which is used by the search tab.</p>
<p><a href="http://blog.vuksan.com/wp-content/uploads/2010/12/ganglia_visual_aides1.png"><img title="Ganglia Visual Aides - Cluster View" src="http://blog.vuksan.com/wp-content/uploads/2010/12/ganglia_visual_aides1.png" alt="" width="604" height="289" /></a></p>
<p>Doesn't seem like much until you need it <img src='http://blog.vuksan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> .</p>
<h3>Search</h3>
<p>Search tab allows you to type in a single term which will match hosts and metrics. It will search as you type. Hosts first, metrics on host second. Clicking on hosts opens a new window with the view of the host. Clicking on a particular metric takes you to the metric in question.</p>
<p><a href="http://blog.vuksan.com/wp-content/uploads/2010/12/ganglia_search.png"><img class="alignnone size-full wp-image-415" title="Ganglia Search Results" src="http://blog.vuksan.com/wp-content/uploads/2010/12/ganglia_search.png" alt="" width="375" height="493" /></a></p>
<h3>Views</h3>
<p>Views are defined using JSON configuration files. One JSON file per view. There are two types of views, standard and regex views. For example standard view will look like this</p>
<pre>{  "view_name":"default",
   "items":[
      {"hostname":"host1.domain.com","graph":"cpu_report"},
      {"hostname":"host2.domain.com","graph":"apache_report"}
    ],
    "view_type":"standard"
}</pre>
<p>It will group cpu_report from host1 and apache_report from host2. A regex view allows you to use regular expressions to define hosts (soon also metrics), e.g. you want to group all hosts that have imap, amavis or smtp in their names. That view definition would look something like this</p>
<pre>{  "view_name":"mailservers",
    "items":[
      {"hostname":"(imap|amavis|smtp)", "graph":"cpu_report"}
    ],
    "view_type":"regex"}</pre>
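<p>To make the difference between the two view types concrete, here is a small Python sketch of how a view definition could be expanded into a list of host/graph pairs. It is an illustration only. The function name is made up, and matching the regular expression anywhere in the hostname is an assumption about how the UI behaves.</p>
<pre>import re

def expand_view(view, known_hosts):
    """Return (hostname, graph) pairs that a view definition would display."""
    pairs = []
    for item in view["items"]:
        if view.get("view_type") == "regex":
            # Assumption: the pattern may match anywhere in the hostname.
            pattern = re.compile(item["hostname"])
            hosts = [h for h in known_hosts if pattern.search(h)]
        else:
            hosts = [item["hostname"]]
        pairs.extend((host, item["graph"]) for host in hosts)
    return pairs

mailservers = {
    "view_name": "mailservers",
    "items": [{"hostname": "(imap|amavis|smtp)", "graph": "cpu_report"}],
    "view_type": "regex",
}
hosts = ["imap1.domain.com", "web1.domain.com", "smtp2.domain.com"]
print(expand_view(mailservers, hosts))</pre>
<p>With a regex view, newly added mail servers show up in the view without anyone having to edit the file.</p>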
<p>If you don't want to edit JSON config files by hand you can use the UI to create standard views ie. first create a view then as you browse hosts there is a plus sign next to each graph. Clicking on it displays a dialog which allows you to add that particular host/metric to a view e.g.</p>
<p><a href="http://blog.vuksan.com/wp-content/uploads/2010/12/ganglia_add_metric_to_view.png"><img class="alignnone size-full wp-image-416" title="Add metric to a Ganglia view" src="http://blog.vuksan.com/wp-content/uploads/2010/12/ganglia_add_metric_to_view.png" alt="" width="529" height="300" /></a></p>
<h3>Automatic rotation</h3>
<p>Allows you to automatically rotate a view. It is an integration of <a href="http://blog.vuksan.com/2010/06/16/gangliaview-automatically-rotate-ganglia-metrics/">GangliaView</a> with Views. What's especially nice is that if you have multiple monitors you can open up separate browser windows and select different views to rotate.</p>
<h3>Mobile view</h3>
<p>There is a functional mobile view which provides mobile view of Views, Clusters and Search ie. there is very little panning or zooming. Also we are using lots of preloading ie. first page you open contains lots of hidden sub-pages in order to save on having to do subsequent requests.</p>
<p>You can view some of the <a href="http://www.flickr.com/photos/51166390@N05/sets/72157625551485278/">screenshots</a> on Flickr.</p>
<h3>Optional Graphs</h3>
<p>You can specify which optional graphs you want displayed for each host or cluster. Similar to views these are configured via JSON config files e.g. this is the default list of graphs</p>
<pre>{
	"included_reports": ["load_report","mem_report","cpu_report","network_report","packet_report"]
}</pre>
<p>You can exclude any of the default included graphs or include ones you want e.g.</p>
<p><a href="http://blog.vuksan.com/wp-content/uploads/2010/12/ganglia_edit_optional_graphs.png"><img class="alignnone size-full wp-image-418" title="Ganglia Edit Optional Graphs" src="http://blog.vuksan.com/wp-content/uploads/2010/12/ganglia_edit_optional_graphs.png" alt="" width="557" height="421" /></a></p>
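<p>One way to picture how a per-host or per-cluster override combines with the default list is the Python sketch below. Only <code>included_reports</code> appears in the example above. The <code>excluded_reports</code> key and the function name are hypothetical and used purely for illustration.</p>
<pre>def effective_reports(defaults, override):
    """Combine the default report list with a per-host or per-cluster override."""
    excluded = set(override.get("excluded_reports", []))   # hypothetical key name
    reports = [r for r in defaults["included_reports"] if r not in excluded]
    for report in override.get("included_reports", []):
        if report not in reports:
            reports.append(report)
    return reports

defaults = {"included_reports": ["load_report", "mem_report", "cpu_report",
                                 "network_report", "packet_report"]}
webserver = {"included_reports": ["apache_report"],
             "excluded_reports": ["packet_report"]}
print(effective_reports(defaults, webserver))</pre>
<p>Keeping the overrides in small files like this is what makes them easy to generate from a config management system.</p>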
<h3>Screencast</h3>
<p>If you would like to see some of these features in action you can look at <a href="http://vuksan.com/ganglia-ui.html">these screencasts</a>.</p>
<h3>Download</h3>
<p>Ready to try ? Wait no more and check it out from SVN at</p>
<p><a href="http://ganglia.svn.sourceforge.net/svnroot/ganglia/branches/monitor-web-2.0/">http://ganglia.svn.sourceforge.net/svnroot/ganglia/branches/monitor-web-2.0/ </a></p>
<h3>Future</h3>
<p>In the future we are looking into polishing the Graphite/Ganglia integration (perhaps more about that in a later post) and adding integrations with e.g. Nagios (you can see a hint of it in the add metric to view screenshot above) and Logstash. Other upcoming features will be aggregate metrics and quick views. The full TODO list can be found here</p>
<p><a href="http://sourceforge.net/apps/trac/ganglia/browser/branches/monitor-web-2.0/TODO">http://sourceforge.net/apps/trac/ganglia/browser/branches/monitor-web-2.0/TODO</a></p>
<h3>Acknowledgements</h3>
<p>I'd like to thank Erik Kastner for helping on the Graphite/Ganglia integration and Ben Hartshorne for test driving the UI and providing a number of good suggestions/ideas.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.vuksan.com/2010/12/10/rethinking-ganglia-web-ui/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Integrating Graphite with Ganglia</title>
		<link>http://blog.vuksan.com/2010/09/29/integrating-graphite-with-ganglia/</link>
		<comments>http://blog.vuksan.com/2010/09/29/integrating-graphite-with-ganglia/#comments</comments>
		<pubDate>Wed, 29 Sep 2010 14:32:56 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Monitoring]]></category>

		<guid isPermaLink="false">http://blog.vuksan.com/?p=380</guid>
		<description><![CDATA[Some time ago I saw a demo on using Graphite (http://graphite.wikidot.com/). I was impressed by the ease of creating custom graphs and the quality/visual appeal of the graphs. Trouble was that Graphite uses its own storage engine instead of RRD and I figured it may be too much work to figure out how to inject my [...]]]></description>
			<content:encoded><![CDATA[<p>Some time ago I saw a demo on using Graphite (<a href="http://graphite.wikidot.com/">http://graphite.wikidot.com/</a>). I was impressed by the ease of creating custom graphs and the quality/visual appeal of the graphs. Trouble was that Graphite uses its own storage engine instead of RRD and I figured it may be too much work to figure out how to inject my existing <a href="http://ganglia.info/">Ganglia</a> metrics.</p>
<p>A couple of days ago I saw a tweet from <a href="http://twitter.com/mikebrittain">Mike Brittain</a> at Etsy on how Graphite is becoming one of his favorite graphing tools. I know that they use Ganglia at Etsy so I asked if/how they integrate Graphite and Ganglia. He pointed me in the direction of <a href="http://twitter.com/kastner">Erik Kastner</a> who has done the Ganglia Graphite integration. I asked him if he could post the patches and he was gracious enough to do so. In a nutshell he uses RRD files directly and rsyncs them every few minutes. While trying to install Graphite I realized that injecting metrics into Graphite is really simple. For example graphite-web contains a simple client example that injects system load. All it does is connect to port 2003 of the Graphite installation and send the following payload</p>
<pre>system.loadavg_1min 0.08 1285763852
system.loadavg_5min 0.02 1285763852
system.loadavg_15min 0.01 1285763852</pre>
<p>That's simple <img src='http://blog.vuksan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  i.e. some type of a metric name, a value and what looks like the current UNIX timestamp. I then remembered that <a href="https://twitter.com/georgiou">Kostas Georgiou</a> showed me a Ruby script that connects to gmond, retrieves the XML for the host, parses it and adds the metrics to <a href="http://www.puppetlabs.com/puppet/related-projects/facter/">Facter</a>. Unfortunately that didn't seem to have much value until now <img src='http://blog.vuksan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> . What I did is change Kostas' script to send metrics to Graphite instead of adding them to Facter. You can find the result at the <a href="http://github.com/ganglia/ganglia_contrib/tree/master/graphite_integration/">Ganglia Add-Ons GitHub repository</a>. You can run the script either from cron or as a daemon.</p>
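<p>To make the payload format concrete, here is a minimal Python sketch of a client that formats and sends such lines. It is not the script from the repository: the function names and the <code>localhost</code> default are assumptions for illustration, and only port 2003 and the "name value timestamp" line layout come from the example above.</p>
<pre>
import socket
import time


def format_metric(name, value, timestamp=None):
    """Return one payload line: metric name, value, UNIX timestamp."""
    if timestamp is None:
        timestamp = int(time.time())
    return "%s %s %d\n" % (name, value, timestamp)


def send_metrics(lines, host="localhost", port=2003):
    """Open a TCP connection to the receiver and send all lines at once.

    host is a placeholder; point it at your own Graphite installation.
    """
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)


if __name__ == "__main__":
    now = int(time.time())
    send_metrics([
        format_metric("system.loadavg_1min", 0.08, now),
        format_metric("system.loadavg_5min", 0.02, now),
        format_metric("system.loadavg_15min", 0.01, now),
    ])
</pre>
<p>Keeping the formatting separate from the sending makes it easy to check the lines before anything goes over the wire.</p>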
<p>There are two ways to do this. I have tested only the first way. I am not sure if the graphite receiver would freak out if it gets too many metrics in a payload. Let me know if you know <img src='http://blog.vuksan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> .</p>
<p>1. Run this script on every host that runs gmond. This may be somewhat tricky since I usually set up gmond to only send metrics and turn off receiving by setting deaf = yes. For this approach to work you have to turn on receiving. To make it more secure we'll just listen on loopback. In the globals section make sure you have these settings</p>
<pre>  mute = no
  deaf = no</pre>
<div>In the rest of the configuration file make sure you add/have</div>
<pre>udp_send_channel {
  host = 127.0.0.1
  port = 8649
  ttl = 1
}
udp_recv_channel {
  bind = 127.0.0.1
  port = 8649
}
tcp_accept_channel {
  bind = 127.0.0.1
  port = 8649
}</pre>
<p>2. Run this script on the main gmond collector daemon, which will have metrics from all hosts. Trouble is that I haven't tested injecting thousands of metrics in a single payload. I'm sure there is a way around it and perhaps someone can post a patch <img src='http://blog.vuksan.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':-D' class='wp-smiley' /> .</p>
<h3>Future Improvements</h3>
<p>I can think of a few possible improvements</p>
<ol>
<li>There is a rewrite of gmetad written in Python. It supports plugins. I don't think it would be a stretch to add a plugin where gmetad sends data to Graphite when it updates the RRDs</li>
<li>Currently metrics are sent as &lt;hostname&gt;.&lt;metric_name&gt;. It may make sense to send them into the appropriate part of the tree i.e. &lt;type_of_metric&gt;.&lt;hostname&gt;.&lt;metric_name&gt; e.g. database.web1.mysql_selects</li>
<li>Better integrate Ganglia Web UI and Graphite. Graphite supports flexible URL parameters so this should be doable.</li>
</ol>
<p>And obligatory screenshots. This is the stacked graph I created in 20 seconds <img src='http://blog.vuksan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p><a href="http://blog.vuksan.com/wp-content/uploads/2010/09/graphite1.png"><img class="alignright size-full wp-image-384" title="graphite view" src="http://blog.vuksan.com/wp-content/uploads/2010/09/graphite1.png" alt="Graphite view of Ganglia Metrics" width="800" height="498" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.vuksan.com/2010/09/29/integrating-graphite-with-ganglia/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Analyzing your backend web page response times</title>
		<link>http://blog.vuksan.com/2010/07/15/analyzing-your-web-page-response-times/</link>
		<comments>http://blog.vuksan.com/2010/07/15/analyzing-your-web-page-response-times/#comments</comments>
		<pubDate>Fri, 16 Jul 2010 00:59:05 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Monitoring]]></category>
		<category><![CDATA[web performance optimization]]></category>

		<guid isPermaLink="false">http://blog.vuksan.com/?p=253</guid>
		<description><![CDATA[I have blogged in the past about some of the ways you can monitor your web site performance e.g. how to monitor your site using 90th percentile response times, beauty of aggregate line graphs and tracking web clients in real time. Most recently we wanted to get better insight into how our site and [...]]]></description>
			<content:encoded><![CDATA[<p>I have blogged in the past about some of the ways you can monitor your web site performance e.g. how to <a href="http://blog.vuksan.com/2010/01/15/monitoring-your-site-via-90th-percentile-response-time/">monitor your site using 90th percentile response times</a>, <a href="http://blog.vuksan.com/2010/06/05/beauty-of-aggregate-line-graphs/">beauty of aggregate line graphs</a> and <a href="http://blog.vuksan.com/2010/04/20/tracking-web-clients-in-real-time/">tracking web clients in real time</a>.</p>
<p>Most recently we wanted to get better insight into how our site, and more specifically the backend, is performing. We wanted a tool that could provide us with per URL/page metrics such as</p>
<ul>
<li>total number of requests</li>
<li>aggregate compute time</li>
<li>average request time</li>
<li>90th percentile time (explained in <a href="http://blog.vuksan.com/2010/01/15/monitoring-your-site-via-90th-percentile-response-time/">monitor your site using 90th percentile response times</a>) - this eliminates most of the really slow response times that may skew your averages</li>
</ul>
<p>Initial plan was to build a basic set of reports to tell us which pages have excessive response times or large total (aggregate) compute times. Next, and yet to be implemented, portion was to be able to analyze data in real time so that we'd have another data point to use in troubleshooting in case there is a site slowdown.</p>
<p>Basic requirements for the tool were these</p>
<ul>
<li>Capable of crunching 100+ million daily entries</li>
<li>Real-time analysis</li>
<li>Produce multiple metrics with potential to add more down the line</li>
<li>Low footprint</li>
</ul>
<p>An obvious way to do this is to store all data in a heavy duty data store like a relational/SQL database or something MapReduce capable. Trouble is we may be logging in excess of 3,000 hits per second (all dynamic content, as static assets are served from the CDN). Doing that many inserts per second on a SQL-type database will be tricky unless you have powerful hardware. Next obvious problem is that scanning through hundreds of millions or billions of rows will be slow, even with MapReduce, unless of course you throw tons of hardware at it. We wanted a low footprint, remember.</p>
<p>Instead we decided to go with a key/value store. Major pluses were that footprint is relatively low and it performs very fast. Downside was I would not be able to run any sophisticated queries. Since we already have an app that uses memcached to give us <a href="http://blog.vuksan.com/2010/04/20/tracking-web-clients-in-real-time/">real-time view per IP number of accesses</a> we ended up using it for this purpose as well.</p>
<h3>Implementation</h3>
<p>I have been working for a while now with <a href="http://bitbucket.org/maplebed/ganglia-logtailer/">ganglia-logtailer</a>, which is a Python framework to crunch log data and submit it to <a href="http://ganglia.info/">Ganglia</a>. There are a number of good pieces from it we could reuse and we did. What we ended up with is a two-part tool: a Python-based log parsing piece and a PHP-based web GUI and computation part. Division of "labor" was roughly this</p>
<ul>
<li>Python part parses the logs and creates entries/keys where the value in each key represents all the response times observed on a particular server and URL in a particular time period i.e. one hour</li>
<li>PHP part takes the list once the time period has ended, calculates total time, average time and 90th percentile time and stores the computed values in memcached so that later retrieval is quicker.</li>
</ul>
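<p>The per-key computation can be sketched in a few lines. This is only an illustration in Python (the tool itself does this step in PHP), and the function names, the dictionary keys and the nearest-rank percentile method are my assumptions rather than what the tool necessarily uses.</p>
<pre>
def percentile(ordered_times, pct):
    """Nearest-rank percentile of an already sorted, non-empty list.

    pct is a whole number such as 90.
    """
    # Integer ceiling of pct/100 * n, computed without floats.
    rank = (pct * len(ordered_times) + 99) // 100
    return ordered_times[max(rank, 1) - 1]


def summarize(times):
    """Compute request count, total, average and 90th percentile time."""
    if not times:
        return None
    ordered = sorted(times)
    total = sum(ordered)
    return {
        "requests": len(ordered),
        "total_time": total,
        "average_time": total / len(ordered),
        "p90_time": percentile(ordered, 90),
    }


if __name__ == "__main__":
    # Nine fast responses and one very slow one.
    print(summarize([0.1] * 9 + [5.0]))
</pre>
<p>With nine 0.1 second responses and a single 5 second one, the average gets pulled up by the outlier while the 90th percentile stays at 0.1, which is why the two are worth tracking side by side.</p>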
<p>Graphing is achieved using simple CSS graphs while time based series are done using <a href="http://sourceforge.net/projects/openflashchart">OpenFlashChart</a>. I did look at <a href="http://www.danvk.org/dygraphs/">Dygraphs</a> for Javascript/DHTML based graphing but couldn't figure out how to plot hourly values. I could only do daily values.</p>
<p>Tool is operational and so far it has led us to the realization that our mobile web pages are overall much slower than their corresponding web pages. This is due to the way we handle mobile ads: most feature phones don't support Javascript so we have to download the ad, which introduces a slight delay. We did figure out that we could use Javascript on Webkit browsers similar to what we do for regular browsers so that should help a bit. We are also chasing some of the other "leads" regarding inconsistent performance for particular pages on some of the servers.</p>
<p>Next steps are to adapt parsing code to work with ganglia-logtailer which would give us real-time reporting. I don't expect too many problems with that. Also graphing could use some more love. Perhaps I'll even do standard deviation calculations <img src='http://blog.vuksan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> .</p>
<p>Anyway, you can download the source code from here</p>
<p><a href="http://github.com/vvuksan/pagetime-analyzer">http://github.com/vvuksan/pagetime-analyzer</a></p>
<p>You know what to do <img src='http://blog.vuksan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> .</p>
<h2>Obligatory screenshots</h2>
<p>Hourly overview sorted by aggregate time in seconds (you can sort by any column)</p>
<p><a href="http://blog.vuksan.com/wp-content/uploads/2010/07/pt_overview.png"><img title="PageTime Analyzer hourly overview" src="http://blog.vuksan.com/wp-content/uploads/2010/07/pt_overview.png" alt="" width="700" /></a></p>
<p>This is the average response time (over an hour) for a particular URL on separate server instances</p>
<p style="text-align: center;"><a href="http://blog.vuksan.com/wp-content/uploads/2010/07/pt_url_breakdown.png"><img class="size-full wp-image-257  aligncenter" title="PageTime Analyzer URL server breakdown" src="http://blog.vuksan.com/wp-content/uploads/2010/07/pt_url_breakdown.png" alt="" width="626" height="405" /></a></p>
<p>Daily view of performance for a particular URL</p>
<p><a href="http://blog.vuksan.com/wp-content/uploads/2010/07/pt_graph.png"><img class="alignright size-full wp-image-261" title="PageTime Analyzer average/90th percentile graph" src="http://blog.vuksan.com/wp-content/uploads/2010/07/pt_graph.png" alt="" width="713" height="363" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.vuksan.com/2010/07/15/analyzing-your-web-page-response-times/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Store your cron output for analysis and correlation with cronologger</title>
		<link>http://blog.vuksan.com/2010/07/06/store-your-cron-output-with-cronologger/</link>
		<comments>http://blog.vuksan.com/2010/07/06/store-your-cron-output-with-cronologger/#comments</comments>
		<pubDate>Tue, 06 Jul 2010 12:32:41 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Monitoring]]></category>
		<category><![CDATA[Cron Linux CouchDB]]></category>

		<guid isPermaLink="false">http://blog.vuksan.com/?p=244</guid>
		<description><![CDATA[For the longest time I have wanted to get rid of the dozen or so cron messages I receive every morning about things like DB backups, DB cleanups/vacuums, reporting etc. There are a number of solutions out there to help you manage the cron spam such as cronic, shush and cronwrap. They help by e-mailing you [...]]]></description>
			<content:encoded><![CDATA[<p>For the longest time I have wanted to get rid of the dozen or so cron messages I receive every morning about things like DB backups, DB cleanups/vacuums, reporting etc. There are a number of solutions out there to help you manage the cron spam such as <a href="http://habilis.net/cronic/">cronic</a>, <a href="http://web.taranis.org/shush/">shush</a> and <a href="http://www.uow.edu.au/~sah/cronwrap.html">cronwrap</a>. They help by e-mailing you only if there is a problem but they don't store the cron output itself. To get around that issue I have developed cronologger which can be downloaded from</p>
<p><a href="http://github.com/vvuksan/cronologger">http://github.com/vvuksan/cronologger</a></p>
<p>Cronologger is a BASH script that stores all the cron output into a database. I am using <a href="http://couchdb.apache.org/">CouchDB</a> since it is a great document oriented database that allows me to add attachments (blobs) to a document. I assume it would not be hard to use MongoDB, Riak and others.</p>
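<p>To illustrate the idea only, here is a rough Python sketch of the same approach: run a job, capture its output and store a document in CouchDB over HTTP. It is not cronologger itself, which is a BASH script. The function names, all document field names and the <code>cronologger</code> database name in the URL are assumptions, and the output is stored inline rather than as an attachment.</p>
<pre>
import json
import subprocess
import sys
import time
import urllib.request


def run_and_record(command):
    """Run a command (list of arguments) and describe the run as a dict."""
    started = time.time()
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # fold stderr into the captured output
        universal_newlines=True,
    )
    # Field names below are hypothetical, chosen for this sketch.
    return {
        "command": " ".join(command),
        "start_time": int(started),
        "job_duration": round(time.time() - started, 3),
        "exit_code": result.returncode,
        "output": result.stdout,
    }


def store(doc, db_url="http://localhost:5984/cronologger"):
    """POST the document to a CouchDB database and return the reply."""
    request = urllib.request.Request(
        db_url,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    if len(sys.argv) == 1:
        sys.exit("usage: wrapper.py command [args...]")
    print(store(run_and_record(sys.argv[1:])))
</pre>
<p>In a crontab you would put such a wrapper in front of the real command, so the job's output lands in the database instead of your inbox.</p>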
<p>Some of the benefits of this utility are</p>
<ul>
<li>Reduce cron spam</li>
<li>Provide the ability to correlate adverse effects by overlaying cron events on e.g. Ganglia graphs</li>
<li>Provide a better report of all the batch jobs that ran, diff them with past jobs if they should look the same, etc.</li>
<li>Provide the ability to easily view what is currently running on the whole infrastructure i.e. job_duration &lt; 0</li>
<li>Review historical output</li>
</ul>
<p>I am still working on web GUI for most of these things. I will gladly accept patches and new contributions.</p>
<p>Tip: To view a list of documents in a CouchDB database you can use the _utils view e.g. http://localhost:5984/_utils/</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.vuksan.com/2010/07/06/store-your-cron-output-with-cronologger/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Overlay deploy timeline on Ganglia graphs</title>
		<link>http://blog.vuksan.com/2010/06/28/overlay-deploy-timeline-on-your-ganglia-graphs/</link>
		<comments>http://blog.vuksan.com/2010/06/28/overlay-deploy-timeline-on-your-ganglia-graphs/#comments</comments>
		<pubDate>Mon, 28 Jun 2010 15:55:54 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Monitoring]]></category>
		<category><![CDATA[Ganglia RRDtool]]></category>

		<guid isPermaLink="false">http://blog.vuksan.com/?p=232</guid>
		<description><![CDATA[Don't you sometimes wish you could have a visual indicator of when code has been deployed in production. Something like this This is how you can add deploy timeline to your Ganglia graphs or for that matter to any tool that uses RRDs such as Cacti, Munin, Collectd etc. Background RRDtool supports so called VRULEs [...]]]></description>
			<content:encoded><![CDATA[<p>Don't you sometimes wish you could have a visual indicator of when code has been deployed in production? Something like this <img src='http://blog.vuksan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p><a href="http://blog.vuksan.com/wp-content/uploads/2010/06/deploy_timeline.png"><img class="alignnone size-full wp-image-233" title="Deploy time line on the load graph" src="http://blog.vuksan.com/wp-content/uploads/2010/06/deploy_timeline.png" alt="Shows deploy time line on a load graph" width="577" height="224" /></a></p>
<p>This is how you can add deploy timeline to your Ganglia graphs or for that matter to any tool that uses RRDs such as Cacti, Munin, Collectd etc.</p>
<h3>Background</h3>
<p>RRDtool supports so called <a href="http://oss.oetiker.ch/rrdtool/doc/rrdgraph_graph.en.html">VRULEs</a> which are</p>
<h4 style="padding-left: 30px;"><a id="IVRULE_time_color__legend___dashes__on_s__off_s__on_s_off_s________dash_offset_offset__" title="click to go to top of document" href="http://oss.oetiker.ch/rrdtool/doc/rrdgraph_graph.en.html#___top"><strong>VRULE</strong><strong>:</strong><em>time</em><strong>#</strong><em>color</em>[<strong>:</strong><em>legend</em>][<strong>:dashes</strong>[<strong>=</strong><em>on_s</em>[,<em>off_s</em>[,<em>on_s</em>,<em>off_s</em>]...]][<strong>:dash-offset=</strong><em>offset</em>]]</a></h4>
<p style="padding-left: 30px;">Draw a vertical line at <em>time</em>. Its color is composed from three hexadecimal numbers specifying the rgb color components (00 is off, FF is maximum) red, green and blue followed by an optional alpha. Optionally, a legend box and string is printed in the legend section. <em>time</em> may be a number or a variable from a <strong>VDEF</strong>. It is an error to use <em>vname</em>s from <strong>DEF</strong> or <strong>CDEF</strong> here. Dashed lines can be drawn using the <strong>dashes</strong> modifier. See <strong>LINE</strong> for more details.</p>
<p>What we want to do is add a VRULE for each deployment. For example those three lines above have been generated using these VRULEs</p>
<div id="_mcePaste" style="padding-left: 30px;">VRULE:1277731886#FF00FF:"Deploys" VRULE:1277721886#FF00FF VRULE:1277711886#FF00FF</div>
<h3>Implementation</h3>
<p>Easiest way to add these to Ganglia is to modify graph.php in Ganglia Web. You need to look for the following two lines at the end of the file</p>
<pre>$command .=  array_key_exists('extras', $rrdtool_graph) ? ' '.$rrdtool_graph['extras'].' ' : '';
$command .=  " $rrdtool_graph[series]";</pre>
<p>Then append your own VRULEs e.g.</p>
<pre>$command .= " VRULE:" . $time . "#FF00FF:\"Deploys\"";</pre>
<p>Obviously you have to pull in the $time info from where you keep track of your deploy times. You can also get creative by using different colors for different deploys, changing legend labels or adding VRULEs to only certain graphs e.g. load, CPU etc. This is a quick and dirty way to do it</p>
<pre>$deploy_times = array(1278082860,1279393200);
foreach ( $deploy_times as $key =&gt; $time ) {
  # Put deploys label only once.
  if ( $key == 0 )
     $command .= " VRULE:" . $time . "#FF00FF:\"Deploys\"";
  else
     $command .= " VRULE:" . $time . "#FF00FF";
}
</pre>
<p>Now you just have to make sure you append new deploy times to the array.</p>
<h3>Alternate implementations</h3>
<p>Alternate implementation is to create an RRD file whenever you do deploys then overlay that graph on top of an existing graph. Trouble is you have to worry about scaling the graph. I never could get it quite right.</p>
<h3>Credit</h3>
<p>Thanks go to the <a href="http://circonus.com/">Circonus</a> guys <img src='http://blog.vuksan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  since they made me think of vertical lines instead of trying the RRD overlay. Also thanks to <a href="https://twitter.com/toredash">@toredash</a> for pointing me in the right RRDtool direction by suggesting HRULE.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.vuksan.com/2010/06/28/overlay-deploy-timeline-on-your-ganglia-graphs/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

