<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Vladimir Vuksan&#039;s blog</title>
	<atom:link href="http://blog.vuksan.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.vuksan.com</link>
	<description>Documenting the systems and network infrastructure madness</description>
	<lastBuildDate>Tue, 03 Jan 2012 03:50:38 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>RESTful way to manage your databases</title>
		<link>http://blog.vuksan.com/2012/01/02/restful-way-to-manage-your-databases/</link>
		<comments>http://blog.vuksan.com/2012/01/02/restful-way-to-manage-your-databases/#comments</comments>
		<pubDate>Tue, 03 Jan 2012 03:43:35 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[DevOps]]></category>
		<category><![CDATA[Systems Management]]></category>

		<guid isPermaLink="false">http://blog.vuksan.com/?p=490</guid>
		<description><![CDATA[In my development environment I need to easily create and drop MySQL databases and users. Initially I was going to implement a simple, hacky HTTP GET method but was dissuaded by Ben Black from doing so. He suggested I write a proper RESTful interface. Without further ado I present to you dbrestadmin https://github.com/vvuksan/dbrestadmin It is my [...]]]></description>
			<content:encoded><![CDATA[<p>In my development environment I need to easily create and drop MySQL databases and users. Initially I was going to implement a simple, hacky HTTP GET method but was dissuaded by <a href="https://twitter.com/b6n">Ben Black</a> from doing so. He suggested I write a proper RESTful interface. Without further ado I present to you dbrestadmin:</p>
<p><a href="https://github.com/vvuksan/dbrestadmin">https://github.com/vvuksan/dbrestadmin</a></p>
<p>It is my first foray into writing RESTful services, so things may be rough around the edges. However, it allows you to do the following:</p>
<ul>
<li>manage multiple database servers</li>
<li>create/drop databases</li>
<li>list databases</li>
<li>create/drop users</li>
<li>list users</li>
<li>give user grants</li>
<li>view grants given to the user</li>
<li>view database privileges on a particular database given to a user</li>
</ul>
<p>For example, to create a database called testdb on the database server with ID 0, use this cURL command:</p>
<pre>curl -X POST http://myhost/dbrestadmin/v1/databases/0/dbs/testdb</pre>
<p>Create a user test2 with password test</p>
<pre>curl -X POST "http://localhost:8000/dbrestadmin/v1/databases/0/users/test2@localhost" -d "password=test"</pre>
<p>Give test2 user all privileges on testdb</p>
<pre>curl -X POST "http://localhost:8000/dbrestadmin/databases/0/users/test2@'localhost'/grants" -d "grants=all privileges&amp;database=testdb"</pre>
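<p>If you would rather script this than type cURL commands, the first two calls above can be sketched in Python with nothing but the standard library. This is only a rough sketch: the base URL is lifted from the cURL examples, the helper names are made up for illustration, and the requests are built but not sent.</p>

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Assumption: host, port and /v1 prefix are copied from the cURL examples;
# point this at wherever dbrestadmin is actually deployed.
BASE_URL = "http://localhost:8000/dbrestadmin/v1"


def create_database_request(server_id, db_name):
    """Build (but do not send) the POST that creates a database."""
    url = f"{BASE_URL}/databases/{server_id}/dbs/{quote(db_name)}"
    return Request(url, method="POST")


def create_user_request(server_id, user, host, password):
    """Build (but do not send) the POST that creates user@host."""
    url = f"{BASE_URL}/databases/{server_id}/users/{quote(user)}@{quote(host)}"
    body = urlencode({"password": password}).encode("ascii")
    return Request(url, data=body, method="POST")


if __name__ == "__main__":
    req = create_database_request(0, "testdb")
    print(req.get_method(), req.full_url)
    req = create_user_request(0, "test2", "localhost", "test")
    print(req.get_method(), req.full_url, req.data)
    # To really send one of these: urllib.request.urlopen(req)
```

<p>Building the request separately from sending it makes it easy to check the URL and form body before anything touches a database server.</p>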
<p>There is more. You can see all of the methods here:</p>
<p><a href="https://github.com/vvuksan/dbrestadmin/blob/master/API.md">https://github.com/vvuksan/dbrestadmin/blob/master/API.md</a></p>
<p>Improvements and constructive criticism are welcome.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.vuksan.com/2012/01/02/restful-way-to-manage-your-databases/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Operating on Dell RAID arrays cheatsheet</title>
		<link>http://blog.vuksan.com/2011/11/23/operating-on-dell-raid-arrays-cheatsheet/</link>
		<comments>http://blog.vuksan.com/2011/11/23/operating-on-dell-raid-arrays-cheatsheet/#comments</comments>
		<pubDate>Wed, 23 Nov 2011 21:16:47 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.vuksan.com/?p=487</guid>
		<description><![CDATA[Every so often I have to add new drives to arrays behind Dell RAID controllers like the H700. For some reason it takes me a couple of searches to find the info, so I am putting it here so I can find it later. List all drives /opt/MegaRAID/MegaCli/MegaCli64 -PDList -aALL Create a RAID array (e.g. RAID 0) /opt/MegaRAID/MegaCli/MegaCli64 -CfgLdAdd -r0 [32:4, 32:5] -aALL List working RAID [...]]]></description>
			<content:encoded><![CDATA[<p>Every so often I have to add new drives to arrays behind Dell RAID controllers like the H700. For some reason it takes me a couple of searches to find the info, so I am putting it here so I can find it later.</p>
<p><strong>List all drives</strong></p>
<pre>/opt/MegaRAID/MegaCli/MegaCli64 -PDList -aALL</pre>
<p><strong>Create a RAID array (e.g. RAID 0)</strong></p>
<pre>/opt/MegaRAID/MegaCli/MegaCli64 -CfgLdAdd -r0 [32:4, 32:5] -aALL</pre>
<p><strong>List working RAID arrays</strong></p>
<pre>/opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -aALL</pre>
<p>Confirm you have the right RAID array, e.g. Virtual Disk 1</p>
<pre>/opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -L1 -aALL</pre>
<p><strong>Delete RAID array</strong></p>
<pre>/opt/MegaRAID/MegaCli/MegaCli64 -CfgLdDel -L1 -aALL</pre>
]]></content:encoded>
			<wfw:commentRss>http://blog.vuksan.com/2011/11/23/operating-on-dell-raid-arrays-cheatsheet/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Use fantomTest to test web pages from multiple locations</title>
		<link>http://blog.vuksan.com/2011/09/27/fantomtest-multiple-locations/</link>
		<comments>http://blog.vuksan.com/2011/09/27/fantomtest-multiple-locations/#comments</comments>
		<pubDate>Tue, 27 Sep 2011 23:39:49 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Monitoring]]></category>

		<guid isPermaLink="false">http://blog.vuksan.com/?p=472</guid>
		<description><![CDATA[In my previous post I introduced Testing your web pages with fantomtest. I have recently added the ability to test the same page from multiple sites within the same interface. You simply install a copy of fantomTest on a remote site, then configure your primary site to access it. For example, this is a test of Google [...]]]></description>
			<content:encoded><![CDATA[<p>In my previous post I introduced <a href="http://blog.vuksan.com/2011/08/02/testing-your-web-pages-with-fantomtest/">Testing your web pages with fantomtest</a>. I have recently added the ability to test the same page from multiple sites within the same interface. You simply install a copy of fantomTest on a remote site, then configure your primary site to access it. For example, this is a test of Google from my laptop.</p>
<p><a href="http://blog.vuksan.com/wp-content/uploads/2011/09/fantomtest-goog.png"><img class="alignnone size-full wp-image-473" title="FantomTest Google " src="http://blog.vuksan.com/wp-content/uploads/2011/09/fantomtest-goog.png" alt="" width="959" height="500" /></a></p>
<p>Looks like my network connection is really slow <img src='http://blog.vuksan.com/wp-includes/images/smilies/icon_sad.gif' alt=':-(' class='wp-smiley' /> . Changing the testing site to Croatia, where I have a server, I get:</p>
<p><a href="http://blog.vuksan.com/wp-content/uploads/2011/09/fantomtest-hr.png"><img class="alignnone size-full wp-image-474" title="FantomTest Google Croatia" src="http://blog.vuksan.com/wp-content/uploads/2011/09/fantomtest-hr.png" alt="" width="980" height="477" /></a></p>
<p>The result is slightly different since Google redirects me to its localized site; however, it leads me to believe that it's my connection that is slow, not Google.</p>
<p>Any number of "remotes" can be added. Want it? Get it on GitHub:</p>
<p><a href="https://github.com/vvuksan/fantomtest">https://github.com/vvuksan/fantomtest</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.vuksan.com/2011/09/27/fantomtest-multiple-locations/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using Jenkins as a Cron Server</title>
		<link>http://blog.vuksan.com/2011/08/22/using-jenkins-as-a-cron-server/</link>
		<comments>http://blog.vuksan.com/2011/08/22/using-jenkins-as-a-cron-server/#comments</comments>
		<pubDate>Mon, 22 Aug 2011 21:33:49 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Systems Management]]></category>

		<guid isPermaLink="false">http://blog.vuksan.com/?p=466</guid>
		<description><![CDATA[There are a number of problems with cron that cause lots of grief for system administrators, the big ones being manageability, cron spam and auditability. To fix some of these issues I have lately started using Jenkins. Jenkins is an open source Continuous Integration server with lots of features that make it a great cron [...]]]></description>
			<content:encoded><![CDATA[<p>There are a number of problems with cron that cause lots of grief for system administrators, the big ones being manageability, cron spam and auditability. To fix some of these issues I have lately started using <a href="http://jenkins-ci.org/">Jenkins</a>. <a href="http://jenkins-ci.org/">Jenkins</a> is an open source Continuous Integration server with lots of features that make it a great cron replacement for a number of uses. These are some of the problems it solves for me:</p>
<h3>Auditability</h3>
<p>Jenkins can be configured to retain logs of all jobs that it has run. You can set it up to keep the last 10 runs, or to keep only the last 2 weeks of logs. This is incredibly useful since jobs can sometimes fail silently, so it's good to have the output instead of sending it to /dev/null.</p>
<h3>Centralized management</h3>
<p>I have my most important jobs centralized. I can export all Jenkins jobs as XML and check them into a repository. If I need to execute jobs on remote hosts, I simply have Jenkins ssh in and execute the command remotely. Alternatively you can use <a href="https://wiki.jenkins-ci.org/display/JENKINS/Distributed+builds">Jenkins slaves</a>.</p>
<h3>Cron Spam</h3>
<p>Cron spam is a common problem with solutions such as <a href="http://habilis.net/cronic/">this</a>, <a href="http://iamthewalr.us/blog/2007/10/howto-make-cron-not-spam-you-to-death/">this</a> and <a href="http://blog.dynamichosting.biz/2010/11/01/stop-crond-from-sending-e-mails/#more-83">this</a>. To avoid this condition I only have Jenkins alert me when a particular job fails, i.e. when a job exits with a return code other than 0. In addition you can use the awesome <a href="https://wiki.jenkins-ci.org/display/JENKINS/Text-finder+Plugin">Jenkins Text Finder</a> plugin, which allows you to specify words or regular expressions to look for in the console output. A match can be used to mark a job as unstable. For example, in the Text Finder config I checked</p>
<p style="padding-left: 30px;">X Also search the console output</p>
<p style="padding-left: 30px;">and specified</p>
<p style="padding-left: 30px;">Regular expression ([Ee]rror*).*</p>
<p>This has saved our bacon since we used the automysqlbackup.sh script, which "swallows" the error codes from the mysqldump command and exits normally. Text Finder caught this:</p>
<p><code>mysqldump: Error 2020: Got packet bigger than 'max_allowed_packet' bytes when dumping table `users` at row: 234<br />
</code></p>
<p>Happily we caught this one in time.</p>
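<p>To see why that expression fires on the mysqldump line, here is a rough approximation in Python. Text Finder uses Java regular expressions, so treat this as a sketch of the matching logic rather than of the plugin itself; the "good" sample line is made up for illustration.</p>

```python
import re

# The pattern from the Text Finder config above. Note that "rror*" means
# "rro" followed by zero or more "r", so it also matches plain "Erro".
PATTERN = re.compile(r"([Ee]rror*).*")


def is_unstable(console_output):
    """Return True if any line of the console output matches PATTERN."""
    return any(PATTERN.search(line) for line in console_output.splitlines())


if __name__ == "__main__":
    # Hypothetical output of a clean run.
    good = "Backup of database users completed\n"
    # The line mysqldump printed even though the wrapper script exited 0.
    bad = ("mysqldump: Error 2020: Got packet bigger than "
           "'max_allowed_packet' bytes when dumping table `users` at row: 234\n")
    print(is_unstable(good))
    print(is_unstable(bad))
```

<p>The point is that the check looks at what the job printed, not at its exit code, which is exactly what you need when a wrapper script hides failures.</p>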
<h3>Job dependency</h3>
<p>Often you will have job dependencies, e.g. a main backup job where you first dump a database locally and then upload it somewhere off-site or to the cloud. The way we have done this in the past is to leave a sufficiently large window between the first job and the subsequent job to be sure the first job has finished. This says nothing about what to do if the first job fails. Likely the second one will fail too. With Jenkins I no longer have to do that. I can simply tell Jenkins to trigger "backup to the cloud" once the local DB backup concludes successfully.</p>
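<p>The behaviour I am after boils down to "run the next step only if the previous one exited with 0". A minimal Python sketch of that idea follows. It is not how Jenkins implements triggers, and the two commands are harmless stand-ins for the real dump and upload jobs.</p>

```python
import subprocess
import sys


def run_chain(commands):
    """Run commands in order and stop at the first non-zero exit code.

    Returns the exit codes of the commands that actually ran.
    """
    codes = []
    for cmd in commands:
        code = subprocess.run(cmd).returncode
        codes.append(code)
        if code != 0:
            break  # do not start downstream jobs after a failure
    return codes


if __name__ == "__main__":
    # Hypothetical stand-ins for "dump the database" and "upload the dump".
    dump = [sys.executable, "-c", "print('dumping database')"]
    upload = [sys.executable, "-c", "print('uploading dump')"]
    print(run_chain([dump, upload]))
```

<p>With plain cron you would have to wrap both steps in a script like this yourself, and you would still lack the per-step logs and alerts.</p>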
<h3>Test immediately</h3>
<p>While you are adding a job it's useful to test whether the job runs properly. With cron you often had to wait until the job executed at e.g. 3 am to discover that PATH wasn't set properly or that there was some other problem with your environment. With Jenkins I can click Build Now and the job will run immediately.</p>
<h3>Easy setup</h3>
<p>Setting up jobs is easy. I have engineers set up their own jobs by copying an existing job and modifying it to do what they need. I don't remember the last time someone asked me how to do it <img src='http://blog.vuksan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> .</p>
<h3>What I don't use Jenkins for</h3>
<p>I don't use Jenkins to run jobs that collect metrics or anything that has to run too often.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.vuksan.com/2011/08/22/using-jenkins-as-a-cron-server/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Testing your web pages with fantomtest</title>
		<link>http://blog.vuksan.com/2011/08/02/testing-your-web-pages-with-fantomtest/</link>
		<comments>http://blog.vuksan.com/2011/08/02/testing-your-web-pages-with-fantomtest/#comments</comments>
		<pubDate>Tue, 02 Aug 2011 13:28:32 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Monitoring]]></category>

		<guid isPermaLink="false">http://blog.vuksan.com/?p=452</guid>
		<description><![CDATA[Coming from a web operations background, my web site/page monitoring had largely focused on metrics such as average request duration, 90th percentile request duration etc. These are all great metrics; however, through Velocity Conferences I have come to appreciate that there is a lot more to web performance than simply knowing how long it [...]]]></description>
			<content:encoded><![CDATA[<p>Coming from a web operations background, my web site/page monitoring had largely focused on metrics such as average request duration, 90th percentile request duration etc. These are all great metrics; however, through <a href="http://velocityconf.com/">Velocity Conferences</a> I have come to appreciate that there is a lot more to web performance than simply knowing how long it takes to load the HTML of a web page. As a result I have been looking for ways to get better metrics by utilizing real browsers instead of Perl/Ruby/Python scripts. For some time I have been playing with Selenium RC to give me an easy way to test and time my web application. Unfortunately I found it heavy and slow. At the last Velocity conference I was fortunate enough to see a demo of <a href="http://phantomjs.org/">PhantomJS</a>. PhantomJS is a semi-headless WebKit browser with JavaScript support. What I really appreciated about it is that it is lightweight, fast and very easy to instrument using JavaScript. In addition it includes a number of useful examples such as netsniff.js, which outputs an HTTP Archive (HAR) of the requests made for a given web page. From a HAR file you can build, among other things, waterfall charts.</p>
<p>There are a number of services you can use to have your site tested for free, e.g. <a href="http://webpagetest.org/">webpagetest.org</a>. The limitation is that they can't test your intranet infrastructure, since that is usually behind a firewall, nor can they test remote sites that are connected to your intranet via a VPN.</p>
<p>That is why I'm introducing fantomTest, a simple web application that allows you to generate waterfall graphs using PhantomJS. It will also take a screenshot of the rendered page. Here is what that looks like:</p>
<p><a href="http://blog.vuksan.com/wp-content/uploads/2011/08/screenshot2.png"><img class="alignnone size-large wp-image-458" title="Google Waterfall Chart" src="http://blog.vuksan.com/wp-content/uploads/2011/08/screenshot2-1024x322.png" alt="" width="1024" height="322" /></a></p>
<p>What's interesting in this particular case is that Google is not following web performance recommendations: it uses an HTTP redirect from google.com to www.google.com.</p>
<p>Anyway, to get fantomTest go to</p>
<p><a href="https://github.com/vvuksan/fantomtest">https://github.com/vvuksan/fantomtest</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.vuksan.com/2011/08/02/testing-your-web-pages-with-fantomtest/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Monitoring links and monitoring anti-patterns video</title>
		<link>http://blog.vuksan.com/2011/06/05/monitoring-links-and-monitoring-anti-pattern-video/</link>
		<comments>http://blog.vuksan.com/2011/06/05/monitoring-links-and-monitoring-anti-pattern-video/#comments</comments>
		<pubDate>Mon, 06 Jun 2011 01:37:42 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.vuksan.com/?p=443</guid>
		<description><![CDATA[John Vincent, aka lusis, has started an interesting conversation about monitoring on Freenode, on a channel he named ##monitoringsucks. He has also done an awesome job of starting up a GitHub project of the same name that is shaping up to be a nice collection of links to tools and blog posts. Check it out https://github.com/monitoringsucks/ [...]]]></description>
			<content:encoded><![CDATA[<p>John Vincent, aka <a href="https://twitter.com/lusis">lusis</a>, has started an interesting conversation about monitoring on Freenode, on a channel he named ##monitoringsucks. He has also done an awesome job of starting up a GitHub project of the same name that is shaping up to be a nice collection of links to tools and blog posts. Check it out:</p>
<p><a href="https://github.com/monitoringsucks/">https://github.com/monitoringsucks/</a></p>
<p>Also, we just got hold of the monitoring anti-patterns Ignite Talk from <a href="http://devopsdays.org/">Devopsdays Boston</a> by Alexis Lê-Quôc, aka <a href="https://twitter.com/#!/alq">@alq</a>. It is a short video (5 minutes), so it's definitely worth seeing.</p>
<p><script src="http://admin.brightcove.com/js/BrightcoveExperiences.js" type="text/javascript"></script> <object id="myExperience977787579001" class="BrightcoveExperience"><param name="bgcolor" value="#FFFFFF" /><param name="width" value="480" /><param name="height" value="270" /><param name="playerID" value="831686916001" /><param name="playerKey" value="AQ~~,AAAAwXeOKzE~,gDtKhDWZs1sPjB37DFykmXTwNJL_QzPW" /><param name="isVid" value="true" /><param name="isUI" value="true" /><param name="dynamicStreaming" value="true" /><param name="@videoPlayer" value="977787579001" /></object> <!--  This script tag will cause the Brightcove Players defined above it to be created as soon as the line is read by the browser. If you wish to have the player instantiated only after the rest of the HTML is processed and the page load is complete, remove the line. --> <script type="text/javascript">// <![CDATA[
   brightcove.createExperiences();
// ]]&gt;</script><!-- End of Brightcove Player --></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.vuksan.com/2011/06/05/monitoring-links-and-monitoring-anti-pattern-video/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Use your trending data for alerting</title>
		<link>http://blog.vuksan.com/2011/04/19/use-your-trending-data-for-alerting/</link>
		<comments>http://blog.vuksan.com/2011/04/19/use-your-trending-data-for-alerting/#comments</comments>
		<pubDate>Tue, 19 Apr 2011 19:59:49 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Monitoring]]></category>
		<category><![CDATA[Nagios]]></category>

		<guid isPermaLink="false">http://blog.vuksan.com/?p=440</guid>
		<description><![CDATA[This post will deal with helping you use the data you already have to do alerting. It is most helpful for people running Nagios or its variants such as Icinga, Netreo etc. It could likely be used with other decoupled alerting systems (not Zabbix or Zenoss though, since they do their own trending). Recently I [...]]]></description>
			<content:encoded><![CDATA[<p>This post will deal with helping you use the data you already have to do alerting. It is most helpful for people running Nagios or its variants such as Icinga, Netreo etc. It could likely be used with other decoupled alerting systems (not Zabbix or Zenoss though, since they do their own trending).</p>
<p>Recently I came to the realization that lots of sysadmins are unaware that they could easily use the trending data they already capture with systems such as Ganglia, Graphite, Collectd, Munin etc. to do alerting. The standard way of doing health checks of remote nodes in Nagios is to install the <a href="http://exchange.nagios.org/directory/Addons/Monitoring-Agents/NRPE-%252D-Nagios-Remote-Plugin-Executor/details">Nagios Remote Plugin Executor aka. NRPE</a>, which allows you to execute Nagios plugins on remote nodes and pipe the output to the Nagios server. NRPE does the job; however, it has three major disadvantages:</p>
<ol>
<li>It is another daemon that needs to run on the remote host possibly introducing security concerns</li>
<li>Depending on the load of the machine, it can be slow, thus bogging down the Nagios server</li>
<li>Last and most important is that commonly it's used to alert on common metrics such as disk, load, CPU, swap which you should be trending anyways.</li>
</ol>
<p>Instead, what you ought to be doing is using trending data for alerting. I can think of at least 4 reasons to do so:</p>
<ol>
<li>You may already be collecting pertinent data, e.g. system load, swap, CPU utilization</li>
<li>If you are alerting on a particular metric you should likely be trending it</li>
<li>It's fast</li>
<li>It allows you to do more sophisticated checks easily, e.g. alert me if more than 5 hosts have a load greater than 5</li>
</ol>
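<p>As a rough illustration of that last kind of check, here is a small Python sketch that takes a dictionary of host loads and produces a Nagios-style result. The function name, thresholds and sample values are made up for illustration; a real plugin would pull the loads from your trending system and exit with the returned code.</p>

```python
OK, CRITICAL = 0, 2  # standard Nagios plugin exit codes


def check_overloaded_hosts(loads, load_threshold=5.0, max_hosts=5):
    """Return (exit_code, message) given a dict of hostname -> load.

    CRITICAL when more than max_hosts hosts exceed load_threshold.
    """
    overloaded = sorted(h for h, load in loads.items() if load > load_threshold)
    summary = f"{len(overloaded)} hosts with load above {load_threshold}"
    if len(overloaded) > max_hosts:
        return CRITICAL, f"CRITICAL - {summary}: {', '.join(overloaded)}"
    return OK, f"OK - {summary}"


if __name__ == "__main__":
    # Hypothetical sample; a real check would read these values from
    # Ganglia, Graphite or whatever trending system you already run.
    sample = {f"web{i}": 6.5 for i in range(1, 7)}
    sample["db1"] = 0.4
    code, message = check_overloaded_hosts(sample)
    print(code, message)
```

<p>A check like this is awkward with NRPE, since no single remote host knows the load of all the others, but it is trivial once the data already sits in one place.</p>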
<p>Years ago I used Ganglia Web PHP code to write my own generic <a href="http://vuksan.com/linux/nagios_scripts.html#check_ganglia_metrics">Nagios Ganglia plugin</a>. This has served me well. Most recently Michael Conigliaro rewrote the script in Python, making it more versatile and more powerful. You can download it from here:</p>
<p><a href="https://github.com/ganglia/ganglia_contrib/tree/master/nagios">https://github.com/ganglia/ganglia_contrib/tree/master/nagios</a></p>
<p>In a nutshell, it downloads the whole metrics tree, i.e. the list of all hosts with their associated metrics, caches it for a configurable amount of time, then uses <a href="http://packages.python.org/NagAconda/plugin.html">NagAconda</a> to support all the threshold reporting as defined in the <a href="http://nagiosplug.sourceforge.net/developer-guidelines.html#THRESHOLDFORMAT">Nagios developer guidelines</a>.</p>
<p>Another alternative, if you have a very large site, is Ganglios, which was open sourced by the guys at Linden Lab. Their problem is/was that they have thousands of hosts and downloading the whole metrics tree takes ~15 seconds, so they have separated the logic that downloads the metrics tree from the logic that does the alerting. You can download Ganglios from</p>
<p><a href="https://bitbucket.org/maplebed/ganglios">https://bitbucket.org/maplebed/ganglios</a></p>
<p>This can easily be adapted to work with your trending system of choice.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.vuksan.com/2011/04/19/use-your-trending-data-for-alerting/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>JSON representation for graphs in Ganglia</title>
		<link>http://blog.vuksan.com/2011/02/20/json-representation-for-graphs-in-ganglia/</link>
		<comments>http://blog.vuksan.com/2011/02/20/json-representation-for-graphs-in-ganglia/#comments</comments>
		<pubDate>Mon, 21 Feb 2011 02:38:46 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Monitoring]]></category>

		<guid isPermaLink="false">http://blog.vuksan.com/?p=431</guid>
		<description><![CDATA[Recently, thanks to work done by Alex Dean, aka @mostlyalex, the Ganglia UI supports defining custom graphs using JSON. Prior to this the only way to create custom graphs was by writing custom PHP code. This has two major problems: lots of people are not comfortable writing or modifying PHP code, and you have to [...]]]></description>
			<content:encoded><![CDATA[<p>Recently, thanks to work done by Alex Dean, aka <a href="https://twitter.com/mostlyalex">@mostlyalex</a>, the Ganglia UI supports defining custom graphs using JSON. Prior to this the only way to create custom graphs was by writing custom PHP code. This has two major problems: lots of people are not comfortable writing or modifying PHP code, and you have to target a particular graphing engine, e.g. rrdtool. As I have written in the past, we are going to support both rrdtool and graphite for graphing, so having a common way to describe graphs has been one of our goals.</p>
<p>To describe a custom graph you would create a JSON file similar to this one:</p>
<pre>{
 "report_name" : "network_report",
 "report_type" : "standard",
 "title" : "Network Report",
 "vertical_label" : "Bytes/sec",
 "series" : [
 { "metric": "bytes_in", "color": "33cc33", "label": "In", "line_width": "2", "type": "line" },
 { "metric": "bytes_out", "color": "5555cc", "label": "Out", "line_width": "2", "type": "line" }
 ]
}</pre>
<p>This will create a line graph with the bytes_in and bytes_out metrics. Since hostname and cluster are not specified, it is assumed that we want metrics for the host we are currently viewing. You could however specify a particular host and metric you want to graph by adding hostname and clustername attributes to a series, e.g.</p>
<pre>{
 "report_name" : "our_load_report",
 "report_type" : "standard",
 "title" : "Load Report vs. Database Load",
 "vertical_label" : "Loads",
 "series" : [
 { "metric": "load_one", "color": "3333bb", "label": "Load 1", "line_width": "2", "type": "line" },
 { "hostname": "db1.domain.com", "clustername": "Databases", "metric": "load_one", "color": "44ddbb", "label": "DB1 Load 1", "line_width": "2", "type": "line" }
 ]
}</pre>
<p>To use the reports, all you have to do is put them in the $GANGLIA_WEB_ROOT/graph.d directory. Name them something_report.json and they will be available for any host in the cluster. There is one important thing to note. By default the graphing function will look for PHP definitions of graphs first, as those in theory provide more power and flexibility, and will use the JSON definition only if no PHP definition is available.</p>
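<p>Since a stray comma is enough to make a JSON file unparseable, it can be worth sanity-checking a report before dropping it into graph.d. The Python sketch below is just one way to do that. The list of expected keys is my assumption based on the examples above, not something Ganglia enforces, and the function name is made up.</p>

```python
import json

# Assumption: these are simply the top-level keys used in the examples
# above; Ganglia itself may require fewer or accept more.
EXPECTED_KEYS = {"report_name", "report_type", "title", "vertical_label", "series"}


def metric_names(report_text):
    """Parse a report definition and return the metrics it graphs."""
    report = json.loads(report_text)  # raises ValueError on malformed JSON
    missing = EXPECTED_KEYS - report.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return [series["metric"] for series in report["series"]]


NETWORK_REPORT = """
{
 "report_name" : "network_report",
 "report_type" : "standard",
 "title" : "Network Report",
 "vertical_label" : "Bytes/sec",
 "series" : [
  { "metric": "bytes_in", "color": "33cc33", "label": "In", "line_width": "2", "type": "line" },
  { "metric": "bytes_out", "color": "5555cc", "label": "Out", "line_width": "2", "type": "line" }
 ]
}
"""

if __name__ == "__main__":
    print(metric_names(NETWORK_REPORT))
```

<p>Running something like this over every *_report.json file before deploying catches syntax slips early, instead of leaving you to wonder why a graph never shows up.</p>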
<h3>Types of graphs</h3>
<p>Currently both line and stacked graphs are supported. Look in the graph.d/ directory for additional examples.</p>
<h3>Future</h3>
<p>I am particularly excited about this feature as it allows us to define <a href="http://blog.vuksan.com/2010/06/05/beauty-of-aggregate-line-graphs/">aggregate graphs</a> easily. There is even an alpha implementation of functionality which would allow you to specify a metric and a regex host entry and you would end up with an aggregate graph <img src='http://blog.vuksan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> .</p>
<h3>Download location</h3>
<p>The latest version of the UI can be downloaded either from the <a href="http://ganglia.svn.sourceforge.net/svnroot/ganglia/branches/monitor-web-2.0/">Ganglia Monitor Web 2.0 SVN branch</a> or from <a href="https://github.com/vvuksan/ganglia-misc">GitHub</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.vuksan.com/2011/02/20/json-representation-for-graphs-in-ganglia/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Misconceptions about RRD storage</title>
		<link>http://blog.vuksan.com/2010/12/14/misconceptions-about-rrd-storage/</link>
		<comments>http://blog.vuksan.com/2010/12/14/misconceptions-about-rrd-storage/#comments</comments>
		<pubDate>Tue, 14 Dec 2010 22:04:57 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Monitoring]]></category>
		<category><![CDATA[Systems Management]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ganglia rrd monitoring]]></category>

		<guid isPermaLink="false">http://blog.vuksan.com/?p=424</guid>
		<description><![CDATA[I want to address the misconceptions about RRD (Round-Robin Database) that seem to crop up often even among seasoned sysadmins. Complaints can be summarized with these two points RRD doesn't offer high resolution, i.e. after about an hour it's all averages and I want to know what the metric value was last year at this [...]]]></description>
			<content:encoded><![CDATA[<p>I want to address the misconceptions about RRD (Round-Robin Database) that seem to crop up often even among seasoned sysadmins. Complaints can be summarized with these two points:</p>
<ul>
<li>RRD doesn't offer high resolution, i.e. after about an hour it's all averages and I want to know what the metric value was last year at this hour and minute</li>
<li>Data drops off/is destroyed after a year - I want to keep my data forever, disk is cheap etc.</li>
</ul>
<p>Those are valid points; however, neither of them is the fault of RRD. RRD is a circular buffer, so in order to be able to write into it you have to precreate it (otherwise it wouldn't be a circular buffer <img src='http://blog.vuksan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> ). Obviously, the more data points you store, the bigger the RRD file will be. To illustrate the point, <a href="http://ganglia.info/">Ganglia Monitoring</a> uses the following defaults to create RRDs:</p>
<p>RRAs "RRA:AVERAGE:0.5:1:244" "RRA:AVERAGE:0.5:24:244" "RRA:AVERAGE:0.5:168:244" "RRA:AVERAGE:0.5:672:244" "RRA:AVERAGE:0.5:5760:374"</p>
<p>This will create multiple circular buffers within the same RRD database file. In order to make sense of this you need to know what the polling interval is, i.e. how often you write into the RRDs. In Ganglia's case the default is 15 seconds, so</p>
<ul>
<li>"RRA:AVERAGE:0.5:1:244" says write actual values (:1:) for every polling interval. Save the last 244 of those, so in our case we'll have 61 minutes' worth of actual data points. Since it's a circular buffer, data older than 61 minutes will be "dropped"</li>
<li>"RRA:AVERAGE:0.5:24:244" says average 24 values (:24:), 24 * 15 seconds = 360 seconds = 6 minutes. 244 of those times 6 minutes is a little over a whole day (24.4 hours)</li>
<li>You can do the next two <img src='http://blog.vuksan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </li>
<li>The last one, "RRA:AVERAGE:0.5:5760:374", says average a whole day (5760 * 15 seconds = 1440 minutes = 1 day) worth of values and store it in 374 points, i.e. a little more than a year</li>
</ul>
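<p>If you want to check the arithmetic for all five buffers, including the two left as an exercise, a few lines of Python will do. This is only a sketch of the calculation described above; it assumes the 15 second polling interval and does not touch any RRD files.</p>

```python
STEP_SECONDS = 15  # Ganglia's default polling interval, as stated above

RRAS = [
    "RRA:AVERAGE:0.5:1:244",
    "RRA:AVERAGE:0.5:24:244",
    "RRA:AVERAGE:0.5:168:244",
    "RRA:AVERAGE:0.5:672:244",
    "RRA:AVERAGE:0.5:5760:374",
]


def retention_seconds(rra, step=STEP_SECONDS):
    """Seconds of history held by one RRA: steps-per-row * rows * step."""
    _, _cf, _xff, steps, rows = rra.split(":")
    return int(steps) * int(rows) * step


if __name__ == "__main__":
    for rra in RRAS:
        secs = retention_seconds(rra)
        print(f"{rra}: {secs / 3600:.1f} hours ({secs / 86400:.1f} days)")
```

<p>The same function works for any RRA line you are considering, which makes it easy to see how far back a given definition reaches before you commit to it.</p>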
<p>When graphing, RRDtool is smart enough to use the buffer which gives you the most data points. To store all this data the RRD file will use about 12 kBytes. Thus if you want higher resolution you will need to change the definition, e.g. you could do this</p>
<p>"RRA:AVERAGE:0.5:1:2137440"</p>
<p>which will give you one year's worth of data points at a 15 second interval with no averaging. The trouble is that the size of this RRD file is about 17 MBytes. <span style="text-decoration: line-through;">This may not seem as bad but one of the RRD drawbacks is that every time you add data to an RRD the whole file is written over so if you have 1000 metrics you can be potentially writing 17 GBs of data every 15 seconds</span>. This may be a problem depending on how many metrics you are keeping track of. There are alternatives which increase throughput, such as storing RRDs on a RAM disk or using rrdcached. Alternatively you can opt to keep 2 weeks' worth of data points with e.g.</p>
<p>"RRA:AVERAGE:0.5:1:81984"</p>
<p>which will result in a size of about 650 kBytes per RRD file. Or you can do something else altogether. The flip side of RRD is that there are no indexes to maintain and no tables that need to be rotated.</p>
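<p>The file sizes quoted above can be roughly reproduced with a back-of-the-envelope calculation. The sketch below assumes each stored data point takes 8 bytes (a double) and ignores the file header, so expect the real files to be slightly bigger; the function name is made up.</p>

```python
BYTES_PER_VALUE = 8  # assumption: one 8-byte double per stored data point


def approx_rra_bytes(rows, data_sources=1):
    """Approximate on-disk size of the data area for the given row count."""
    return rows * data_sources * BYTES_PER_VALUE


if __name__ == "__main__":
    default_rows = 244 * 4 + 374  # the five default Ganglia RRAs combined
    for label, rows in [
        ("defaults", default_rows),
        ("1 year at 15 s", 2137440),
        ("2 weeks at 15 s", 81984),
    ]:
        print(f"{label}: {approx_rra_bytes(rows) / 1000:.0f} kB")
```

<p>Multiply the result by the number of metrics you track to get a feel for the total disk footprint of a given RRA definition.</p>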
<p><strong>Update:</strong> I was wrong about the whole RRD file needing to be updated. In retrospect it makes sense and I apologize for providing the wrong info. You can read comment from Tobi Oetiker (creator of rrdtool) in comments below for more detail. This is actually awesome news since there is very little downside in making larger RRDs.</p>
<p>As far as Ganglia goes, you can modify the defaults in the /etc/ganglia/gmetad.conf file. You can also use gmetad-python which allows you to write your own plugins and store metric data in RRD format, SQL or any other storage engine of your choice.</p>
<p>More on RRDtool can be found here</p>
<p><a href="http://oss.oetiker.ch/rrdtool/tut/rrdtutorial.en.html">http://oss.oetiker.ch/rrdtool/tut/rrdtutorial.en.html</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.vuksan.com/2010/12/14/misconceptions-about-rrd-storage/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Rethinking Ganglia Web UI</title>
		<link>http://blog.vuksan.com/2010/12/10/rethinking-ganglia-web-ui/</link>
		<comments>http://blog.vuksan.com/2010/12/10/rethinking-ganglia-web-ui/#comments</comments>
		<pubDate>Sat, 11 Dec 2010 01:09:47 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Monitoring]]></category>
		<category><![CDATA[Systems Management]]></category>

		<guid isPermaLink="false">http://blog.vuksan.com/?p=401</guid>
		<description><![CDATA[I have been a long time fan of Ganglia. Ganglia is a scalable distributed monitoring system initially developed for high-performance computing systems such as clusters and Grids. Today Ganglia is being used by some of the largest web properties such as Facebook, Twitter, Etsy, etc. as well as tons of smaller organizations. Some of Ganglia [...]]]></description>
			<content:encoded><![CDATA[<p>I have been a long-time fan of Ganglia. Ganglia is a scalable distributed monitoring system initially developed for high-performance computing systems such as clusters and Grids. Today Ganglia is being used by some of the largest web properties such as Facebook, Twitter, Etsy, etc. as well as tons of smaller organizations. Some of Ganglia's benefits are</p>
<ul>
<li>Push based metrics ie. a lightweight agent on hosts that need to be monitored</li>
<li>Lots of basic metrics by default such as load, cpu utilization, memory utilization</li>
<li>Trivial to add new metrics ie. execute the gmetric command with metric value and graph automatically shows up</li>
<li>Decent web interface that allows you to easily drill down when troubleshooting problems</li>
</ul>
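<p>To illustrate the gmetric point from the list above, here is a rough Python sketch that builds a gmetric call and only runs it if the gmetric binary is present on the host. The metric name, value and units are made up, and the helper gmetric_command is hypothetical. The command line options used are gmetric's --name, --value, --type and --units.</p>

```python
import shutil
import subprocess


def gmetric_command(name, value, metric_type="uint32", units=""):
    """Build the argv list for a single gmetric call."""
    cmd = ["gmetric", "--name", name, "--value", str(value), "--type", metric_type]
    if units:
        cmd += ["--units", units]
    return cmd


# Hypothetical metric, used only for illustration.
cmd = gmetric_command("apache_gets", 1234, units="requests")
print(" ".join(cmd))

# Only send the metric when gmetric is actually installed on this host.
if shutil.which("gmetric"):
    subprocess.run(cmd, check=True)
```

<p>Run from cron or from inside an application, something along these lines is all it should take for a new metric to show up.</p>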
<p>I have used other monitoring systems such as Cacti, Zenoss and Zabbix and found them lacking since they were overly complicated and hard to configure and customize. That said I have also had misgivings about certain parts of the Ganglia UI. Specifically what I missed were the following features</p>
<ol>
<li>Ability to search hosts and metrics - looking for specific host or metric gets cumbersome even on clusters with 20-30 hosts</li>
<li>Ability to create arbitrary groupings of host metrics on one page ie. a page with web response time for each web server and mySQL lock time would be something you'd have to write custom code for</li>
<li>Easy way to create custom graphs ie. either aggregate line graphs or stacked graphs</li>
<li>Easy way to add custom graphs to either clusters or hosts ie. I have a stacked Apache report showing number of GETs vs. POSTs. It's hard or impossible to show that graph only on webservers but not on mySQL servers.</li>
<li>Mobile (WebKit) optimized experience - minimize zooming/panning etc.</li>
</ol>
<p>A couple of months ago on the #ganglia Freenode IRC channel we were discussing some of the pitfalls of the UI and the idea of rewriting the Ganglia UI was born. As I have been doing quite a bit of work with jQuery in months past I decided to give it a shot.</p>
<h2>Goals</h2>
<p>My initial goals were</p>
<ol>
<li>Implement basic search functionality ie. one search term that will show matching hosts and metrics</li>
<li>Add a way to add "optional" graphs on per cluster/per host basis ie. have a default set of graphs and allow those to be overridden using cluster or host override config files</li>
<li>Add Views ie. ability to group host/metrics</li>
<li>Add Mobile/Webkit View</li>
<li>Store view and optional graphs config information in a format that can be easily manipulated by web UI, config management system or by hand - this is one of the key omissions in most monitoring setups where adding/removing hosts requires either manual intervention or kludgy hacks. As someone who has had to spend hours manually clicking around Zabbix interface whenever we added a new server this had major importance</li>
</ol>
<h2>Implementation</h2>
<p>Initially there was an idea to rebuild the whole interface from scratch which we still may do but I decided that that would be too much work especially since I wasn't absolutely sure whether my intended changes would make sense for most people. Thus I decided to modify the existing UI.</p>
<p>So far these are the features that have been implemented</p>
<h3>Visual aids</h3>
<p>In cluster view you'll now see the full hostname in text on top of each host's graph. The same goes for metric names in host view. Now even if you have hundreds of metrics you can press Ctrl-F in your browser and find the metric quickly. There is also a hidden anchor next to each metric which is used by the search tab.</p>
<p><a href="http://blog.vuksan.com/wp-content/uploads/2010/12/ganglia_visual_aides1.png"><img title="Ganglia Visual Aides - Cluster View" src="http://blog.vuksan.com/wp-content/uploads/2010/12/ganglia_visual_aides1.png" alt="" width="604" height="289" /></a></p>
<p>Doesn't seem like much until you need it <img src='http://blog.vuksan.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> .</p>
<h3>Search</h3>
<p>Search tab allows you to type in a single term which will match hosts and metrics. It will search as you type. Hosts first, metrics on host second. Clicking on hosts opens a new window with the view of the host. Clicking on a particular metric takes you to the metric in question.</p>
<p><a href="http://blog.vuksan.com/wp-content/uploads/2010/12/ganglia_search.png"><img class="alignnone size-full wp-image-415" title="Ganglia Search Results" src="http://blog.vuksan.com/wp-content/uploads/2010/12/ganglia_search.png" alt="" width="375" height="493" /></a></p>
<h3>Views</h3>
<p>Views are defined using JSON configuration files, one JSON file per view. There are two types of views, standard and regex views. For example, a standard view will look like this</p>
<pre>{  "view_name":"default",
   "items":[
      {"hostname":"host1.domain.com","graph":"cpu_report"},
      {"hostname":"host2.domain.com","graph":"apache_report"}
    ],
    "view_type":"standard"
}</pre>
<p>It will group cpu_report from host1 and apache_report from host2. A regex view allows you to use regular expressions to define hosts (soon also metrics) ie. you want to group all hosts that have imap, amavis or smtp in their names. That view definition would look something like this</p>
<pre>{  "view_name":"mailservers",
    "items":[
      {"hostname":"(imap|amavis|smtp)", "graph":"cpu_report"}
    ],
    "view_type":"regex"}</pre>
<p>If you don't want to edit JSON config files by hand you can use the UI to create standard views ie. first create a view then as you browse hosts there is a plus sign next to each graph. Clicking on it displays a dialog which allows you to add that particular host/metric to a view e.g.</p>
<p><a href="http://blog.vuksan.com/wp-content/uploads/2010/12/ganglia_add_metric_to_view.png"><img class="alignnone size-full wp-image-416" title="Add metric to a Ganglia view" src="http://blog.vuksan.com/wp-content/uploads/2010/12/ganglia_add_metric_to_view.png" alt="" width="529" height="300" /></a></p>
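<p>To make the difference between standard and regex views more concrete, here is a small Python sketch of how a view definition could be expanded into host/graph pairs. This is not the code the UI uses. The host list is made up, and matching the pattern anywhere in the hostname is an assumption based on the "have imap, amavis or smtp in their names" description above.</p>

```python
import json
import re

# The regex view definition shown earlier in this post.
view = json.loads("""
{ "view_name": "mailservers",
  "items": [ {"hostname": "(imap|amavis|smtp)", "graph": "cpu_report"} ],
  "view_type": "regex" }
""")


def expand_view(view, known_hosts):
    """Return the (hostname, graph) pairs a view would display."""
    pairs = []
    for item in view["items"]:
        if view["view_type"] == "regex":
            # Assumption: the pattern may match anywhere in the hostname.
            pattern = re.compile(item["hostname"])
            hosts = [h for h in known_hosts if pattern.search(h)]
        else:
            # Standard views name each host explicitly.
            hosts = [item["hostname"]]
        pairs.extend((h, item["graph"]) for h in hosts)
    return pairs


# Hypothetical hosts, used only for illustration.
hosts = ["imap1.domain.com", "smtp2.domain.com", "web1.domain.com"]
print(expand_view(view, hosts))
```

<p>With the made up host list above only the imap and smtp hosts end up in the view. A standard view simply passes its explicit hostnames through.</p>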
<h3>Automatic rotation</h3>
<p>Allows you to automatically rotate a view. It is an integration of <a href="http://blog.vuksan.com/2010/06/16/gangliaview-automatically-rotate-ganglia-metrics/">GangliaView</a> with Views. What's especially nice is that if you have multiple monitors you can open up separate browser windows and select different views to rotate.</p>
<h3>Mobile view</h3>
<p>There is a functional mobile view which provides mobile-optimized versions of Views, Clusters and Search, i.e. there is very little panning or zooming. We also use lots of preloading, i.e. the first page you open contains lots of hidden sub-pages in order to avoid subsequent requests.</p>
<p>You can view some of the <a href="http://www.flickr.com/photos/51166390@N05/sets/72157625551485278/">screenshots</a> on Flickr.</p>
<h3>Optional Graphs</h3>
<p>You can specify which optional graphs you want displayed for each host or cluster. Similar to views these are configured via JSON config files e.g. this is the default list of graphs</p>
<pre>{
	"included_reports": ["load_report","mem_report","cpu_report","network_report","packet_report"]
}</pre>
<p>You can exclude any of the default included graphs or include ones you want e.g.</p>
<p><a href="http://blog.vuksan.com/wp-content/uploads/2010/12/ganglia_edit_optional_graphs.png"><img class="alignnone size-full wp-image-418" title="Ganglia Edit Optional Graphs" src="http://blog.vuksan.com/wp-content/uploads/2010/12/ganglia_edit_optional_graphs.png" alt="" width="557" height="421" /></a></p>
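<p>As a rough illustration of how a per host or per cluster override could combine with the default list, here is a short Python sketch. The "included_reports" key comes from the default config shown above. The "excluded_reports" key, the apache_report entry and the merge rule itself are assumptions made for the example and may not match what the UI does.</p>

```python
import json

# Default list of optional graphs, as shown above.
defaults = json.loads(
    '{"included_reports": ["load_report", "mem_report", "cpu_report",'
    ' "network_report", "packet_report"]}'
)

# Hypothetical host override; "excluded_reports" is an assumed key name.
override = {
    "included_reports": ["apache_report"],
    "excluded_reports": ["packet_report"],
}


def effective_reports(defaults, override):
    """Apply an override to the default report list, keeping the default order."""
    excluded = set(override.get("excluded_reports", []))
    reports = [r for r in defaults["included_reports"] if r not in excluded]
    for report in override.get("included_reports", []):
        if report not in reports and report not in excluded:
            reports.append(report)
    return reports


print(effective_reports(defaults, override))
```

<p>In this sketch a web server would drop packet_report and gain apache_report while every other host keeps the defaults.</p>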
<h3>Screencast</h3>
<p>If you would like to see some of these features in action you can look at <a href="http://vuksan.com/ganglia-ui.html">these screencasts</a>.</p>
<h3>Download</h3>
<p>Ready to try? Wait no more and check it out from SVN at</p>
<p><a href="http://ganglia.svn.sourceforge.net/svnroot/ganglia/branches/monitor-web-2.0/">http://ganglia.svn.sourceforge.net/svnroot/ganglia/branches/monitor-web-2.0/</a></p>
<h3>Future</h3>
<p>In the future we are looking into polishing the Graphite/Ganglia integration (perhaps more about that in a later post) and adding integrations with e.g. Nagios (you can see a hint of it in the add metric to view screenshot above) and Logstash. Other upcoming features will be aggregate metrics and quick views. The full TODO list can be found here</p>
<p><a href="http://sourceforge.net/apps/trac/ganglia/browser/branches/monitor-web-2.0/TODO">http://sourceforge.net/apps/trac/ganglia/browser/branches/monitor-web-2.0/TODO</a></p>
<h3>Acknowledgements</h3>
<p>I'd like to thank Erik Kastner for helping on the Graphite/Ganglia integration and Ben Hartshorne for test driving the UI and providing a number of good suggestions/ideas.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.vuksan.com/2010/12/10/rethinking-ganglia-web-ui/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
	</channel>
</rss>

