<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jeff Blaine</title>
	<atom:link href="http://www.kickflop.net/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.kickflop.net/blog</link>
	<description></description>
	<lastBuildDate>Fri, 20 Apr 2012 20:26:39 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>Parasitic Losses</title>
		<link>http://www.kickflop.net/blog/2012/04/20/parasitic-losses/</link>
		<comments>http://www.kickflop.net/blog/2012/04/20/parasitic-losses/#comments</comments>
		<pubDate>Fri, 20 Apr 2012 19:58:10 +0000</pubDate>
		<dc:creator>JB</dc:creator>
				<category><![CDATA[DevOps]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Sysadmin]]></category>
		<category><![CDATA[UNIX/Linux]]></category>

		<guid isPermaLink="false">http://www.kickflop.net/blog/?p=1536</guid>
		<description><![CDATA[Subtitle: &#8220;Derrrr &#8230; alert on stuff.&#8221; For a while now, we&#8217;ve been in a foggy area with our monitoring and alerting. I&#8217;ve been aware for months that we should be doing more than we currently are. We&#8217;ve been running Nagios for years and do an okay job with it in basic form. We also have Ganglia [...]]]></description>
			<content:encoded><![CDATA[<p>Subtitle: &#8220;Derrrr &hellip; alert on stuff.&#8221;<span id="more-1536"></span></p>
<p>For a while now, we&#8217;ve been in a foggy area with our monitoring and alerting. I&#8217;ve been aware for months that we should be doing more than we currently are. We&#8217;ve been running <a href="http://www.nagios.org/">Nagios</a> for years and do an okay job with it in basic form. We also have <a href="http://ganglia.info/">Ganglia</a> metrics being collected for all of our servers, but they are largely ignored until needed for debugging. And finally, I&#8217;ve also just started seriously evaluating Graphite + various collectors as of 2 weeks ago. We do not <em>currently</em> perform any alerting based on the data in Ganglia. It is all a slowly moving work in progress.</p>
<p>Today, I just so happened to have a look at Ganglia and noticed 2 servers with elevated status (showing as yellow instead of green). They were both <a href="http://www.openafs.org/">OpenAFS</a> &#8220;database&#8221; servers (we have 4). To avoid sidetracking this blog post: they are mainly depended on to provide OpenAFS storage volume location information.</p>
<p>It&#8217;s worth pointing out that there was no noticeable OpenAFS service degradation to the clients of the affected servers.</p>
<p>We&#8217;ll only look at one because they both suffered the same problem. Below, you can see to the left side of the graphs the scene I was presented with at the time I noticed the problem. We have completely flat load that is elevated, completely flat low CPU idle time, and a completely flat stream of ~360KB/sec total network traffic:</p>
<p><img src="http://www.kickflop.net/blog/wp-content/uploads/2012/04/shiva-load.png" alt="" title="shiva-load" width="397" height="319" class="aligncenter size-full wp-image-1540" /><br />
<img src="http://www.kickflop.net/blog/wp-content/uploads/2012/04/shiva-cpu_idle.png" alt="" title="shiva-cpu_idle" width="397" height="263" class="aligncenter size-full wp-image-1539" /><br />
<img src="http://www.kickflop.net/blog/wp-content/uploads/2012/04/shiva-bytes.png" alt="" title="shiva-bytes" width="397" height="291" class="aligncenter size-full wp-image-1538" /></p>
<p>Though the graphs above show the last 4 hours, digging further back in time (not shown here) showed that this situation started at around 9AM on April 9th. The flat graph shape seen above at the left of all 3 graphs was found in our data from that day and time all the way to today! Hmmm. What changed around 9AM on April 9th? A little digging through my email and our revision control history showed absolutely nothing, so it was time to dive in with no hints.</p>
<p>I found the OpenAFS Volume Location server process, <code>vlserver</code>, eating a steady ~60% of the CPU on each of the 2 servers in question. Additionally, looking into the flat ~360KB/sec network data with <code>snoop</code> showed a solid stream of OpenAFS UDP packets coming from port 7001 on a client node to port 7003 (<code>vlserver</code>!) on this server:</p>
<pre>
# OpenAFS client request to vlserver then response back
rogle.ourdomain -> shiva.ourdomain UDP D=7003 S=7001 LEN=56
shiva.ourdomain -> rogle.ourdomain UDP D=7001 S=7003 LEN=40
</pre>
<p>Very rough calculations indicate at least 2500 of these exchanges were happening per second.</p>
<p>Was this legit traffic from some long-running job created by one of the engineers? Looking more closely with <code>snoop -x</code> showed that the request and response were identical every time. Something was clearly broken somewhere on host <code>rogle</code>.</p>
<p>Once logged into <code>rogle</code>, I found 2 <code>bash</code> processes with high CPU utilization time.  One is shown below.</p>
<pre>
jfivale  19137     1  0 Mar07 ?        03:43:03 bash
</pre>
<p>That&#8217;s not so interesting on its own, but adding in the detail that <code>bash</code> for our users is actually <code>/afs/rcf.ourdomain/some/path/bin/bash</code> changes things quite a bit.</p>
<p>Brain says:</p>
<blockquote><p>Wait a second. Some number of days ago we had a weird situation where we had to kill off 5-10 <code>bash</code> processes on several hosts due to them eating loads of CPU time. They had eaten far more CPU time than these, but this is getting familiar.</p></blockquote>
<p>Sure enough, when I attached to the processes with <code>strace -f</code>, they showed the same behavior as the broken processes a week or so ago:</p>
<pre>
...
rt_sigreturn(0x7)                       = 0
--- SIGBUS (Bus error) @ 0 (0) ---
rt_sigreturn(0x7)                       = 0
--- SIGBUS (Bus error) @ 0 (0) ---
rt_sigreturn(0x7)                       = 0
--- SIGBUS (Bus error) @ 0 (0) ---
rt_sigreturn(0x7)                       = 0
--- SIGBUS (Bus error) @ 0 (0) ---
rt_sigreturn(0x7)                       = 0
--- SIGBUS (Bus error) @ 0 (0) ---
rt_sigreturn(0x7)                       = 0
--- SIGBUS (Bus error) @ 0 (0) ---
...
</pre>
<p>or alternatively:</p>
<pre>
...
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
rt_sigreturn(0xb)                       = 316804680
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
rt_sigreturn(0xb)                       = 316804680
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
rt_sigreturn(0xb)                       = 316804680
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
rt_sigreturn(0xb)                       = 316804680
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
...
</pre>
<p>Once I had killed off the offending <code>bash</code> processes, both of the affected servers returned to the normalcy you see in the right-hand portion of the graphs above.</p>
<p>The question then was: What was going on with the <code>bash</code> processes? Unfortunately, I have no answer. The <code>bash</code> processes that spiraled out of control all belonged to a small subset of a certain department&#8217;s users and no other problematic instances of the same binary on the same hosts were reported.</p>
<p>Alert on things that are abnormalities in your particular environment. Over 2500 volume location lookups per second is not normal for us. A flat and consistent load on our OpenAFS servers is also not normal. This should have been noticed and fixed within a few hours on April 9th.</p>
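<p>As a rough illustration of what such a check could look like, here is a minimal sketch that flags a metric which is both elevated and suspiciously flat. The function name, the thresholds, and the sample numbers are all assumptions made up for this example. It is not tied to Nagios, Ganglia, or Graphite in any way.</p>

```python
from statistics import mean, pstdev


def is_flat_and_elevated(samples, baseline, min_ratio=2.0, max_rel_spread=0.02):
    """Return True if samples sit well above baseline and barely move.

    samples        -- recent metric values, e.g. one per minute
    baseline       -- what "normal" looks like for this metric (assumed known)
    min_ratio      -- how many times the baseline counts as elevated
    max_rel_spread -- largest standard deviation, relative to the mean,
                      that still counts as a flat line
    """
    if len(samples) == 0:
        return False
    avg = mean(samples)
    if not avg > 0:
        return False
    elevated = avg >= baseline * min_ratio
    flat = max_rel_spread >= pstdev(samples) / avg
    return elevated and flat


# Hypothetical lookups-per-second readings; the baseline of 50 is invented.
stuck = [2500, 2510, 2495, 2505]
bursty = [40, 900, 10, 300]
print(is_flat_and_elevated(stuck, baseline=50))
print(is_flat_and_elevated(bursty, baseline=50))
```

<p>The point of combining the two conditions is that bursty legitimate work is elevated but not flat, and an idle server is flat but not elevated. Only the stuck-in-a-loop pattern described above is both.</p>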
]]></content:encoded>
			<wfw:commentRss>http://www.kickflop.net/blog/2012/04/20/parasitic-losses/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Any-Metric Graphing with Graphite and Syslog</title>
		<link>http://www.kickflop.net/blog/2012/03/30/any-metric-graphing-with-graphite-and-syslog/</link>
		<comments>http://www.kickflop.net/blog/2012/03/30/any-metric-graphing-with-graphite-and-syslog/#comments</comments>
		<pubDate>Sat, 31 Mar 2012 02:19:00 +0000</pubDate>
		<dc:creator>JB</dc:creator>
				<category><![CDATA[DevOps]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Sysadmin]]></category>
		<category><![CDATA[UNIX/Linux]]></category>
		<category><![CDATA[Visualization]]></category>

		<guid isPermaLink="false">http://www.kickflop.net/blog/?p=1472</guid>
		<description><![CDATA[I&#8217;ve developed a solution to the idea posed the other day. It wasn&#8217;t hard, but here&#8217;s an official write-up of the effort for what it&#8217;s worth. Who did what now? Apparently first showing its face as part of Eric Allman&#8217;s sendmail, syslog is available from the ubiquitous logger(1) UNIX/Linux command to the Log4j Java library [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve developed a solution to the idea posed <a href="http://www.kickflop.net/blog/2012/03/24/specialized-syslog-collector-for-metrics-via-syslog/">the other day</a>. It wasn&#8217;t hard, but here&#8217;s an official write-up of the effort for what it&#8217;s worth.<span id="more-1472"></span></p>
<h3>Who did what now?</h3>
<p>Apparently first showing its face as part of Eric Allman&#8217;s sendmail, <a href="http://en.wikipedia.org/wiki/Syslog">syslog</a> is available from the ubiquitous <code>logger(1)</code> UNIX/Linux command to the <a href="http://logging.apache.org/log4j/index.html">Log4j</a> Java library and everywhere in between. Syslog provides simplistic fire-and-forget UDP[<a href="#fn1">1</a>] messages with a timestamp.</p>
<p>We&#8217;ve had a centralized syslog setup for a good 13 years now. Given its nearly zero barrier to entry for developers, systems, and sysadmins, why not use syslog over UDP to deliver metrics?</p>
<p>We&#8217;ve been very passively using <a href="http://ganglia.info">Ganglia</a> for what amounts to our largely &#8220;get something going&#8221; metric gathering effort. Having heard about <a href="http://graphite.readthedocs.org/en/latest/index.html">Graphite</a> (technically the home is <a href="http://graphite.wikidot.com/">here</a> but&#8230; ew) many times now and having played with it briefly a few months ago, I decided to see what all the hubbub was about.</p>
<p>After about 2 hours of fussing around and reading[<a href="#fn2">2</a>], I had Graphite up and functioning and&#8230; there it was. Oh, right. I have to somehow feed it data.</p>
<p>There&#8217;s this <a href="https://github.com/etsy/statsd">StatsD</a> thing I keep hearing in the same breath as Graphite. Turns out it is for (drum roll) allowing special UDP fire-and-forget metric packets to be sent to it where the daemon aggregates data before sending calculated statistics off to Graphite&#8217;s carbon backend. We don&#8217;t have a use for that right now and it&#8217;s yet another thing to run and maintain, so I&#8217;m passing it over for the time being.</p>
<p>Right. I like using, when possible, what is widely available instead of distributing new package or library dependencies to all nodes in our infrastructure. <strong>And so, the goal is</strong>: Figure out how to get arbitrary metrics (zero configuration per new metric desired) sent to a central location via syslog and get graphs.</p>
<h3>Enter Logster</h3>
<p>It became clear from the generous replies to my <a href="http://www.kickflop.net/blog/2012/03/24/specialized-syslog-collector-for-metrics-via-syslog/">request for comments</a> that, at least for this proof of concept effort, <a href="https://github.com/etsy/logster">Logster</a> was going to play a role.</p>
<p>Logster is typically instantiated via cron with command-line options indicating a &#8220;parser&#8221; to use and a log file to watch. It determines (via its <code>logtail</code> dependency) what was last seen in the log file from the previous cron run, and feeds all new lines through the specified parser. The parser (essentially a custom plugin you write in Python) is responsible for returning instances of <code>MetricObject</code>. If you&#8217;re curious about the parsers right now, have a look at one of the samples from the source: <a href="https://github.com/etsy/logster/blob/master/parsers/SampleLogster.py">parsers/SampleLogster.py</a></p>
<pre>
syslogmaster% sudo yum install -y logcheck
syslogmaster% cd /tmp
syslogmaster% sudo wget -q --no-check-certificate -O logster.tar.gz https://github.com/etsy/logster/tarball/master
syslogmaster% sudo tar xzf logster.tar.gz
syslogmaster% cd logster*
syslogmaster% vi logster # look around for crap to change and change it
syslogmaster% make install
syslogmaster% cd parsers &#038;&#038; echo "Now write a parser..."
</pre>
<p>On our syslog master where this was installed, I cut my teeth on a simple parser for our Cyrus IMAP server, copied it to <code>/usr/share/logster</code> where it could be picked up properly, and made a per-minute cron job:</p>
<pre>
0-59 * * * * /usr/sbin/logster -o graphite --graphite-host=rcf-metrics:2003 CyrusIMAPSyslog /speaker/log/auth.log
</pre>
<p>Rad. A graph. I think this server is a little underutilized:</p>
<p><img src="http://www.kickflop.net/blog/wp-content/uploads/2012/03/imap-pop.png" alt="" title="imap-pop" width="400" height="250" class="center boxed size-full wp-image-1488" /></p>
<h3>The MetricLogster Parser</h3>
<p>Since I had Logster successfully feeding working data into Graphite&#8217;s mouth, I set out to the real task at hand and crafted my &#8220;MetricLogster&#8221; parser, which can be found in <a href="https://github.com/jblaine/logster/blob/master/parsers/MetricLogster.py">my fork</a> of the Logster repo on GitHub (etsy/logster pull request <a href="https://github.com/etsy/logster/pull/12">12</a> has been filed).</p>
<h3>Interlude: Syslog snag!</h3>
<p>If you syslog frequently enough, you get the dreaded <code>last message repeated N times</code> lines in your log file and not actual data. I worked around this by switching our syslog master server from RHEL 5&#8217;s syslogd to a hand-build of <a href="http://www.rsyslog.com/">rsyslog</a> (which comes as default in RHEL 6). The rsyslog package has a <a href="http://www.rsyslog.com/doc/rsconf1_repeatedmsgreduction.html"><code>$RepeatedMsgReduction</code></a> global directive which one can set to <code>off</code> to always log data and never log those annoying messages.</p>
<h3>Back to The MetricLogster Parser</h3>
<p>Let&#8217;s test this thing out.</p>
<p>Generate syslog data with random numbers from bash&#8217;s builtin <code>$RANDOM</code>:</p>
<pre>
client% while :; do \
logger -t whatever -p local1.info metric=test.bash.random value=$RANDOM; \
sleep 2; \
done
</pre>
<p>On the syslog master server, instead of waiting for cron every minute (since we&#8217;re testing), run Logster every 3 seconds and feed our metric data into Graphite/carbon:</p>
<pre>
syslogmaster% while :; do \
sudo /usr/sbin/logster -o graphite --graphite-host=rcf-metrics:2003 \
                       MetricLogster /speaker/log/local1.log ; \
sleep 3; \
done
</pre>
<p>Generates this kooky but successful graph:</p>
<p><img src="http://www.kickflop.net/blog/wp-content/uploads/2012/03/bash-random.png" alt="" title="bash-random" width="400" height="250" class="center boxed size-full wp-image-1494" /></p>
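<p>For a feel of what a parser for these lines has to do, here is a minimal standalone sketch. It is <em>not</em> the MetricLogster code and it does not use Logster&#8217;s plugin API. The function names and the sample syslog line are assumptions made up for the example. The output format is Graphite&#8217;s plaintext protocol, one <code>path value timestamp</code> line per datapoint.</p>

```python
import re

# Matches the "metric=NAME value=NUMBER" payload written by the logger loop.
METRIC_RE = re.compile(r"metric=([\w.-]+)\s+value=(-?\d+(?:\.\d+)?)")


def parse_line(line):
    """Return (metric, value) for a metric line, or None for anything else."""
    match = METRIC_RE.search(line)
    if match is None:
        return None
    return match.group(1), float(match.group(2))


def to_graphite(metric, value, timestamp):
    """Format one datapoint as a Graphite plaintext line."""
    return "%s %s %d\n" % (metric, value, timestamp)


# Hypothetical line as it might appear in the local1 log file.
sample = "Mar 30 22:10:01 client whatever: metric=test.bash.random value=12345"
parsed = parse_line(sample)
if parsed is not None:
    print(to_graphite(parsed[0], parsed[1], 1333160000), end="")
```

<p>Anything that does not match the pattern is simply skipped, which keeps the &#8220;zero configuration per new metric&#8221; goal intact: a new metric name needs no change on the collecting side.</p>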
<h3>Notes and Ideas</h3>
<ul>
<li>rsyslog supports input and output modules. If there were an output module for <code>rsyslog</code> that wrote directly to Graphite (technically carbon-cache or carbon-relay), we could remove Logster, its <code>logcheck</code> dependency (for <code>logtail</code>) and the custom Logster parser module (MetricLogster) from the whole picture.</li>
<li>If <a href="https://github.com/etsy/statsd">StatsD</a> accepted well-formed basic syslog data, that would be slick too.</li>
<li>I have no idea how well Logster scales.</li>
<li>The keys in the metric syslog lines could really be shortened to <code>m</code> and <code>v</code> (from <code>metric</code> and <code>value</code>) to save 9 bytes per log line.</li>
<li>I am an infant in this arena.</li>
<li>My statistics knowledge is 1/100th what it needs to be.</li>
<li>There needs to be a Graphite and Friends resource hub/wiki.</li>
<li>Sure, this isn&#8217;t a gigantic achievement of mine, but words here are free and I&#8217;d be remiss to not say: Thanks to the <a href="http://codeascraft.etsy.com/">Etsy</a> and Orbitz engineers who have made Logster and Graphite freely available, the rsyslog developers, all of the generous people who offered comments to my previous post on this topic, and all of the noisy people I follow on Twitter who share info and/or provide me with ideas or answers weekly.</li>
</ul>
<h4>Footnotes</h4>
<ol>
<li id="fn1">Unless TCP is specifically configured as the transport where supported. Use whatever suits your needs &#8211; the information above still applies.</li>
<li id="fn2">If the graphite-dev team at Launchpad would be so kind as to let me join the club, I would happily submit patches to a lot of things I found wrong, even if I do have to learn stupid <a href="http://bazaar.canonical.com/">Bazaar</a>. It&#8217;s been a week now or so since my request to join. Do you guys want help or not? ;)</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://www.kickflop.net/blog/2012/03/30/any-metric-graphing-with-graphite-and-syslog/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Specialized Syslog Collector for Metrics via Syslog</title>
		<link>http://www.kickflop.net/blog/2012/03/24/specialized-syslog-collector-for-metrics-via-syslog/</link>
		<comments>http://www.kickflop.net/blog/2012/03/24/specialized-syslog-collector-for-metrics-via-syslog/#comments</comments>
		<pubDate>Sat, 24 Mar 2012 15:55:33 +0000</pubDate>
		<dc:creator>JB</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Sysadmin]]></category>
		<category><![CDATA[UNIX/Linux]]></category>

		<guid isPermaLink="false">http://www.kickflop.net/blog/?p=1460</guid>
		<description><![CDATA[I figured I would throw this out to the wolves, before working on any of the ideas, in hope of collecting experience from any who have worked with this idea before me. I&#8217;m likely to implement at least a proof of concept unless someone points out a glaring show-stopping logic flaw. Idea Collect Graphite (or [...]]]></description>
			<content:encoded><![CDATA[<p>I figured I would throw this out to the wolves, before working on any of the ideas, in hope of collecting experience from any who have worked with this idea before me. I&#8217;m likely to implement at least a proof of concept unless someone points out a glaring show-stopping logic flaw.<span id="more-1460"></span></p>
<h2>Idea</h2>
<p>Collect <a href="http://graphite.wikidot.com/">Graphite</a> (or Graphite-style) centralized metrics (anything you can think of) via <a href="http://en.wikipedia.org/wiki/Syslog">syslog</a>. We don&#8217;t need a new way to send and collect small UDP messages.</p>
<h2>Why</h2>
<p>&#8220;Why not?&#8221; is the question really.</p>
<p>&#8220;Why?&#8221; is easy:</p>
<ol>
<li>Uses as much omnipresent tooling as possible.</li>
<li>Uses established system library calls instead of requiring developers to write, install, or copy/paste new ones.</li>
<li>Because I found myself asking the following this past week: &#8220;Why am I about to build and package GNU netcat, for distribution to all of our Solaris 10 boxes, to get remarkably simple frigging UDP packets sent from shell commands?&#8221;</li>
</ol>
<h2>How: Client (Ideas)</h2>
<p>Standard syslog configuration:</p>
<pre>well-known-facility.well-known-severity        @metricshost</pre>
<p>Developers use standard syslog calls, specifying the <code>well-known-facility</code> and <code>well-known-severity</code>.</p>
<p>For non-real-time metrics from &#8220;shell land&#8221;, just use <a href="https://www.google.com/search?q=logger+command+man+page&#038;oq=logger+command+man+page&#038;aq=f"><code>logger(1)</code></a>.</p>
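<p>To make the &#8220;standard syslog calls&#8221; point concrete, here is a minimal sketch using Python&#8217;s standard <code>syslog</code> module (UNIX only). The <code>metric=... value=...</code> message layout, the <code>metrics</code> ident, and the choice of <code>local1.info</code> as the well-known facility and severity are assumptions for illustration, not something this proposal has settled on.</p>

```python
import syslog


def format_metric(name, value):
    """Build the message body; the key=value layout is an assumption."""
    return "metric=%s value=%s" % (name, value)


def send_metric(name, value):
    """Log one metric at local1.info through the system's syslog(3)."""
    # "metrics" is a hypothetical ident; LOG_LOCAL1 stands in for the
    # well-known facility, LOG_INFO for the well-known severity.
    syslog.openlog("metrics", 0, syslog.LOG_LOCAL1)
    syslog.syslog(syslog.LOG_INFO, format_metric(name, value))


# Example call (commented out so importing this file logs nothing):
# send_metric("test.python.answer", 42)
```

<p>Nothing here knows about the metrics host. Forwarding is left entirely to the local syslog configuration shown above, which is the whole appeal.</p>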
<h2>How: Server (Ideas)</h2>
<p>Tweak rsyslog or other well-known syslog &#8220;collector&#8221; product to:</p>
<ol>
<li>Parse basic Graphite or Graphite-like metric messages and perform RRD and/or Whisper writes. It could additionally implement built-in stats aggregation like StatsD quite easily.</li>
<li>Ignore all non-metric-conforming syslog data.</li>
</ol>
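<p>The two server-side steps can be sketched in a few lines. This is only an illustration of the idea, under the assumption that metric messages look like <code>metric=NAME value=NUMBER</code>. The function name is hypothetical, and a real collector would write to RRD or Whisper rather than return a dictionary.</p>

```python
import re
from collections import defaultdict

# Assumed message layout; anything else is treated as ordinary syslog noise.
METRIC_RE = re.compile(r"metric=([\w.-]+)\s+value=(-?\d+(?:\.\d+)?)")


def aggregate(lines):
    """Average the values seen per metric over one flush interval."""
    values = defaultdict(list)
    for line in lines:
        match = METRIC_RE.search(line)
        if match is None:
            continue  # ignore non-metric-conforming syslog data
        values[match.group(1)].append(float(match.group(2)))
    return {name: sum(vals) / len(vals) for name, vals in values.items()}


# Hypothetical batch of received messages.
batch = [
    "host1 app: metric=web.latency value=10",
    "host1 sshd[99]: session opened for user root",
    "host2 app: metric=web.latency value=20",
]
print(aggregate(batch))
```

<p>Averaging is just one choice. Sums, counts, or percentiles per interval would slot into the same loop.</p>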
<p>Thanks for any thoughts below or via the <a href="https://groups.google.com/d/topic/devops-toolchain/zUYaAEarMv4/discussion">original thread</a> that links here.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kickflop.net/blog/2012/03/24/specialized-syslog-collector-for-metrics-via-syslog/feed/</wfw:commentRss>
		<slash:comments>16</slash:comments>
		</item>
		<item>
		<title>256-color support under GNU screen, including non-GUI vim</title>
		<link>http://www.kickflop.net/blog/2012/03/02/256-color-support-under-gnu-screen-including-non-gui-vim/</link>
		<comments>http://www.kickflop.net/blog/2012/03/02/256-color-support-under-gnu-screen-including-non-gui-vim/#comments</comments>
		<pubDate>Sat, 03 Mar 2012 02:57:23 +0000</pubDate>
		<dc:creator>JB</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Sysadmin]]></category>
		<category><![CDATA[UNIX/Linux]]></category>

		<guid isPermaLink="false">http://www.kickflop.net/blog/?p=1450</guid>
		<description><![CDATA[Finally, I got 256-color support under GNU screen, including vim! I really don&#8217;t know how or why this was so difficult for me to get working, but it was a royal time sink. I read just about every thread everywhere about anything related to this topic. All I can say is: This worked for me. [...]]]></description>
			<content:encoded><![CDATA[<p>Finally, I got 256-color support under GNU screen, including vim!  I really don&#8217;t know how or why this was so difficult for me to get working, but it was a royal time sink.  I read just about every thread everywhere about anything related to this topic.</p>
<p>All I can say is: This worked for me.</p>
<p>I&#8217;ve turned off comments on this post, because I don&#8217;t want to even begin to suggest that I will have any answers to any questions you may ask.</p>
<pre>
# .bash_profile
#
# I recursively copied /usr/share/terminfo on a modern
# Linux box to $HOME/.terminfo so that I could have modern
# stuff with me wherever I go, like Solaris 10 which has
# no modern 256 color crap.  Make sure you have the screen-256color
# terminfo stuff.  Then...
export TERMINFO=$HOME/.terminfo
# xterm-256color should work below as well, but since I am
# always connecting from PuTTY, I use this which is technically
# more correct.
export TERM=screen-256color
</pre>
<pre>
# .screenrc portion for GNU screen which MUST BE compiled with 256
# color support.
#
term "screen-256color"
</pre>
<pre>
" .vimrc portion
"
" "People" say this should never be required if your terminfo crap is
" correct, but it is required for me *when running vim under GNU screen*
set t_Co=256

" Enable syntax highlighting
syntax enable

" Use whatever you like here
colorscheme lucius

" Dunno, this seems to be the only thing that leaves my terminal in a
" proper state once I exit vim in 256-color mode under GNU screen
" when using either TERM=putty-256color or TERM=xterm-256color.  Found it
" mentioned in some IRC log after digging through Google results for an
" hour or more.
set t_ti= t_te=
</pre>
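<p>A quick way to check whether the whole chain (PuTTY, screen, terminfo) really passes 256 colors through is to print the palette. This is a small sketch using the xterm-style <code>ESC[48;5;Nm</code> background sequence. The function name is made up for the example. If the output shows 256 distinct colored cells, things are working. If colors repeat or come out as plain text, something in the chain is still limited to fewer colors.</p>

```python
def palette_rows(columns=16):
    """Return the 256-color palette as lines of colored, numbered cells."""
    rows = []
    for start in range(0, 256, columns):
        cells = [
            # Set background to color n, print n, then reset attributes.
            "\033[48;5;%dm%4d\033[0m" % (n, n)
            for n in range(start, start + columns)
        ]
        rows.append("".join(cells))
    return rows


if __name__ == "__main__":
    print("\n".join(palette_rows()))
```

<p>Run it once outside of screen and once inside to see which layer is the one dropping colors.</p>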
]]></content:encoded>
			<wfw:commentRss>http://www.kickflop.net/blog/2012/03/02/256-color-support-under-gnu-screen-including-non-gui-vim/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Wait, what?</title>
		<link>http://www.kickflop.net/blog/2012/02/15/wait-what/</link>
		<comments>http://www.kickflop.net/blog/2012/02/15/wait-what/#comments</comments>
		<pubDate>Wed, 15 Feb 2012 22:19:25 +0000</pubDate>
		<dc:creator>JB</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Sysadmin]]></category>
		<category><![CDATA[UNIX/Linux]]></category>

		<guid isPermaLink="false">http://www.kickflop.net/blog/?p=1444</guid>
		<description><![CDATA[]]></description>
			<content:encoded><![CDATA[<p><script src="https://gist.github.com/1731804.js?file=gistfile1.txt"></script></p>
]]></content:encoded>
			<wfw:commentRss>http://www.kickflop.net/blog/2012/02/15/wait-what/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Django: Taking note of ManyToManyField changes</title>
		<link>http://www.kickflop.net/blog/2012/01/18/django-taking-note-of-manytomanyfield-changes/</link>
		<comments>http://www.kickflop.net/blog/2012/01/18/django-taking-note-of-manytomanyfield-changes/#comments</comments>
		<pubDate>Thu, 19 Jan 2012 02:05:36 +0000</pubDate>
		<dc:creator>JB</dc:creator>
				<category><![CDATA[Django]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.kickflop.net/blog/?p=1417</guid>
		<description><![CDATA[I&#8217;ve gone through a bit of a mess lately, trying to do something I considered very possible, only to ultimately fail to find a working solution. This write-up will hopefully save someone else several hours of effort, asking for help, waiting, etc. Those reading the article, please be sure to also read any comments to [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve gone through a bit of a mess lately, trying to do something I considered very possible, only to ultimately fail to find a working solution.  This write-up will hopefully save someone else several hours of effort, asking for help, waiting, etc.</p>
<p>Those reading the article, please be sure to also read any comments to the article, as others may have additional ideas, solutions, or helpful information.</p>
<p>As you can see from the documentation links in this post, I am working with version 1.3.  I am not aware of any differences in 1.3.1 or 1.4.0 regarding the topic covered in this post.</p>
<h2>Scenario</h2>
<p>You have a <a href="https://docs.djangoproject.com/en/1.3/topics/db/models/">Django model</a>. That model has a field which is a <a href="https://docs.djangoproject.com/en/1.3/ref/models/fields/#manytomanyfield">ManyToManyField</a>. When an instance of your model is modified via any mechanism (custom web form, Django admin interface, code using the ORM, etc), you want to execute some custom code based on what happened to the model instance&#8217;s fields (including the ManyToManyField).</p>
<p>Specifically, what if you need to keep a non-Django source in sync with a Django model&#8217;s data? In my case, I have a &#8220;simple&#8221; need to push changes to a Django model&#8217;s data to an LDAP server.<span id="more-1417"></span></p>
<pre>
from django.db import models


class Interface(models.Model):
    # The FQDN doubles as the primary key
    fqdn = models.CharField(primary_key=True,
                            verbose_name="Fully Qualified Domain Name",
                            max_length=80)


class Netgroup(models.Model):
    name = models.CharField(primary_key=True,
                            max_length=80)

    # A netgroup may contain any number of interfaces, including none
    interfaces = models.ManyToManyField(Interface,
                                        null=True,
                                        blank=True)
</pre>
<p>Did <code>NetgroupInstance.interfaces</code> change? What changed? Was 1 interface removed from 500?  What was that 1 interface? Were 9 interfaces added to the existing 15 interfaces? What were the 9 interfaces?  Was this a brand new Netgroup creation with interfaces added at creation-time?  What were the interfaces?</p>
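<p>Answering those questions is the easy part once you have a &#8220;before&#8221; and an &#8220;after&#8221; list of interface names: it is a plain set difference, as in the sketch below. The helper name and the sample hostnames are hypothetical. The hard part, as the rest of this post shows, is getting hold of both lists at the right moment.</p>

```python
def diff_interfaces(before, after):
    """Return (added, removed) as sorted lists of fqdn strings."""
    before, after = set(before), set(after)
    return sorted(after - before), sorted(before - after)


# Hypothetical netgroup membership before and after an edit.
old = ["a.example.com", "b.example.com"]
new = ["b.example.com", "c.example.com"]
added, removed = diff_interfaces(old, new)
print("added:", added)
print("removed:", removed)
```

<p>A brand new netgroup is just the case where the &#8220;before&#8221; list is empty, so everything shows up as added.</p>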
<h2>Solution 1?</h2>
<p>Just override the <code>Netgroup</code> save() method! Call the superclass&#8217;s save first, then run your custom code against the saved data.</p>
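<p><strong>FAILS:</strong> The ManyToManyField data is not reliable at this point in the code flow: many-to-many relations are written in a separate step after the instance itself has been saved, so <code>save()</code> can run before they are in place. See the first few posts in <a href="http://groups.google.com/group/django-users/browse_thread/thread/8a179538637b2648">my thread here</a>.</p>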
<h2>Solution 2?</h2>
<p>Maybe try Django&#8217;s <a href="https://docs.djangoproject.com/en/1.3/topics/signals/">signals</a>?  Register your custom code as a callback function for the Django <code>post_save</code> signal?</p>
<p><strong>FAILS:</strong> Does not work properly in the &#8216;Admin&#8217; interface at a minimum. Creating a new <code>Netgroup</code> with some <code>Interface</code> relations selected does not show the <code>Interface</code> relations inside the <code>post_save</code> callback function. This is the same kind of failure as in Solution 1. See the <a href="http://groups.google.com/group/django-users/browse_thread/thread/b5c9ceb05ae3d178">final post here</a> (as of 1/18/2012).</p>
<h2>Solution 3?</h2>
<p>Try solution 2 in combination with another signal callback function for the <code><a href="https://docs.djangoproject.com/en/1.3/ref/signals/#m2m-changed">m2m_changed</a></code> Django signal. This was recommended to me in #django-users IRC as the way to do it.</p>
<p><strong>FAILS:</strong> Does not work properly in the &#8216;Admin&#8217; interface. Removing <code>Interface</code> relationships does not present itself as data associated with the <code>pre_remove</code> and/or <code>post_remove</code> &#8220;actions&#8221;. Instead, it appears the &#8216;Admin&#8217; interface clears all interfaces from the field and then adds the correct interfaces back in. See <a href="https://code.djangoproject.com/ticket/16073">bug 16073</a>, <a href="https://code.djangoproject.com/ticket/14482">bug 14482</a>, and this <a href="https://groups.google.com/forum/#!topic/django-developers/27PofpUfR_0">thread</a>.</p>
<h2>Sad Conclusion</h2>
<p>If there weren&#8217;t a bug, we would obviously use solution 3, even though I find it goofy to have to catch an extra signal (<code>m2m_changed</code>) when a model happens to have a ManyToManyField on it.</p>
<p>As it is now, we just use &#8220;Solution 2&#8221; and tell users of the &#8216;Admin&#8217; interface, via extra <code><a href="https://docs.djangoproject.com/en/1.3/ref/forms/fields/#help-text">help_text</a></code> on the <code>Netgroup.name</code> field, to enter the name, save, <em>then</em> add items to the Netgroup instance.</p>
<pre>
from django.db import models
from django.db.models.signals import post_delete, post_save, m2m_changed


class Interface(models.Model):
    fqdn = models.CharField(primary_key=True,
                            verbose_name="Fully Qualified Domain Name",
                            max_length=80)


class Netgroup(models.Model):
    name = models.CharField(primary_key=True,
                            max_length=80,
                            help_text="WARNING: save new netgroups with just the name, then add interfaces and save again.  Sorry!")

    interfaces = models.ManyToManyField(Interface,
                                        null=True,
                                        blank=True)


def netgroup_delete(sender, **kwargs):
    # kwargs["instance"] is the Netgroup that was deleted
    pass # DO STUFF HERE


def netgroup_save(sender, **kwargs):
    # kwargs["instance"] is the Netgroup that was saved;
    # kwargs["created"] is True for a brand new one
    pass # DO STUFF HERE


# dispatch_uid keeps the receivers from being registered twice
post_delete.connect(netgroup_delete, sender=Netgroup,
                    dispatch_uid="netgroup_delete")
post_save.connect(netgroup_save, sender=Netgroup,
                  dispatch_uid="netgroup_save")
</pre>
]]></content:encoded>
			<wfw:commentRss>http://www.kickflop.net/blog/2012/01/18/django-taking-note-of-manytomanyfield-changes/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Thunderbird 8 rewrote my subject?</title>
		<link>http://www.kickflop.net/blog/2011/12/14/thunderbird-rewrote-my-subject/</link>
		<comments>http://www.kickflop.net/blog/2011/12/14/thunderbird-rewrote-my-subject/#comments</comments>
		<pubDate>Wed, 14 Dec 2011 17:41:46 +0000</pubDate>
		<dc:creator>JB</dc:creator>
				<category><![CDATA[Quality Control]]></category>
		<category><![CDATA[Software]]></category>

		<guid isPermaLink="false">http://www.kickflop.net/blog/?p=1403</guid>
		<description><![CDATA[I have no Thunderbird extensions installed. I sent this message (in blue below) with the subject shown. I received it too, as I was a recipient. I then selected it (as you see above) and replied to it (replying to my own message, to add new info). I didn&#8217;t touch the subject line. I sent [...]]]></description>
			<content:encoded><![CDATA[<p>I have no Thunderbird extensions installed.</p>
<p>I sent this message (in blue below) with the subject shown.  I received it too, as I was a recipient.</p>
<p><a rel="lightbox" href="http://www.kickflop.net/blog/wp-content/uploads/2011/12/thunderbird-odd.jpg"><img src="http://www.kickflop.net/blog/wp-content/uploads/2011/12/thunderbird-odd-300x46.jpg" alt="" title="thunderbird-odd" width="300" height="46" class="center size-medium wp-image-1406" /></a></p>
<p>I then selected it (as you see above) and replied to it (replying to my own message, to add new info).  I didn&#8217;t touch the subject line.  I sent the new reply out.  I received it too, as I was a recipient.  Here is how I received it.  Take note of the changed subject.</p>
<p><a rel="lightbox" href="http://www.kickflop.net/blog/wp-content/uploads/2011/12/thunderbird-odd2.jpg"><img src="http://www.kickflop.net/blog/wp-content/uploads/2011/12/thunderbird-odd2-300x46.jpg" alt="" title="thunderbird-odd2" width="300" height="46" class="center size-medium wp-image-1405" /></a></p>
<p><strong>Update</strong>: Testing shows that the subject is being altered by Thunderbird as soon as I click on the original message and hit reply.  The reply window appears with the subject altered.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kickflop.net/blog/2011/12/14/thunderbird-rewrote-my-subject/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Never assume customers will register complaints</title>
		<link>http://www.kickflop.net/blog/2011/11/21/never-assume-customers-will-register-complaints/</link>
		<comments>http://www.kickflop.net/blog/2011/11/21/never-assume-customers-will-register-complaints/#comments</comments>
		<pubDate>Mon, 21 Nov 2011 21:46:55 +0000</pubDate>
		<dc:creator>JB</dc:creator>
				<category><![CDATA[Sysadmin]]></category>

		<guid isPermaLink="false">http://www.kickflop.net/blog/?p=1400</guid>
		<description><![CDATA[We&#8217;re an enterprisey shop. Our department has no public-facing services. Our shop&#8217;s customers are engineer employees at our company, and you could basically consider our department a large lab operations group at a deep-but-slow think tank. This will be a pretty moronic post to anyone involved in providing performant web services for a salary, but [...]]]></description>
			<content:encoded><![CDATA[<p><em>We&#8217;re an enterprisey shop.  Our department has no public-facing services.  Our shop&#8217;s customers are engineer employees at our company, and you could basically consider our department a large lab operations group at a deep-but-slow think tank.  This will be a pretty moronic post to anyone involved in providing performant web services for a salary, but I&#8217;m always readily willing to look moronic to tell a story.</em></p>
<p>As an aside, in an unrelated ticket, a customer mentioned slow page load times for one of the pages served by our GForge Advanced Server instance.  He indicated that this particular page always takes nearly 20 seconds to load and offered that it might be tied to the fact that he is a member of roughly 20 projects hosted via that GForge server.</p>
<p>This aside from the customer was a remarkable coincidence, as I had created an internal task ticket 2-3 weeks ago for us to implement metrics gathering for this service because &#8220;we should&#8221;.  We already were performing the most basic of up/down monitoring for the host and service via Nagios.  Now we get to have a problematic baseline of metrics and watch things improve from here.</p>
<p>After tweaking our Apache logging to log request service time (<code>%D</code> via <a href="http://httpd.apache.org/docs/2.0/mod/mod_log_config.html">mod_log_config</a>), we noticed some (too many) problematic pages for certain users and projects.  One of those pages was, of course, the page the customer had reported.</p>
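<p>For illustration only, here is a rough Python sketch of how one might pull the slow requests out of such a log. It assumes a common-log-format line with <code>%D</code> (service time in microseconds) appended as the last field; the sample lines, URLs, and the 5-second threshold are made up, not taken from our logs.</p>

```python
import re

# Assumed layout: common log format with %D (microseconds) appended last.
LINE_RE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+) (\d+)$'
)

def slow_requests(lines, threshold_seconds=5.0):
    """Return (seconds, request) pairs at or above the threshold, slowest first."""
    slow = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed format
        seconds = int(match.group(6)) / 1e6  # %D is reported in microseconds
        if seconds >= threshold_seconds:
            slow.append((seconds, match.group(3)))
    return sorted(slow, reverse=True)

# Hypothetical sample lines, invented for this sketch.
SAMPLE = [
    '10.0.0.5 - jdoe [21/Nov/2011:16:40:01 -0500] "GET /gf/my/ HTTP/1.1" 200 48211 19873421',
    '10.0.0.9 - - [21/Nov/2011:16:40:03 -0500] "GET /gf/ HTTP/1.1" 200 5120 210334',
]

if __name__ == "__main__":
    for seconds, request in slow_requests(SAMPLE):
        print("%.1fs  %s" % (seconds, request))
```

<p>Lines that do not match the assumed format are skipped rather than treated as errors, so the regular expression needs adjusting for any other <code>LogFormat</code>.</p>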
<p>So far, we&#8217;ve instrumented metric gathering for each block of PHP (&#8230;) code in the most commonly accessed problematic page and tracked down the specific section where the slowness happens.  The metric gathering is simplistic: for each major block of execution in the PHP file, store a start time, store an end time, and calculate total seconds to execute that block.  Finally, <code>syslog()</code> the accumulated metrics as one line.</p>
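<p>Our instrumentation lives in the PHP itself, but the same idea can be sketched in Python: time each named block, accumulate the results, and emit them as a single syslog line. The class name, block names, and line format below are hypothetical, chosen only for this sketch.</p>

```python
import time
from contextlib import contextmanager

class BlockTimer(object):
    """Collect wall-clock seconds for named blocks of one request."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.timings = []  # (name, seconds) in execution order

    @contextmanager
    def block(self, name):
        start = self.clock()  # store a start time
        try:
            yield
        finally:
            # store an end time and record the elapsed seconds
            self.timings.append((name, self.clock() - start))

    def as_line(self, page):
        """Format all accumulated metrics as one log line."""
        parts = ["%s=%.3f" % (name, secs) for name, secs in self.timings]
        total = sum(secs for _, secs in self.timings)
        return "page=%s total=%.3f %s" % (page, total, " ".join(parts))

if __name__ == "__main__":
    import syslog  # Unix only

    timer = BlockTimer()
    with timer.block("load_projects"):
        time.sleep(0.05)  # stand-in for real work
    with timer.block("render"):
        time.sleep(0.01)  # stand-in for real work
    syslog.syslog(syslog.LOG_INFO, timer.as_line("/gf/my/"))
```

<p>Logging one line per request keeps the metrics easy to grep, and the per-block figures show which section accounts for most of the total.</p>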
<p>Anyway, that&#8217;s not the point of this post.  The point is: <strong>Never stand up a service that works overall and assume your users will complain of slowness.  Turns out 10-20 users have been quietly suffering through seriously long page load times for years now without saying a word to us.</strong>  According to other admins in other departments (who responded to our nice-but-essentially &#8220;WTF people?&#8221; message to all customers of the GForge service), they&#8217;ve experienced the same phenomenon.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kickflop.net/blog/2011/11/21/never-assume-customers-will-register-complaints/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>On Pithy Bullshit Analogies</title>
		<link>http://www.kickflop.net/blog/2011/11/14/on-pithy-bullshit-analogies/</link>
		<comments>http://www.kickflop.net/blog/2011/11/14/on-pithy-bullshit-analogies/#comments</comments>
		<pubDate>Mon, 14 Nov 2011 20:05:51 +0000</pubDate>
		<dc:creator>JB</dc:creator>
				<category><![CDATA[Sysadmin]]></category>

		<guid isPermaLink="false">http://www.kickflop.net/blog/?p=1397</guid>
		<description><![CDATA[I&#8217;ve had it with the pithy bullshit analogies, especially via Tweets. Belittling those who productively scrutinize and debate similar software offerings is LAME. I&#8217;m so sick of hearing people dismiss &#8220;Chef vs. Puppet&#8221; discussions (and hundreds of other very valid comparison-style blog posts, etc) as worthless, un-enlightened, and several other holier-than-thou attributions, as if the [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve had it with the pithy bullshit analogies, especially via Tweets.</p>
<p>Belittling those who productively scrutinize and debate similar software offerings is LAME.</p>
<p>I&#8217;m so sick of hearing people dismiss &#8220;Chef vs. Puppet&#8221; discussions (and hundreds of other very valid comparison-style blog posts, etc) as worthless, un-enlightened, and several other holier-than-thou attributions, as if the content is a simplistic bantering about Coke vs. Pepsi.</p>
<p>My camel&#8217;s back broke when John Allspaw (<a href="https://twitter.com/#!/allspaw">@allspaw</a>) recently tweeted:</p>
<blockquote><p>Sometimes I imagine web engineers as carpenters; too busy being angry about each other&#8217;s brand of hammers to actually build the damn house.</p></blockquote>
<p>Greg Fodor (<a href="https://twitter.com/#!/gfodor">@gfodor</a>) replied with:</p>
<blockquote><p>amateurs argue over tools. journeymen argue over techniques. experts argue if we should be building the house in the first place.</p></blockquote>
<p>Both of these statements are nothing more than horrific broccoli farts with the hive-minded in the DevOps world apparently smelling them as roses.  I don&#8217;t know Greg Fodor, but I&#8217;m pretty positive John Allspaw is a lot more intelligent than that bullshit tweet of his, and he should be concerned about saying what he actually means, in detail.  Because, you know, people listen to him.</p>
<p>Decisions are all based on something.  Just because you don&#8217;t need to bother yourself with making the &#8220;small&#8221; decisions anymore doesn&#8217;t give you a new license to be an asshole toward those involved in discussions about those &#8220;small&#8221; decisions.  Those people making the &#8220;small&#8221; decisions now, with debates and careful evaluations surrounding them, MAKE SHIT ULTIMATELY HAPPEN.  I understand that everyone, more and more, just wants stuff to magically happen, but the reality still is that things need to be planned, discussed, scrutinized, evaluated, and finally decided on.  Don&#8217;t get pissy because that part still has to happen and your Idea-to-Realization time can&#8217;t be measured in minutes.</p>
<p>God damn.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kickflop.net/blog/2011/11/14/on-pithy-bullshit-analogies/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Graphite for RHEL 5.7</title>
		<link>http://www.kickflop.net/blog/2011/11/01/graphite-for-rhel-5-7/</link>
		<comments>http://www.kickflop.net/blog/2011/11/01/graphite-for-rhel-5-7/#comments</comments>
		<pubDate>Wed, 02 Nov 2011 01:14:54 +0000</pubDate>
		<dc:creator>JB</dc:creator>
				<category><![CDATA[Sysadmin]]></category>

		<guid isPermaLink="false">http://www.kickflop.net/blog/?p=1392</guid>
		<description><![CDATA[Graphite install for RHEL 5.7 from scratch. Official Graphite documentation is at http://graphite.readthedocs.org/ export PATH=/usr/bin:/bin:/usr/sbin:/sbin mkdir -p /graphite/src cd /graphite/src # Download Python 2.7.x source from python.org # Unpackage cd Python-2.7.2 ./configure --prefix=/graphite make make install cd .. # Now make sure you get your python before RHEL's export PATH=/graphite/bin:$PATH # Download Django 1.x source [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://graphite.wikidot.com/">Graphite</a> install for RHEL 5.7 from scratch.</p>
<p>Official Graphite documentation is at <a href="http://graphite.readthedocs.org/">http://graphite.readthedocs.org/</a></p>
<pre>
export PATH=/usr/bin:/bin:/usr/sbin:/sbin
mkdir -p /graphite/src
cd /graphite/src

# Download Python 2.7.x source from python.org
# Unpackage

cd Python-2.7.2
./configure --prefix=/graphite
make
make install
cd ..

# Now make sure you get your python before RHEL's

export PATH=/graphite/bin:$PATH

# Download Django 1.x source from https://www.djangoproject.com/
# Unpackage

cd Django-1.3
python setup.py install
cd ..

# Download django-tagging from http://code.google.com/p/django-tagging/
# Unpackage

cd django-tagging-0.3.1
python setup.py install
cd ..

# Download Twisted 11.x source from http://twistedmatrix.com/
# Unpackage

cd Twisted-11.0.0
python setup.py install
cd ..

# Download setuptools from http://pypi.python.org/pypi/setuptools
# Unpackage

cd setuptools-0.6c11
python setup.py install
cd ..

# Download pip from http://pypi.python.org/pypi/pip
# Unpackage

cd pip-1.0.2
python setup.py install
cd ..

# Now you have to build the Cairo graphics library, as RHEL's
# is too old for py2cairo's needs.

# Start with pixman, a dependency for building cairo!

# Download pixman from http://www.cairographics.org/releases/
# Unpackage

cd pixman-0.22.2
./configure --prefix=/graphite
make
make install
cd ..

# Download cairo from http://www.cairographics.org/releases/
# Unpackage

cd cairo-1.10.2
# Make sure we find our shit and don't have runtime linker problems
export LDFLAGS="-L/graphite/lib -Xlinker -rpath -Xlinker /graphite/lib"
# PKG_CONFIG_PATH is set below so pixman is found
export PKG_CONFIG_PATH=/graphite/lib/pkgconfig
# Turn off all the X shit and gobject since RHEL glib2 is too old
./configure --prefix=/graphite --without-x --enable-xlib=no \
    --enable-xlib-xrender=no --enable-xcb-shm=no --enable-qt=no \
    --enable-gl=no --enable-gobject=no
make
make install
cd ..

# Download Pycairo (aka py2cairo) from http://www.cairographics.org/pycairo/
# Unpackage

# Had to add these to LDFLAGS or ./waf configure bombs on its Python.h test,
# because the test does not link against these libraries when it should
export LDFLAGS="$LDFLAGS -lm -ldl -lutil"
cd py2cairo-1.10.0
./waf configure --prefix=/graphite
./waf build
./waf install
cd ..

# Download carbon, whisper, and graphite-web from
# https://launchpad.net/graphite
# Unpackage

cd whisper-0.9.9
python setup.py install
cd ..

cd carbon-0.9.9
python setup.py install
cd ..

cd graphite-web-0.9.9
./check-dependencies.py # Ignore WARNINGs as you see fit
python setup.py install
cd ..
</pre>
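<p>As a quick sanity check after the steps above, a small script like the following can confirm that the freshly built Python can import the pieces that were installed. This is only a sketch: the module names in the list are my assumption of what each package exposes, and it should be run with <code>/graphite/bin/python</code> so it tests the right interpreter.</p>

```python
import importlib

# Module names assumed to correspond to the packages installed above.
REQUIRED = ["django", "tagging", "twisted", "cairo", "whisper"]

def check_imports(names):
    """Map each module name to True if it imports, else to the error text."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError as exc:
            results[name] = str(exc)
    return results

if __name__ == "__main__":
    for name, status in sorted(check_imports(REQUIRED).items()):
        if status is True:
            print("%-10s ok" % name)
        else:
            print("%-10s MISSING: %s" % (name, status))
```

<p>A failed <code>cairo</code> import here usually points back at the linker flags used when building cairo and py2cairo.</p>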
]]></content:encoded>
			<wfw:commentRss>http://www.kickflop.net/blog/2011/11/01/graphite-for-rhel-5-7/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

