<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>Will Partain&apos;s work blog</title>
    <link rel="alternate" type="text/html" href="http://blogs.verilab.com/partain/" />
    <link rel="self" type="application/atom+xml" href="http://blogs.verilab.com/partain/atom.xml" />
    <id>tag:blogs.verilab.com,2009-01-06:/partain//2</id>
    <updated>2010-05-17T09:15:25Z</updated>
    <subtitle>IT chat from Verilab</subtitle>
    <generator uri="http://www.sixapart.com/movabletype/">Movable Type Pro 4.21-en</generator>

<entry>
    <title>Wish: shared, fast, synchronized, managed, cloudy folders</title>
    <link rel="alternate" type="text/html" href="http://blogs.verilab.com/partain/2010/05/wish-shared-fast-synchronized-managed-folders.html" />
    <id>tag:blogs.verilab.com,2010:/partain//2.100</id>

    <published>2010-05-17T00:23:45Z</published>
    <updated>2010-05-17T09:15:25Z</updated>

    <summary>I can&apos;t believe how hard the following problem is turning out to be. We are a small company with people hither and yon (for various reasons) who need to see/work-on the same &quot;documents&quot;. In a normal Windows Explorer kinda way...</summary>
    <author>
        <name>Will Partain</name>
        
    </author>
    
    
    <content type="html" xml:lang="en-US" xml:base="http://blogs.verilab.com/partain/">
        <![CDATA[<p>I <em>can't believe</em> how hard the following problem is turning out
to be.</p>

<p>We are a small company with people hither and yon (for various
reasons) who need to see/work-on the same "documents".  In a
normal Windows Explorer kinda way (not some
over-there-in-the-browser kinda way) and with (near-)LOCAL DISK
speed.  (I do not want to hear the word 'WebDAV' ever again.)</p>

<p>Yes, they always need to see the "latest" version at all times.
(No, they do not want to click an 'Update' button.)</p>

<p>They need to be told "Whoa!  Two of you are trying to write that
thing at once!"</p>

<p>Different (changing) groups of people need to see different
groups of documents.  Only finance people can see finance
documents, etc.</p>

<p>An "admin" needs to be able to log in, get some oversight of
things, and change stuff, e.g. shut off the person who just left
the company.</p>

<p>Windows, Macs and Linux, please.  (OK, OK... we're flexible.)</p>

<p>I would have thought the above would be the <em>FIRST THING</em>
any "cloud storage" wannabe might turn his hand to.</p>

<p>Obvious "solutions": NONE (that I've found).  More as I find it.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>April Fool FAIL</title>
    <link rel="alternate" type="text/html" href="http://blogs.verilab.com/partain/2010/04/april-fool-fail.html" />
    <id>tag:blogs.verilab.com,2010:/partain//2.99</id>

    <published>2010-04-05T00:23:45Z</published>
    <updated>2010-04-05T09:41:31Z</updated>

    <summary>I attempted an April Fools&apos; Day gag at work, and FAILED. The failure is IT-instructive. Now, I don&apos;t like April 1 jokes that are apparent just from the title; but I erred too far on the side of obscurity. A...</summary>
    <author>
        <name>Will Partain</name>
        
    </author>
    
    
    <content type="html" xml:lang="en-US" xml:base="http://blogs.verilab.com/partain/">
        <![CDATA[<p>I attempted an April Fools' Day gag at work, and FAILED.  The failure
is IT-instructive.</p>

<p>Now, I don't like April 1 jokes that are apparent just from the
title; but I erred too far on the side of obscurity.</p>

<p>A few weeks ago, we had email chat at work about some of the
weirdness you see on your router if Skype is running.  I mused
that I would prefer people <em>who do not use Skype</em> to
turn it off.  Each time I was reminded that Skype was a major
communication tool within Verilab.</p>

<p>So last Thursday I "announced" that a Skype ban would be rolled
out starting with people "from the middle of the alphabet".
People who don't even work in that office were "banned".  To summarize:</p>

<ul>
<li><p>The "ban" was high-handed and made entirely without consultation.</p></li>
<li><p>No alternative was proposed, making the "ban" extremely business-hostile.</p></li>
<li><p>The "analysis" on which the "ban" was based -- we occasionally
see bursts of UDP packets for a minute or two -- was,
<em>errmm...</em> "limited" to say the least.</p></li>
<li><p>"Starting in the middle of the alphabet" -- say what?</p></li>
<li><p>Summary: COMPLETELY INSANE.</p></li>
</ul>

<p>What's interesting is that a COMPLETELY INSANE proposal from the
IT guy is taken as quite normal.  I was asked polite questions,
and people seemed to be trying to adapt.  I don't know whether my
colleagues intended to ignore me, work around me, or what.  (Of
course, it may be a double-bluff and the joke's on me.)</p>

<p>Perhaps we see so much COMPLETELY INSANE stuff in all the
organizations we bump up against in everyday life that it seems
normal for the IT guy to trash the company's communication structure.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Windows Tedious Transfer (thanks, Microsoft)</title>
    <link rel="alternate" type="text/html" href="http://blogs.verilab.com/partain/2010/04/windows-tedious-transfer-thanks-microsoft.html" />
    <id>tag:blogs.verilab.com,2010:/partain//2.98</id>

    <published>2010-04-02T00:23:45Z</published>
    <updated>2010-04-05T09:41:23Z</updated>

    <summary>I am in the army of Linux-only people who &quot;deal with&quot; friends&apos; Windows PCs, thus making Microsoft look better than they deserve and Bill Gates richer thanks to no effort on his part. Microsoft&apos;s apparent attitude to this sullen army...</summary>
    <author>
        <name>Will Partain</name>
        
    </author>
    
    
    <content type="html" xml:lang="en-US" xml:base="http://blogs.verilab.com/partain/">
        <![CDATA[<p>I am in the army of Linux-only people who "deal with" friends'
Windows PCs, thus making Microsoft look better than they deserve
and Bill Gates richer thanks to no effort on his part.</p>

<p>Microsoft's apparent attitude to this sullen army of volunteer
helpers: No good deed goes unpunished.</p>

<p>Case in point.  Some friends' PC died; they got a new one; the
disk out of the old one was fine and readily sprang to life in a
USB external enclosure.  They'd like to move their old stuff over
to the new machine.</p>

<p>That would be a few minutes' easy work under Linux (but this is
Windows).</p>

<p>Ah!  But what's this?!  "Windows Easy Transfer" [WET] (under Windows 7)
-- "Helps you transfer personal files, e-mail, data, files,
media, and settings from your old computer to the new one."
<em>Fantastic</em>!  And it can read from an external USB drive -- glory be!</p>

<p>Sadly, you have to have "prepared" for the "Easy Transfer" by
running WET on the old machine first -- presumably before the
puff of smoke and the acrid smell in the air.</p>

<p><em>SIGH</em>.  And does WET have any sort of second-best fallback
option? e.g. "We can't transfer your user accounts, but we can
at least move some documents over for you?"  (You know, the
equivalent of a one-line shell script.)</p>

<p>No, of course not.  (What, exactly, do all of those "programmers"
in Redmond do all day every day?)</p>

<p>So I'm doing Windows Tedious Transfer [WTT].  Copying
files/folders around, deleting Obviously Useless things
(e.g. cookie files), and so on.  I.e. doing a third-rate job
taking eight times as long.</p>

<p>No good deed goes unpunished.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>LVM within LVM?</title>
    <link rel="alternate" type="text/html" href="http://blogs.verilab.com/partain/2010/03/partitioning-for-virtualization-raid-and-lvm-1.html" />
    <id>tag:blogs.verilab.com,2010:/partain//2.97</id>

    <published>2010-03-22T00:23:45Z</published>
    <updated>2010-03-22T15:23:14Z</updated>

    <summary>I wrote about how to arrange disks on a host that needs to support RH-style virtual machines (VMs) in a flexible way. And what should you do on the guest (the VM itself)? Well, following the cited scheme, the VM...</summary>
    <author>
        <name>Will Partain</name>
        
    </author>
    
    
    <content type="html" xml:lang="en-US" xml:base="http://blogs.verilab.com/partain/">
        <![CDATA[<p>I wrote about <a href="http://blogs.verilab.com/partain/2010/03/partitioning-for-virtualization-raid-and-lvm.html">how to arrange disks</a>
on a host that needs to support RH-style virtual machines (VMs)
in a flexible way.</p>

<p>And what should you do on the guest (the VM itself)?</p>

<p>Well, following the cited scheme, the VM will be handed an
LVM "logical volume" to use.</p>

<p>The VM will treat those bits <em>as a disk</em> (as it would if you
created a big file and handed that over, instead).  Your only
real choice is to <em>partition</em> that virtual disk, and use the
resulting partitions in the usual ways.</p>

<p>If you know exactly how to use the VM's disk space, and "it will
never change", then create fixed partition(s), slap filesystem(s)
on them, and you're done.</p>

<p>If, however, we again assume a need for "flexibility", then LVM
is <em>again</em> a good idea -- yes, LVM(guest)-within-LVM(host).</p>

<p>I have been told (but have not tried it myself) that LVM on the host can
"see" the inner-LVM stuff (the guest's) and do things with it,
e.g. snapshot it.</p>

<p>(Apparently important: the volume group(s) on the guest must have
different names from the volume group(s) on the host.)</p>

<p>But the main virtue of LVM-within-LVM is that it's relatively
simple to increase the "disk space" on the guest...</p>

<p>Stop the VM: <code>virsh shutdown guest</code></p>

<p>On the host, extend the logical volume that is the guest's "disk":</p>

<pre><code>lvextend -L20G /dev/vg0/guest-disk
</code></pre>

<p>Restart the VM: <code>virsh start guest</code></p>

<p>The guest's "disk" just got magically bigger.  Use <code>fdisk</code> (or
equiv) to make a new "partition" out of the extra space, giving
it the LVM type (<code>8e</code>).</p>

<p>Make the new "partition" an LVM "physical volume" (PV):</p>

<pre><code>pvcreate /dev/xvda3
</code></pre>

<p>Add the PV to the VM's "volume group" (VG):</p>

<pre><code>vgextend guestvg0 /dev/xvda3
</code></pre>

<p>Now you can make one of the "inner" (guest) "logical volumes" (LVs) bigger:</p>

<pre><code>lvextend /dev/guestvg0/lvroot ...
</code></pre>

<p>Finally, the filesystem sitting on that LV will need resizing
(<code>resize2fs</code> is the tool for ext2/3/4):</p>

<pre><code>resize2fs /dev/guestvg0/lvroot
</code></pre>
]]>
        

    </content>
</entry>

<entry>
    <title>Partitioning for virtualization: RAID and LVM</title>
    <link rel="alternate" type="text/html" href="http://blogs.verilab.com/partain/2010/03/partitioning-for-virtualization-raid-and-lvm.html" />
    <id>tag:blogs.verilab.com,2010:/partain//2.96</id>

    <published>2010-03-16T00:23:45Z</published>
    <updated>2010-03-16T12:04:54Z</updated>

    <summary>If you&apos;re going to use a Linux box for Red Hat-style virtualization (Xen/KVM), how should you use/partition your disk(s)? Here&apos;s one answer. First: disks are cheap enough that I always use two, and software-RAID-1 them together (mirroring). Plan to do...</summary>
    <author>
        <name>Will Partain</name>
        
    </author>
    
    
    <content type="html" xml:lang="en-US" xml:base="http://blogs.verilab.com/partain/">
        <![CDATA[<p>If you're going to use a Linux box for Red Hat-style
virtualization (Xen/KVM), how should you use/partition your disk(s)?
Here's one answer.</p>

<p>First: disks are cheap enough that I always use two, and
software-RAID-1 them together (mirroring).  Plan to do that.</p>

<p>Second: Though I'm no fan of LVM (the Logical Volume Manager) --
extra complication for no gain -- it's a <em>good choice</em> for
"host for a bunch of VMs (virtual machines)".  I'm assuming you
don't know exactly which VMs of what size, so the flexibility of
LVM is actually useful.</p>

<p>Also, as disks get larger, it's harder to predict in advance how
you'll end up using all that space.  So, again, the flexibility
of LVM may pay off.</p>

<p>Now, I <em>don't</em> use LVM for the basic filesystems (<code>/boot</code> and <code>/</code> [root]).
Their sizes are predictable and, when It Goes Horribly Wrong (TM),
you'll be really glad not to have LVM in the way.  So I partition
my raw disks thusly:</p>

<pre><code>Device      Size  Used for
/dev/md0    300M  /boot
/dev/md1     16G  /
swap         ???  swap
/dev/md2    rest  LVM (the rest of the disk(s))
</code></pre>

<p>That <code>/dev/md2</code> goes straight into LVM.  Make a PV (Physical Volume):</p>

<pre><code>pvcreate /dev/md2
</code></pre>

<p>And create a Volume Group (let's call it <code>vg0</code>) with it:</p>

<pre><code>vgcreate -s 32M vg0 /dev/md2
</code></pre>

<p>(That <code>-s 32M</code>, the physical extent size, is not a particularly
well-informed choice.)</p>

<p>Now you can make LVs (Logical Volumes) to your heart's content...</p>

<pre><code>lvcreate -n vm001-disk --size 10G vg0
</code></pre>

<p>... and feed them to your VM-creating software.  (Or make things
for your host machine, e.g. a varying-sized <code>/var</code>.)</p>

<p>A word on LVM-on-RAID... LVM has a mirroring facility of its own,
so why use software RAID1?</p>

<p>First: I'm already paying the software-RAID tax with my non-LVM
partitions (<code>/boot</code> and <code>/</code>).</p>

<p>Second: For me, software RAID is familiar and has been reliable.</p>

<p>Third: Software RAID-1 is less scary in a crisis.  I like things
like: You can take a disk that is half of a RAID-1 mirror, throw
it in an enclosure, and mount it as a filesystem, no problem.</p>

<p>Fourth: The LVM mirroring guff seemed more complicated and/or
easier to get wrong.  (Just unfamiliarity, maybe.)</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Slaying the /bin/rm dragon? (/usr/bin/ionice)</title>
    <link rel="alternate" type="text/html" href="http://blogs.verilab.com/partain/2010/02/slaying-the-binrm-dragon-usrbinionice.html" />
    <id>tag:blogs.verilab.com,2010:/partain//2.95</id>

    <published>2010-02-02T01:23:45Z</published>
    <updated>2010-02-02T10:34:11Z</updated>

    <summary>Further to my previous battles with /bin/rm, something new-to-me that might help: ionice -- &quot;get/set program I/O scheduling class and priority&quot;. (Linux-only, I think.) Running ionice -c3 /bin/rm ... should mean that rm &quot;will only get disk time when no...</summary>
    <author>
        <name>Will Partain</name>
        
    </author>
    
    
    <content type="html" xml:lang="en-US" xml:base="http://blogs.verilab.com/partain/">
        <![CDATA[<p>Further to <a href="http://blogs.verilab.com/partain/2010/01/my-toughest-linux-foe-yet-binrm.html">my previous battles with /bin/rm</a>,
something new-to-me that might help: <code>ionice</code> -- "get/set program
I/O scheduling class and priority".  (Linux-only, I think.)</p>

<p>Running <code>ionice -c3 /bin/rm ...</code> should mean that <code>rm</code> "will only
get disk time when no other program has asked for disk I/O..."
(man page).  That sounds about right.</p>

<p>Of course, <code>ionice</code> will not stop seek storms, nor will it leave
the disk heads in a place convenient for other programs that are
trying to do Real Work (TM).</p>

<p>NB: I have no empirical evidence that <code>ionice</code> actually helps
in cases where <code>/bin/rm</code> renders a machine otherwise useless.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>My toughest Linux foe yet (/bin/rm)</title>
    <link rel="alternate" type="text/html" href="http://blogs.verilab.com/partain/2010/01/my-toughest-linux-foe-yet-binrm.html" />
    <id>tag:blogs.verilab.com,2010:/partain//2.94</id>

    <published>2010-01-21T01:23:45Z</published>
    <updated>2010-01-21T19:27:34Z</updated>

    <summary>Who woulda thunk it? I have a new-in-the-last-year Linux box, large-ish disk (750GB), mostly one big ext3 filesystem, and -- OK, shoot me... -- a bunch of many-rsync&apos;d-hard-links directories therein. I made the disastrous choice to remove some of those...</summary>
    <author>
        <name>Will Partain</name>
        
    </author>
    
    
    <content type="html" xml:lang="en-US" xml:base="http://blogs.verilab.com/partain/">
        <![CDATA[<p>Who woulda thunk it?  I have a new-in-the-last-year Linux box,
large-ish disk (750GB), mostly one big <code>ext3</code> filesystem, and --
OK, shoot me... -- a bunch of many-rsync'd-hard-links directories
therein.</p>

<p>I made the disastrous choice to remove some of those directories,
you know... <code>/bin/rm -rf *.OLD</code>.</p>

<p>Two weeks later, the job isn't finished.</p>

<p>The job probably would be finished if I had exclusive use of the
machine.  But the other users get stroppy when I render the
machine useless to them with my exotic Power Tool, <code>/bin/rm</code>.
The load average runs to 5, the CPU romps up to 98+% IO waiting,
and the other users start calling my number.  Not in a nice way.</p>

<p>I would have thought that removing files-and-folders was about
the most settled/debugged/worked-out part of Linux.  How naive I
obviously am.</p>

<p>My situation, or one very like it, was discussed on the Linux
kernel list: <a href="http://fixunix.com/kernel/347494-very-poor-ext3-write-performance-big-filesystems.html">"Very poor ext3 write performance on big filesystems?"</a>
(Probably elsewhere, too.)  You can read it for yourself.</p>

<p>The first thing I did was make a script <code>rm-and-sleep</code>; not
rocket science:</p>

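<pre><code>#!/bin/sh
# Remove the files named on the command line, then pause so that
# other users get a turn at the disk.
/bin/rm ${1+"$@"}
sleep 5
</code></pre>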

<p>I can then run that with something like...</p>

<pre><code>find . -type f -print | xargs -n 50 rm-and-sleep
</code></pre>

<p>... and be sure I am not eating all of the system all of the
time.  (Just seeing the script makes me want to weep... decades
of computing progress, and we're doing <em>this</em>?)</p>

<p>One thing suggested in the discussion cited above is to remove
things in inode order.  I ended up using <code>cut</code> rather than
<code>awk</code> in <a href="http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg268617.html">Chris Mason's script</a>,
giving:</p>

<pre><code>find . \( \! -type d \) -printf "%i %p\n" \
| sort -n \
| cut -d ' ' -f 1 --complement \
| xargs -n 500 rm-and-sleep
</code></pre>

<p>Guff to deal with spaces-in-filenames and <code>nice</code> and so forth --
omitted for brevity.</p>
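
<p>One possible shape for that omitted guff, as a sketch only.  It
assumes GNU <code>find</code>, <code>sort</code>, <code>cut</code> and <code>xargs</code> (recent enough to
have the <code>-z</code>/<code>-0</code> NUL-delimiter options), and it inlines the
sleep rather than calling <code>rm-and-sleep</code>.  The function name and
its batch/pause arguments are made up for illustration.</p>

<pre><code>#!/bin/sh
# Sketch: remove every non-directory under a directory, in inode order.
# Names are passed NUL-delimited, so spaces and other odd characters
# in filenames survive.  Assumes GNU find/sort/cut/xargs.
# Usage: rm_in_inode_order DIR [BATCH-SIZE] [PAUSE-SECONDS]
rm_in_inode_order() {
    dir=$1
    batch=${2:-500}   # files handed to each rm
    pause=${3:-5}     # seconds to sleep after each rm
    find "$dir" \( \! -type d \) -printf '%i %p\0' \
    | sort -z -n \
    | cut -z -d ' ' -f 2- \
    | xargs -0 -r -n "$batch" sh -c 'nice /bin/rm -- "$@"; sleep "$0"' "$pause"
}
</code></pre>

<p>Something like <code>rm_in_inode_order ./foo.OLD</code> would then chew
through one directory tree, leaving the (now empty) directories
behind for a later <code>rmdir</code> pass.</p>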

<p>Ugh.  Ugh.  Ugh.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>My time in an MTU black hole (?)</title>
    <link rel="alternate" type="text/html" href="http://blogs.verilab.com/partain/2009/11/my-time-in-an-mtu-black-hole.html" />
    <id>tag:blogs.verilab.com,2009:/partain//2.93</id>

    <published>2009-11-28T01:23:45Z</published>
    <updated>2009-11-28T17:13:44Z</updated>

    <summary>Weirdest network symptom for me in a long time... An ssh -v to one of our remote hosts would hang every time with: debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY The only hint in the apparently-most-relevant online discussion is that it might have something...</summary>
    <author>
        <name>Will Partain</name>
        
    </author>
    
    
    <content type="html" xml:lang="en-US" xml:base="http://blogs.verilab.com/partain/">
        <![CDATA[<p>Weirdest network symptom for me in a long time... An <code>ssh -v</code>
to one of our remote hosts would hang every time with:</p>

<pre><code>debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
</code></pre>

<p>The only hint in the <a href="http://old.nabble.com/SSH-connections-stall-expecting-SSH2_MSG_KEX_DH_GEX_REPLY-td16603314.html">apparently-most-relevant online discussion</a>
is that it might have something to do with some network
interface's MTU (Maximum Transmission Unit).</p>

<p>Lacking any better ideas, we changed the MTU on the SSH server
(with <code>ifconfig eth0 mtu 512</code>) -- note: a value <em>wildly</em> lower
than what should be required -- and it worked!  SSH connections
could happen.</p>

<p>Except: only just barely.  The first time the remote end tried to
send any amount of data -- e.g. a VNC screen refresh -- the
connection would wedge, and that would be that.  (I cannot
explain this data point: I would expect the very low MTU setting
to ensure that <em>no</em> big packets ever got out, and so no wedging
could happen on that account.)</p>

<p>On the weak assumption that our problem was something to do with
MTUs, I proceeded to learn a fair bit more than I wanted to know.</p>

<p>Every network link has an MTU, and there are on-the-order-of a
dozen links between Here and There in any network interaction.
An MTU of 1500 is standard for Ethernet (modulo Jumbo Frames),
but you get MTUs in the high 1400s for various network
technologies (1500 minus "some bytes for our special stuff").
Needing an MTU under 1000 means "something is not right".</p>

<p>What happens if you send a packet that is too big to make the
next network hop?  In theory, the network device is allowed to
"fragment" the packet, and send on the pieces.  I believe this
sort of malarkey is frowned on, and many (most? (all?)) TCP
packets are sent with the DF (Don't Fragment) flag set.</p>

<p>So what happens with a DF-flagged too-big packet?  Well, the
device at the edge of the too-small-MTU chasm is supposed to send
back an ICMP "Fragmentation Needed" message telling of the problem,
so that the originator can re-send in smaller bundles.</p>

<p>A whole bunch of things can go wrong.  The device might simply
fail to send the required ICMP packet.  Any of the network
devices on the way back to the originator might choose to drop
the ICMP packet ("Security, you know.").  The host firewall -- or
the one on the last-hop consumer router -- stands a high chance
of Doing The Wrong Thing.</p>

<p>Net effect: Big packets go, but then just drop off the end of the
world.  Re-sending doesn't help because, odds are, they will
follow the same route and drop no less ignominiously.  Eventually
the connection will time out.</p>

<p>Your packets are said to have fallen into an <em>MTU black hole</em>.</p>

<p>Along the way, I learned that you can probe network MTUs by sending
various-sized ping packets.  (As in most networking problems,
<code>ping</code> and <code>traceroute</code> are your friends.)  In Linux-speak, you
type...</p>

<pre><code>ping -M do -c 2 -s &lt;n&gt; remote.example.com
</code></pre>

<p>...varying <code>&lt;n&gt;</code>.  So, for instance, if</p>

<pre><code>ping -M do -c 2 -s 1465 remote.example.com
</code></pre>

<p>balks with <code>Frag needed and DF set</code>, but</p>

<pre><code>ping -M do -c 2 -s 1464 remote.example.com
</code></pre>

<p>does not, then 1464+28=1492 is the smallest MTU between here and there.
(<code>ping -s</code> counts payload only; the 20-byte IPv4 header and the
8-byte ICMP header are the 28 bytes you need to add back in.)</p>
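
<p>The trial and error can be automated with a binary search.  What
follows is a rough sketch, assuming the Linux (iputils) <code>ping</code>
and plain IPv4; the function name and the <code>PING</code> override
(handy for dry runs) are inventions for this example.</p>

<pre><code>#!/bin/sh
# Sketch: binary-search the largest ping payload that gets through
# with DF set, then add back the 28 bytes of IPv4 + ICMP header.
# Usage: probe_path_mtu HOST [LOW-PAYLOAD] [HIGH-PAYLOAD]
# Assumes the low payload always gets through; if the host cannot be
# reached at all, the answer printed is meaningless.
probe_path_mtu() {
    host=$1
    lo=${2:-0}
    hi=${3:-1472}     # 1500 - 28: the most plain Ethernet can carry
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi + 1) / 2 ))
        # A timeout (black hole) counts as a failure, the same as an
        # explicit "Frag needed and DF set" complaint.
        if out=$(${PING:-ping} -M do -c 1 -W 2 -s "$mid" "$host"); then
            lo=$mid
        else
            hi=$(( mid - 1 ))
        fi
    done
    echo $(( lo + 28 ))
}
</code></pre>

<p>Run as <code>probe_path_mtu remote.example.com</code>, it needs about
a dozen pings to settle on an answer.</p>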

<p>Back to our own sad case.  Besides trying our fabulously-low MTU
of 512, we did all the other typical things, namely power-cycling
all the networking doodahs around the office.  In desperation, we
even did a Linux reboot (a mistake).  Nothing helped.</p>

<p>But when we woke up next morning and tried our stuff, everything
was fine.  Completely baffling.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Talking to a remote router&apos;s web server</title>
    <link rel="alternate" type="text/html" href="http://blogs.verilab.com/partain/2009/11/talking-to-a-remote-routers-web-server.html" />
    <id>tag:blogs.verilab.com,2009:/partain//2.92</id>

    <published>2009-11-27T01:23:45Z</published>
    <updated>2009-11-27T11:17:15Z</updated>

    <summary>Got a small router in a faraway office that has a web interface? How to get at it (without turning on the &quot;allow access from public Internet&quot; thing)? Let&apos;s say it&apos;s 192.168.0.1 in the faraway office. So: SSH in with...</summary>
    <author>
        <name>Will Partain</name>
        
    </author>
    
    
    <content type="html" xml:lang="en-US" xml:base="http://blogs.verilab.com/partain/">
        <![CDATA[<p>Got a small router in a faraway office that has a web interface?
How to get at it (without turning on the "allow access from
public Internet" thing)?</p>

<p>Let's say it's 192.168.0.1 in the faraway office.  So: SSH in
with tunnel...</p>

<pre><code>ssh -L10080:192.168.0.1:80 me@faraway.example.com
</code></pre>

<p>Now use your own web browser to visit <code>http://localhost:10080/</code>;
it should be your faraway router's web interface.</p>

<p>Just another use of Fabulous SSH Tunneling...</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Removing duplicate files: what I want</title>
    <link rel="alternate" type="text/html" href="http://blogs.verilab.com/partain/2009/10/removing-duplicate-files-what-i-want.html" />
    <id>tag:blogs.verilab.com,2009:/partain//2.91</id>

    <published>2009-10-26T00:23:45Z</published>
    <updated>2009-10-26T11:03:59Z</updated>

    <summary>There are a few Unix programs that will find duplicate files; none is great enough to merit any link love. (A similar tool I use quite a bit is hardlink, which doesn&apos;t tell you about duplicate files; it just hard...</summary>
    <author>
        <name>Will Partain</name>
        
    </author>
    
    
    <content type="html" xml:lang="en-US" xml:base="http://blogs.verilab.com/partain/">
        <![CDATA[<p>There are a few Unix programs that will find duplicate files;
none is great enough to merit any link love.</p>

<p>(A similar tool I use quite a bit is <code>hardlink</code>, which doesn't
tell you about duplicate files; it just hard links them together.
Which can be good.)</p>

<p>What I've really wanted, though, is a tool to which I could say
"See that pile #1 of directories over there?  And that pile #2
over there?  Tell me what files are in pile #1 that aren't in
pile #2."  Or vice versa.  Or maybe "the files that are in both
piles".</p>

<p>What I <em>don't</em> want to know is that there are four copies of a
file in piles #1 and #2 -- because they might all four be in pile #1.</p>

<p>For the special case of "files arranged identically in various
directories" (e.g. if those directories are backup copies),
here's something that comes close.</p>

<p>Over pile #1 of directories, run...</p>

<pre><code>find &lt;dirs&gt; -type f -print | xargs sha1sum &gt; my-file-list
</code></pre>

<p>The output, <code>my-file-list</code>, might need massaging.</p>

<p>Now go over to pile #2 of directories and run...</p>

<pre><code>sha1sum -c my-file-list
</code></pre>

<p>The goal is: like-named identical files should be flagged up as
'OK', and the others as something else.</p>

<p>Let's say you want to delete those like-named identical files.
It would then be something like (<em>WARNING: UNTESTED</em>)...</p>

<pre><code>sha1sum -c my-file-list | grep ': OK$' \
| sed -e 's/: OK$//'    | xargs -r /bin/rm
</code></pre>

<p>(I always grow such scripts
<a href="http://blogs.verilab.com/partain/2009/01/incremental-working-as-a-unix-fundamental.html">incrementally</a>,
and by hand.  You'll have to
anyway if you've got filenames-with-spaces, or other fun.)</p>
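
<p>As for the more general wish (files in pile #1 whose contents
appear nowhere in pile #2, whatever they are called), here is one
rough sketch.  It assumes <code>sha1sum</code> in its default output
format and filenames free of newlines and backslashes; the function
name is made up for the example.</p>

<pre><code>#!/bin/sh
# Sketch: print the files under pile #1 whose contents (by SHA-1)
# do not occur anywhere under pile #2.  Names are ignored; spaces in
# filenames are fine, newlines and backslashes are not handled.
# Usage: only_in_first PILE1-DIR PILE2-DIR
only_in_first() {
    pile1=$1
    pile2=$2
    {
        # Tag each checksum line with the pile it came from; pile #2
        # goes first so awk has seen all of it before pile #1 arrives.
        find "$pile2" -type f -exec sha1sum {} + | sed 's/^/2 /'
        find "$pile1" -type f -exec sha1sum {} + | sed 's/^/1 /'
    } | awk '
        $1 == "2"     { seen[$2] = 1; next }
        !($2 in seen) { sub(/^1 [^ ]+ +/, ""); print }
    '
}
</code></pre>

<p>Swapping the two arguments gives the "vice versa" list; dropping
the <code>!</code> in the <code>awk</code> script gives "the files that are in both
piles" (as seen from pile #1).</p>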
]]>
        

    </content>
</entry>

<entry>
    <title>Grid Engine: not so much</title>
    <link rel="alternate" type="text/html" href="http://blogs.verilab.com/partain/2009/10/grid-engine-not-so-much.html" />
    <id>tag:blogs.verilab.com,2009:/partain//2.90</id>

    <published>2009-10-19T00:23:45Z</published>
    <updated>2009-10-19T11:05:23Z</updated>

    <summary>We decided a little while ago to break into the big bad world of Sun&apos;s Grid Engine (under Fedora Linux). We decided to start with a magnificent &quot;grid&quot; of... cough ONE NODE. yum install the packages, ./install_qmaster, ./install_execd and --...</summary>
    <author>
        <name>Will Partain</name>
        
    </author>
    
    
    <content type="html" xml:lang="en-US" xml:base="http://blogs.verilab.com/partain/">
        <![CDATA[<p>We decided a little while ago to break into the big bad world of
Sun's Grid Engine (under Fedora Linux).</p>

<p>We decided to start with a magnificent "grid" of... <em>cough</em> ONE
NODE.  <code>yum install</code> the packages, <code>./install_qmaster</code>,
<code>./install_execd</code> and -- bingo! -- major "grid" action!</p>

<p>Would that it were so simple.  Any question in plain English --
"How do I get it to run four jobs at once?" or "How do I change
the load threshold at which new jobs start to 1.1?" -- will be
met by an avalanche of Grid Engine Speak ("... shows values for
those attributes that are defined as per queue instance slot
limits or as fixed resource attributes").</p>

<p>Our big mistake, however, was to expand our "grid" to... <em>cough</em>
TWO NODES.</p>

<p>Much the same drill -- <code>yum install</code>, <code>./install_execd</code> -- and it
seemed to work, except when it didn't.  The most baffling message
-- on the original node that worked perfectly by itself -- is:</p>

<pre><code>failed on host foo.verilab.com general before job because:
10/16/2009 19:05:35 [0:31072]: setuid(547) failed
</code></pre>

<p>Don't even know where to look.  Finally decided to ask on the
"gridengine-users" mailing list.</p>

<p>In Ye Olden Dayes, you'd just send email to the list, and maybe
be held in the moderator queue the first time.</p>

<p>Now, I've registered for the-deities-know-what with Sun, and
signed up as a project "observer", all so I can ask my
question.  Which was still held for "Pending Approval", as was
my follow-up.</p>

<p>So, for a week's effort, I've been asked "Are you running it as
root?"  (Er... I thought the point of setuid was for
lower-privileged users to do higher-privileged things... but
obviously I haven't a clue.)</p>

<p>Don't even know where to look.</p>

<p>Potential Grid Engine users: you may find <code>make -j 4</code> is easier.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Avoid the Puppet SVN pre-commit script</title>
    <link rel="alternate" type="text/html" href="http://blogs.verilab.com/partain/2009/10/avoid-the-puppet-svn-pre-commit-script.html" />
    <id>tag:blogs.verilab.com,2009:/partain//2.89</id>

    <published>2009-10-07T00:23:45Z</published>
    <updated>2009-10-07T11:18:52Z</updated>

    <summary>There are common-ancestry variants of a Subversion pre-commit script for Puppet floating about, to check for syntax errors in your .pp files before committal. The shape of this script is roughly... svnlook ... | while ; do ...check... ; done...</summary>
    <author>
        <name>Will Partain</name>
        
    </author>
    
    
    <content type="html" xml:lang="en-US" xml:base="http://blogs.verilab.com/partain/">
        <![CDATA[<p>There are common-ancestry variants of a Subversion <code>pre-commit</code>
script for <a href="http://reductivelabs.com/trac/puppet">Puppet</a>
floating about, to check for syntax errors in your <code>.pp</code> files
before committal.</p>

<p>The shape of this script is roughly...</p>

<pre><code>svnlook ... | while ; do ...check... ; done
</code></pre>

<p>The <code>svnlook</code>-ish bit produces filenames to look at, and the
"check" inside the <code>while</code> runs <code>puppet --parseonly</code>, which
grumbles and gives a non-zero exit code if there's a syntax
error.</p>

<p>The script-as-a-whole needs to give a non-zero exit status if
there is a problem, i.e. "don't commit".</p>

<p>The script is broken because it's hard to get information from
inside the <code>while</code> loop so that a script-as-a-whole decision can
be made.</p>

<p>Why?  Because in most shells (bash and dash included) the parts of
a Unix pipeline are run as <em>sub-shells</em>, meaning the <code>while</code>
loop is all in a sub-shell.  Any variable-setting and/or <code>exit</code>ing
in the loop will only affect the sub-shell; the script-as-a-whole
carries on regardless, unless it goes out of its way to check the
pipeline's exit status.</p>

<p>Moral: <code>while</code> in a pipeline -- probably a bad idea.</p>

<p>(Just ask if you want my <code>pre-commit</code> script.  Mine checks <code>.erb</code>
template files, too.)</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Complex Subroutines in GNU Make</title>
    <link rel="alternate" type="text/html" href="http://blogs.verilab.com/partain/2009/09/complex-subroutines-in-gnu-make.html" />
    <id>tag:blogs.verilab.com,2009:/partain//2.88</id>

    <published>2009-09-22T00:23:45Z</published>
    <updated>2009-09-22T10:14:09Z</updated>

    <summary>Another in a very short &quot;series&quot; about make. If you need a small make refresher, see this earlier entry. Here we consider a way to write complex (multi-parameter) &quot;subroutines&quot; for GNU Make; this is important for code reuse. Introduction foo.o...</summary>
    <author>
        <name>Will Partain</name>
        
    </author>
    
    
    <content type="html" xml:lang="en-US" xml:base="http://blogs.verilab.com/partain/">
        <![CDATA[<p>Another in a very short "series" about <code>make</code>.
If you need a small <code>make</code> refresher, see
<a href="http://blogs.verilab.com/partain/2009/09/the-humble-make-stanza.html">this earlier entry</a>.</p>

<p>Here we consider a way to write complex (multi-parameter)
"subroutines" for GNU Make; this is important for code reuse.</p>

<h1>Introduction</h1>

<pre><code>foo.o : foo.c
    gcc -O -c foo.c -o foo.o
</code></pre>

<p>"If <code>foo.c</code> is newer than <code>foo.o</code> (or <code>foo.o</code> doesn't exist), rebuild <code>foo.o</code> by running
<code>gcc -O -c foo.c -o foo.o</code>".  Files.  Timestamps.  Scripts (Bourne shell).</p>

<p>If <code>make</code> were nothing but that, it would be useful but
turgidly repetitive.  For example, in the <code>Makefile</code> for a C
program with 100 C files, you would have 100
nearly-identical stanzas like the example above.</p>

<p>If you decided to change from <code>gcc -O</code> to <code>gcc -O2</code>, you
would have 100 stanzas to change.</p>

<p>The first way we improve things is to lift out the
constants, using <code>make</code> variables; our example stanza might
then become:</p>

<pre><code>CC = gcc
CFLAGS = -O

foo.o : foo.c
    $(CC) $(CFLAGS) -c foo.c -o foo.o
</code></pre>

<p>Now we can change from <code>-O</code> to <code>-O2</code> with a single keystroke.</p>

<p><code>Make</code> also has a bunch of "automatic variables", which take
on certain values in the context of a specific target.  The
most important are:</p>

<pre><code>$@  the name of the target
$&lt;  the name of the first prerequisite
</code></pre>

<p>Our example now becomes:</p>

<pre><code>foo.o : foo.c
    $(CC) $(CFLAGS) -c $&lt; -o $@
</code></pre>

<h1>Simple Subroutines</h1>

<p>Imagine we were concerned with compiling two C files; we
might have:</p>

<pre><code>foo.o : foo.c
    $(CC) $(CFLAGS) -c $&lt; -o $@
bar.o : bar.c
    $(CC) $(CFLAGS) -c $&lt; -o $@
</code></pre>

<p>The two stanzas are identical except for the target-file
"stem".  Make lets us common-up <em>all</em> such stanzas with a
single <em>pattern rule</em>:</p>

<pre><code>%.o : %.c
    $(CC) $(CFLAGS) -c $&lt; -o $@
</code></pre>

<p>The bit of the filename matched by the <code>%</code> in the pattern is
called the "stem" and is available in the automatic variable
<code>$*</code>; so the rule could also be written as:</p>

<pre><code>%.o : %.c
    $(CC) $(CFLAGS) -c $*.c -o $*.o
</code></pre>

<p>A useful way to think of a pattern rule is as a "make
subroutine" with exactly one parameter (the stem), which you
have to hide in the target filename.  (Pretty horrible
programming-language syntax, but...)</p>

<p>Pattern rules are a good idea in 100% of the cases where
they apply.</p>

<h1>Complex Subroutines</h1>

<p>In our EDA world, tool invocation is rarely simple enough to be
susceptible to the wiles of a ("single parameter") pattern rule.</p>

<p>Here's how you might compile a bunch of Verilog with Icarus:</p>

<pre><code>a.out : $(SRCS_V)
    iverilog -o $@ -Idir1 -Idir2 -y dir1 -y dir2 $(SRCS_V)
</code></pre>

<p>After we lift out constants, we might get something like (and
this is still simplified compared to Real Life...):</p>

<pre><code>IVL     = iverilog
INCDIRS = dir1 dir2
LIBDIRS = $(INCDIRS)

a.out : $(SRCS_V)
    $(IVL) -o $@ $(INCDIRS:%=-I%) $(LIBDIRS:%=-y%) $(SRCS_V)
</code></pre>

<p>(Note: <code>$(INCDIRS:%=-I%)</code> means "take the value of the
variable <code>INCDIRS</code> and put a <code>-I</code> in front of every word".)</p>

<p>The problem here is that the "rule" for compiling Verilog
with Icarus has several "parameters":</p>

<ul>
<li>the name of the desired output (<code>a.out</code>)</li>
<li>the include directories required (<code>INCDIRS</code>)</li>
<li>the library directories required (<code>LIBDIRS</code>)</li>
<li>the source files to feed in (<code>SRCS_V</code>)</li>
</ul>

<p>There could easily be a few more.  There is simply no way to
squeeze all those parameters through a normal <code>make</code> pattern
rule at once.  (Well, you <em>could</em>... You could encode all
needed information in the target name.  You might say:</p>

<pre><code>compile : compile++a.out++INCDIRS=dir1+dir2++LIBDIRS=dir1+dir2++
</code></pre>

<p>and then have a pattern rule:</p>

<pre><code>compile% : # now $* includes all the extra info
    &lt;fancy script that unpicks $* and then does what it likes&gt;
</code></pre>

<p>But that's really, really horrible!)</p>

<p>What we want is to be able to vary the value of the
"parameter variables" depending on the target that <code>make</code> is
trying to rebuild.  Happily, GNU Make provides an obscure
feature to do this: target-specific make variables.</p>

<p>I propose the convention that target-specific variables are
in lower case, so they stand out from regular variables
which are (by convention) in upper case.</p>

<p>You can make a very simple Makefile to see target-specific
variables in action:</p>

<pre><code>a : foo = a_value
b : foo = b_value

a :
    @echo foo=$(foo)
b :
    @echo foo=$(foo)
c :
    @echo foo=$(foo)
</code></pre>

<p>When you run <code>make a b c</code>, you will see:</p>

<pre><code>foo=a_value
foo=b_value
foo=
</code></pre>

<p>Back to our Icarus Verilog example.  We might end up with a
rule something like:</p>

<pre><code>%.icarus : $(srcs_v)
    $(IVL) -o $@ $(incdirs:%=-I%) $(libdirs:%=-y%) $(srcs_v)
</code></pre>

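<p>This is, in effect, a "subroutine" for GNU Make with three
parameters (four, counting the stem that names the output).  One
caveat: <code>make</code> expands a prerequisite list when it reads the
Makefile, before any target-specific value is in scope, so
<code>$(srcs_v)</code> is empty in the prerequisite list as written.  The
script still sees the right values, but <code>make</code> will not notice
changed sources unless they are listed as prerequisites some other way.
We can now invoke this "subroutine" for two (or more) different Icarus
runs:</p>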

<pre><code>foo.icarus : srcs_v  = foo_top.v
foo.icarus : incdirs = include
foo.icarus : libdirs = lib

bar.icarus : srcs_v  = bar_top.v bar_extra.v
bar.icarus : incdirs = wibble wobble
bar.icarus : libdirs = $(incdirs)
</code></pre>

<p>Nobody said it was pretty, but it does mean that an
arbitrarily-complex script for doing a piece of an EDA flow <em>can</em>
be captured by a pattern rule (using target-specific <code>make</code>
variables) -- in exactly one place.  This can only be good for
increasing the robustness of any <code>make</code>-based build system.</p>

<p>[An earlier version of this note appeared in Verilab's internal newsletter.]</p>
]]>
        

    </content>
</entry>

<entry>
    <title>The Humble Make Stanza</title>
    <link rel="alternate" type="text/html" href="http://blogs.verilab.com/partain/2009/09/the-humble-make-stanza.html" />
    <id>tag:blogs.verilab.com,2009:/partain//2.87</id>

    <published>2009-09-09T00:23:45Z</published>
    <updated>2009-09-09T13:55:14Z</updated>

    <summary>I once had a spell looking at a lot of Makefiles. It prompted this ode to the Humble Make Stanza. A make Refresher Make is a tool to control the rebuilding of files so that unnecessary work is avoided. The...</summary>
    <author>
        <name>Will Partain</name>
        
    </author>
    
    
    <content type="html" xml:lang="en-US" xml:base="http://blogs.verilab.com/partain/">
        <![CDATA[<p>I once had a spell looking at a lot of Makefiles. It prompted this
ode to the Humble Make Stanza.</p>

<h1>A <code>make</code> Refresher</h1>

<p><code>Make</code> is a tool to control the rebuilding of files so that
unnecessary work is avoided.  The simplest construct in <code>make</code> is a
<em>stanza</em>:</p>

<pre><code>target(s) : prerequisite(s)
    script(s)
</code></pre>

<p>Meaning: If the <code>target</code> file doesn't exist or if one of the
<code>prerequisite</code> files is newer, then rebuild the <code>target</code> file(s) by
running <code>script(s)</code>.  Here's a specific example:</p>

<pre><code>foo.o : foo.c
    gcc -O -c foo.c -o foo.o
</code></pre>

<p>"If foo.c is newer than foo.o (or foo.o doesn't exist), rebuild foo.o by running
<code>gcc -O -c foo.c -o foo.o</code>".</p>
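
<p>The decision <code>make</code> takes for a single stanza can be sketched
in a few lines of shell.  The function name <code>needs_rebuild</code> is
made up for illustration, and the sketch ignores everything else
<code>make</code> does (chains of stanzas, in particular).</p>

```shell
# needs_rebuild TARGET PREREQ...
# Returns 0 ("rebuild") if TARGET is missing or any PREREQ is newer than it,
# and 1 ("up to date") otherwise.
# Note: the -nt test is a common extension (bash, dash, ksh), not strict POSIX.
needs_rebuild() {
    target=$1
    shift
    [ -e "$target" ] || return 0
    for prereq in "$@"; do
        if [ "$prereq" -nt "$target" ]; then
            return 0
        fi
    done
    return 1
}

# Usage, mirroring the stanza above:
#   if needs_rebuild foo.o foo.c; then gcc -O -c foo.c -o foo.o; fi
```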

<p>There's really no magic.  You could just as easily write:</p>

<pre><code>foo.o : foo.c
    echo my granny wears army boots
</code></pre>

<p>which is barking mad, but absolutely fine by <code>make</code>.</p>

<p>Files.  Timestamps.  Shell scripts (Bourne shell).  What could be
simpler?</p>

<h1><code>make</code> in Full Glory</h1>

<p>The ideal <code>make</code> stanza does a single step in a potentially complex
workflow. <code>make</code> stitches together many such
stanzas into a beautiful harmonious whole.  Let us remind ourselves of
what we are hoping for:</p>

<ul>
<li><p>The <code>make</code> infrastructure will capture the whole workflow.  Type
<code>make world</code> and the full workflow bhoona will unfold before your eyes, the
right steps in the right order.</p></li>
<li><p>When something changes, the steps that <em>must</em> be re-done will
<em>definitely</em> be re-run.  (Anything else is a disaster.)</p></li>
<li><p>When something changes, <em>only</em> the steps that must be re-done will
be re-run.  Not essential but preferred.  For example: it's a pain
that, when you change a comment in a header file, <code>make</code> will
re-compile great swathes of source code.  We live with it.</p></li>
<li><p>When something goes wrong, <code>make</code> will not proceed pointlessly
(e.g. do a doomed seven-hour simulation even though the preceding
compilation failed) and will make it relatively clear where matters
came unstuck.</p></li>
<li><p>If a previous run was interrupted, <code>make</code> will restart at the right
place (and not be confused).</p></li>
<li><p>If the IT infrastructure has provision for parallel simulations
(e.g. a batch queuing system), <code>make</code> will use that
appropriately.</p></li>
</ul>

<h1>The Naked Stanza</h1>

<p>You will undoubtedly not recognize your own experience with <code>make</code>
from the preceding idyllic description.  To understand why, let's look
more closely at the soul of the <code>make</code> party, the stanza:</p>

<pre><code>target(s) [hidden t's] : prerequisite(s) [hidden prereq's]
    script(s)
</code></pre>

<p>The core task to be performed by the script -- e.g. "run VCS with
these command-line arguments" -- is often straightforward and easy to
get right.  So where do problems arise?</p>

<ul>
<li><p>The script(s) will read a bunch of files -- <code>prerequisite(s)</code> -- and
probably write at least one -- <code>target(s)</code>.</p>

<p>Unless carefully done, the scripts will probably have <em>hidden</em>
prerequisites (files read but not specifically listed in the
Makefile) and hidden targets (files written but similarly unlisted).</p></li>
<li><p>If you don't attend to the "hidden" prerequisites (e.g. by buying
ClearCase and then using <code>clearmake</code>, which tracks such
dependencies through "audited builds"), then you should stop using
<code>make</code> straight away -- you have no assurance that crucial
rebuilding will happen when it really needs to.</p>

<p>(Of course, there is the <code>make</code> equivalent of the Windows
re-install: in the face of the slightest hiccup, type <code>make clean</code>.
Why someone with a lousy Makefile trusts the <code>clean</code> target continues
to elude me, however.)</p></li>
<li><p>Hidden targets are
less of an issue.  They may cause needless extra rebuilding by
<code>make</code>, but that may not be a hanging offense.</p></li>
<li><p>A problem that arises with some EDA tools is that they are
disinclined to do a single unit of work (and then get out of the
way).  They often want to set up a hidden universe of their own
(often with dubious Makefiles hidden somewhere), and/or they decide
they should offer counseling before getting on with the job ("Mr
Kelly, are you really sure you want to merge those two files? Click
OK if you're absolutely certain you want to proceed, Cancel
otherwise; this dialog box comes with no warranty expressed or
implied.")</p></li>
<li><p>The script(s) need to offer an absolute assurance to the downstream
parts of the overall "flow" that they have done their part
successfully.  This is usually done by setting the exit status correctly
(zero for OK, anything else for not OK).</p>

<p>EDA tools are imperfect on this exit-status thing.  Some
just don't bother to set the exit-status to anything
sensible.  For others (notably simulators), the tool's idea of "not
OK" may reasonably differ from your own.</p>

<p>For example, a simulation may run to completion (exit 0) but it may
have properties which mean its results must <em>not</em> be passed on to
the next step of the workflow.</p></li>
<li><p>The assurance of job-well-done also needs to be recorded for
posterity in a file-with-timestamp (because that's what <code>make</code>
understands).  This might be called a "witness" file -- it witnesses
that a particular task was done successfully at a certain time.</p>

<p>In many cases such as the <code>.c</code>-to-<code>.o</code> example given at the beginning,
the <code>.o</code> file is the witness (as well as the object file that you
really want).  In its witness capacity, it is saying "A successful C
compilation of <code>foo.c</code> took place on November 22 at 11:12am."</p>

<p>In the more complex EDA world, I tend to favor explicit witness (or
"stamp" or "sentinel") files: <code>sim-4732-DONE</code> witnesses that
simulation 4732 completed successfully at the time indicated.</p></li>
<li><p>Another thing the scripts need to do is leave a trail of what
happened.  This is often in the form of log files.</p></li>
<li><p>Even better than "what happened" is "what happened that was
<em>interesting</em>".  A definite weakness of EDA flows is that they spew
voluminously, and it is very easy to miss the One Crucial Message
that shows something amiss.</p></li>
</ul>

<h1>The Model Stanza</h1>

<p>The observations above hint at what a "normal" make stanza should look
like in an EDA -- or similarly complicated -- flow.  Here it is:</p>

<pre><code>witness-file : prerequisite(s)
    /bin/rm -f witness-file log-file
    eda-tool ...args... 2&gt;&amp;1 | tee-grep log-file
    update-dependencies
    decide-status log-file
    /bin/touch witness-file

-include witness-file.P  # dependencies from update-dependencies
</code></pre>

<ul>
<li><p>The goal is to produce the <code>witness-file</code>; if it exists, it means
"this step completed successfully at the time indicated".</p></li>
<li><p><code>eda-tool ...args...</code> is just the normal command-line invocation of
the EDA tool.</p></li>
<li><p>We pipe the output(s) of the EDA tool to <code>tee-grep</code>, a
(hypothetical) tool whose jobs are: (a) to squirrel away a copy of the
log file for later perusal; and (b) to display <em>only</em> the log
messages that are "interesting" (for the local definition of same).</p></li>
<li><p>If you don't have ClearCase or equivalent, you will need some special
pleading to update the dependencies related to this step; I have
hypothesized an <code>update-dependencies</code> tool, and that it puts its
output into <code>witness-file.P</code>, which <code>make</code> slurps in (last line
<code>-include</code>).</p></li>
<li><p>All of the scripts before <code>decide-status</code> should have completed
successfully (exit status zero).  (If not, something is badly wrong,
e.g. out of disk space.)  The question remains: can workflow steps
that are relying on this one "succeeding" go ahead?  That's what the
<code>decide-status</code> script does: exit status zero will mean "go ahead"
(after stamping the <code>witness-file</code>), non-zero exit means "go no
further".</p></li>
</ul>
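
<p>As a rough sketch, the hypothetical <code>tee-grep</code> and
<code>decide-status</code> tools could be as small as the shell functions
below, with <code>run_step</code> playing the part of the stanza's script.
The names, and the idea that "interesting" means lines mentioning
errors or warnings, are assumptions for illustration; a real flow needs
patterns tuned to its own tools.  The <code>update-dependencies</code> step
is left out.</p>

```shell
# tee_grep LOGFILE: save all of stdin in LOGFILE, pass on only the
# "interesting" lines.  The pattern is an assumption.
tee_grep() {
    tee "$1" | grep -E -i 'error|warning|fatal' || true
}

# decide_status LOGFILE: return 0 ("go ahead") only if the log exists
# and holds no error lines, whatever exit status the tool itself gave.
decide_status() {
    [ -f "$1" ] || return 1
    if grep -E -i -q 'error|fatal' "$1"; then
        return 1
    fi
    return 0
}

# run_step WITNESS LOG COMMAND...: remove the old witness and log, run
# the command through tee_grep, and stamp the witness only on success.
run_step() {
    witness=$1
    log=$2
    shift 2
    rm -f "$witness" "$log"
    "$@" 2>&1 | tee_grep "$log"
    decide_status "$log" || return 1
    touch "$witness"
}
```

<p>One design point: the shell reports only the last command's exit
status for a pipeline, so a failing tool is not noticed at the pipe
itself.  In this sketch it is <code>decide_status</code> that has to catch
the failure from the log.</p>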

<p><code>Make</code> has a reputation for being flaky and hard to work with.
If each <code>make</code> stanza were as bullet-proof as suggested here,
most of that problem would go away.</p>

<p>[An earlier version of this note appeared in Verilab's internal newsletter.]</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Hands off my boot boot</title>
    <link rel="alternate" type="text/html" href="http://blogs.verilab.com/partain/2009/09/hands-off-my-boot-boot.html" />
    <id>tag:blogs.verilab.com,2009:/partain//2.86</id>

    <published>2009-09-03T00:23:45Z</published>
    <updated>2009-09-03T13:13:23Z</updated>

    <summary>I did a Fedora 11 install and, not unusually, the machine wouldn&apos;t boot afterward -- GRUB bootloader gunk messed up. Attempted a normal remedy: Boot &quot;rescuely&quot; with the install CD, do chroot /mnt/sysimage and run... grub-install --no-floppy --root-directory=/boot &apos;(hd0)&apos; (This...</summary>
    <author>
        <name>Will Partain</name>
        
    </author>
    
    
    <content type="html" xml:lang="en-US" xml:base="http://blogs.verilab.com/partain/">
        <![CDATA[<p>I did a Fedora 11 install and, not unusually, the machine
wouldn't boot afterward -- GRUB bootloader gunk messed up.</p>

<p>Attempted a normal remedy: Boot "rescuely" with the install CD,
do <code>chroot /mnt/sysimage</code> and run...</p>

<pre><code>grub-install --no-floppy --root-directory=/boot '(hd0)'
</code></pre>

<p>(This machine is all software RAID-1, with a separate <code>/boot</code>
partition.)</p>

<p>I reboot... into a GRUB prompt (<em>sigh</em>).  It has not found the <code>grub.conf</code>
(or <code>menu.lst</code> -- I never know which one it really looks for).
It is suspiciously easy to get it to do the right thing:</p>

<pre><code>grub&gt; configfile (hd0,0)/grub/grub.conf
</code></pre>

<p>Hmm...  How is that different from what GRUB should have done on
its own?</p>

<p>Turns out that <code>grub-install</code> didn't do the right -- or expected
-- thing.  It dropped its stuff <em>inside</em> <code>/boot</code>, so I ended up
with <code>/boot/boot/grub</code> -- but with no <code>grub.conf</code> menu file in
that directory.  Working GRUB but no menu -- the symptoms seen.</p>

<p>A GRUB install done the low-tech (interactive) way sorted things:</p>

<pre><code>grub&gt; root (hd0,0)
grub&gt; setup (hd0)
</code></pre>
]]>
        

    </content>
</entry>

</feed>
