the summary
Sep. 11th, 2003 01:33 am
failing RAID controller ->
random crashes ->
file system corruption ->
cyrus imapd mailbox and user directory database corruption ->
mail down for several hours while I:
- move the disks from the dead system to a spare system
- fsck the file system and reboot a bunch of times, watching it crash each time
- back up the mail I can (26GB)
- rebuild the file system
- restore the mail I backed up previously
- restart the mail server
- discover that cyrus imapd still won't start
- rebuild the cyrus mailboxes.db database (a painstaking process)
- rebuild all the user databases as well
- finally have everything working again
The problem seemed to be the result of lots of Cyrus deliver processes building up, unable to actually deliver the mail into people's mailboxes. They'd simply hang there forever, more and more of them, until the system would die, painfully. It wouldn't just get slow; it would go into a weird system-call-kernel-panic loop. It was very unpleasant.
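A pile-up like that can be spotted by counting running processes by command name. Here is a minimal sketch of the idea, not what was actually running on the server: it assumes the delivery processes show up in `ps` under the command name `deliver`, and the function names and the threshold of 50 are made up for illustration.

```python
import os
import subprocess
from collections import Counter


def count_by_command(ps_output: str) -> Counter:
    """Count processes per command name.

    Expects one command per line, as printed by `ps -e -o comm=`.
    Paths are reduced to their base name so `/some/dir/deliver`
    and `deliver` are counted together.
    """
    names = (os.path.basename(line.strip()) for line in ps_output.splitlines())
    return Counter(name for name in names if name)


def is_piling_up(counts: Counter, command: str = "deliver", threshold: int = 50) -> bool:
    """Return True once `command` has at least `threshold` running instances.

    Both defaults are assumptions for this sketch, not tuned values.
    """
    return counts[command] >= threshold


def snapshot() -> Counter:
    """Take a live process snapshot using the standard `ps` utility."""
    result = subprocess.run(
        ["ps", "-e", "-o", "comm="],
        capture_output=True,
        text=True,
        check=True,
    )
    return count_by_command(result.stdout)


if __name__ == "__main__":
    counts = snapshot()
    if is_piling_up(counts):
        print(f"warning: {counts['deliver']} deliver processes running")
```

Run from cron every minute or so, something like this would at least give a warning while the count is still climbing, rather than after the machine has already fallen over.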
I LOVE being a sysadmin.
Now I'm home and feeling vaguely mopey. Time to sleep.