
Why does ctl_cyrusdb -r take so long with Berkeley DB?
******************************************************

Berkeley DB maintains a log of all the transactions since the last
checkpoint of the database. To ensure the database is in a consistent
state, the log must be recovered after any outage (hence the
recommendation to run "ctl_cyrusdb -r" when Cyrus starts). Recovery
can take a long time for a few reasons.

The most common one is that the cyrusdb is not being checkpointed
often enough. Checkpointing is done with a simple "ctl_cyrusdb -c". If
you do this frequently, the amount of log that needs to be recovered
will be significantly smaller. We recommend doing this at least once
every half hour, and more often on busy sites.
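
As a rough sketch of how this is commonly wired up, both recovery and
checkpointing can be driven from cyrus.conf. The entry names
("recover", "checkpoint") are arbitrary labels, and the 30-minute
period simply mirrors the recommendation above; treat the values as
assumptions to adjust for your installation::

    START {
      # run recovery once, before any services are started
      recover     cmd="ctl_cyrusdb -r"
    }

    EVENTS {
      # checkpoint the databases every 30 minutes
      checkpoint  cmd="ctl_cyrusdb -c" period=30
    }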

The other reason is that your deliver.db may be very large. This is
solvable by pruning it more aggressively (lowering the -E parameter to
ctl_deliver, which you should run on a regular basis), or (in a pinch)
by just removing the database. Losing it does not prevent operation;
it just causes vacation messages to be resent and may let duplicate
delivery suppression miss some duplicates.
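
A minimal sketch of scheduling the prune from cyrus.conf, assuming the
EVENTS mechanism is used to run it. The entry name "delprune", the
3-day value for -E, and the once-a-day period are illustrative
assumptions, not recommendations::

    EVENTS {
      # once a day (period is in minutes), discard duplicate-delivery
      # entries older than 3 days
      delprune    cmd="ctl_deliver -E 3" period=1440
    }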

* On pruning deliver.db: My understanding is that the number after
  "-E" is the number of days after which entries are discarded. Is
  there a way to reduce it to a number of hours? Most of our mail is
  internal, so it should rarely be delayed by more than an hour or
  two.

* In case it's useful to anyone: we discovered that moving
  /var/lib/imap from an ext2 to an ext3 journaled filesystem made a
  vast difference for the worse. While recovering the database,
  Berkeley DB does a vast quantity of small writes, and that combined
  with the updates to the journal absolutely kills disk performance
  (with journaling it was taking about 40 minutes to start Cyrus on a
  mail server with about 500 users and 200G of mail). On the flip
  side, moving /var/lib/imap to a hardware RAID system with a decent
  amount of onboard cache reduced this time to under 30 seconds. I
  think Berkeley DB could probably be optimized to deal with this
  better, but in the meantime avoid journaling filesystems, or at
  least be prepared to experiment to find something that works for
  you.
