Monday, March 30, 2009

Xen and keeping time: NTP, DomU's and hwclock

There's plenty of discussion out there on keeping time in Xen DomU's. Unfortunately there is no one definitive source of information.

After much research I've come to the conclusion that the following is the best way to keep all Dom clocks synchronized:
  • Set independent_wallclock=0 on all Dom's
  • Run NTP in Dom0 only (not OpenNTP, and not in DomU's)
  • Cron "hwclock --systohc" to run daily or twice daily, but never more than once every 55 minutes.
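
As a rough sketch of what those three points look like in practice: the /proc path below is where paravirtualized Xen kernels of this era expose the setting, and the /sbin/hwclock path and the cron times are my own assumptions, so adjust them for your distribution.

```shell
# Every domain (Dom0 and each DomU): follow the hypervisor's wall clock.
# The file only exists on Xen paravirtualized kernels, so do nothing elsewhere.
if [ -w /proc/sys/xen/independent_wallclock ]; then
    echo 0 > /proc/sys/xen/independent_wallclock
fi

# To keep the setting across reboots, the matching /etc/sysctl.conf line is:
#   xen.independent_wallclock = 0

# Dom0 only: line for root's crontab (crontab -e), copying the system time
# to the hardware clock twice daily. The 04:17 and 16:17 times are arbitrary.
#   17 4,16 * * * /sbin/hwclock --systohc
```

NTP itself is installed and run in Dom0 in the usual way for your distribution; nothing Xen-specific is needed there.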
Here is a more in-depth explanation of the possible ways of keeping time, and the reasoning behind using the above solution.
  1. The default method for keeping time is for the DomU's to get their time directly from Dom0.
  2. An alternative method involves setting each DomU to have an independent wallclock and then running NTP in the DomU's.
I used the second method fairly successfully for a long time, but then some flaws in this solution became apparent. Primarily, Dovecot IMAP/POP3 would constantly complain that TimeWentBackwards. This highlighted to me that using an independent wallclock like this wasn't working well. While other processes didn't complain, I can imagine that a changing clock could easily confuse several things.

I then set the DomU's back to the default independent_wallclock=0, disabled NTP on the DomU's and only ran NTP in Dom0. Unfortunately there was still a significant drift in the DomU's even though Dom0 was correct. However, the DomU's all seemed to have very similar drift relative to the Dom0. There are a couple of reasons for this.

OpenNTP doesn't set the Dom0 clock in the same way that NTPd does, and so NTPd works out of the box much better. But this still didn't solve the problem. It seems that something changed between ntpd and/or kernel versions.

I believe ntpd used to make use of the "11 minute mode" outlined in the hwclock man page. After some investigation it turned out that this was not happening, which can be confirmed by using "adjtimex --print". If the status value isn't 64, or doesn't have the 64 bit set, then "11 minute mode" is on. Despite various sources indicating that ntpd would automatically turn on "11 minute mode", it didn't. Further investigation seems to indicate that "11 minute mode" is not a Good Thing(tm), so I'm guessing that's why it doesn't get set when ntpd is started. A greater discussion about this can be found here among many other places.
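
The status check described above can be scripted. This is only a minimal sketch: the function name eleven_minute_mode is made up for illustration, and it assumes the adjtimex utility is installed and prints a line of the form "status: 64".

```shell
#!/bin/sh
# Report whether "11 minute mode" is on for a given adjtimex status value.
# The mode is on when the 64 bit is NOT set in the status.
eleven_minute_mode() {
    if [ $(( $1 & 64 )) -eq 0 ]; then
        echo on
    else
        echo off
    fi
}

# Read the live status if adjtimex is installed; assumes its output
# contains a line of the form "status: 64".
if command -v adjtimex >/dev/null 2>&1; then
    status=$(adjtimex --print | awk '$1 == "status:" { print $2 }')
    if [ -n "$status" ]; then
        echo "status=$status, 11 minute mode is $(eleven_minute_mode "$status")"
    fi
fi
```

For example, "eleven_minute_mode 64" prints "off" and "eleven_minute_mode 0" prints "on".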

So we now know that NTPd is preferred over OpenNTP, and that using hwclock is preferred over "11 minute mode". That's why I use the settings outlined above.