I’ve been fighting a bug with Junos Olive VMs running under KVM on a CentOS server for the last few days. I use Olive images now and then for network labs and to test configurations, and lately they’re not running very well at all on my Linux KVM server. Here’s a quick post on the problem and how to fix it.
It seems that at some point over the last few months a Linux kernel update introduced a bug that affects OpenBSD guests running under KVM.
The symptoms with Junos Olive are:
- Pings run erratically: sometimes one reply every 2 to 3 seconds, sometimes 20 to 30 seconds between replies
- Routing protocols constantly losing their adjacencies and routes flapping
- Errors in the messages log file complaining of time-related faults, like the ones below
Jul 15 17:27:20 R1 rpd: JTASK_SCHED_SLIP_KEVENT: 17 sec 994098 usec kevent block
Jul 15 17:27:44 R1 rpd: JTASK_SCHED_SLIP_KEVENT: 12 sec 631408 usec kevent block
Jul 15 17:28:06 R1 rpd: JTASK_SCHED_SLIP_KEVENT: 21 sec 628374 usec kevent block
Jul 15 17:28:19 R1 rpd: JTASK_SCHED_SLIP_KEVENT: 12 sec 907911 usec kevent block
Jul 15 17:28:32 R1 rpd: JTASK_SCHED_SLIP_KEVENT: 13 sec 81823 usec kevent block
Jul 15 17:28:58 R1 rpd: JTASK_SCHED_SLIP_KEVENT: 26 sec 525010 usec kevent block
It turns out that this is a bug in the KVM code of the Linux kernel that affects OpenBSD guests under QEMU/KVM. Junos is actually based on FreeBSD rather than OpenBSD, but Olive is hit by the same fault.
To fix this, run the following command as root on the KVM host (Intel CPUs only, since it targets the kvm_intel kernel module) to add a module parameter that disables the preemption timer:
echo 'options kvm_intel preemption_timer=N' >/etc/modprobe.d/kvm_intel.conf
… and reboot your server.
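After the reboot it is worth confirming that the option actually took effect. The sketch below assumes the module exposes the parameter read-only under /sys/module/kvm_intel/parameters/preemption_timer, which is where kernel module parameters normally appear. The function name preemption_timer_state is my own, and the optional file argument exists only so the check can be tried against a test file.

```shell
#!/bin/sh
# Report whether the kvm_intel preemption timer is disabled.
# Assumption: the parameter is readable via sysfs at the default path below.
# A different file can be passed as $1 (hypothetical helper, for testing only).
preemption_timer_state() {
    param_file="${1:-/sys/module/kvm_intel/parameters/preemption_timer}"

    # Module not loaded, or the parameter is not exposed on this kernel
    if [ ! -r "$param_file" ]; then
        echo "unknown"
        return 2
    fi

    value=$(cat "$param_file")
    case "$value" in
        N|0) echo "disabled" ;;            # the modprobe option was picked up
        Y|1) echo "enabled"; return 1 ;;   # option missing or not yet applied
        *)   echo "unknown"; return 2 ;;
    esac
}
```

Calling preemption_timer_state with no argument on the KVM host should print "disabled" once the option is active. If it prints "enabled", check that /etc/modprobe.d/kvm_intel.conf exists and that the module has been reloaded since the file was written.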
This will fix the problem, and you can go back to running your Olive images without all of your routing protocols flapping. According to the Linux developers, this fault will be fixed in an upcoming kernel release.