Continuous Integration for Virtualization Solutions

Markus Partheymüller (Cyberus Technology GmbH), Sebastian Eydam (Cyberus Technology GmbH)

When developing low-level systems software like operating systems or virtualization solutions, achieving a tight feedback loop for code changes can be very challenging. Tests in an artificial environment (e.g., unit tests) are a great tool for certain aspects, but they bake the developer's assumptions about the hardware directly into the test instead of validating correctness against the hardware itself. The underlying hardware has so many facets that real-world integration tests are inevitable. Having a dedicated test machine for each developer can help run larger integration tests, but as soon as multiple variants or generations of hardware platforms are at play, this becomes impractical very quickly.

Enter automated hardware testing, which allows developers to submit their changes to a test system that executes all desired tests on all available platforms and reports back the results, including log output for troubleshooting.

At Cyberus, we develop a virtualization platform based on the Hedron Hypervisor (a fork of NOVA, which was conceived at TU Dresden) with complex testing needs, ranging from minimal OS kernels for targeted tests all the way to benchmarks that measure the performance of workloads using accelerated graphics in virtual machines. We use NixOS to create reproducible test images that run a full Linux instance, including a display server and the corresponding VM workloads, package them up, and send them to our automated test facility. The test system orchestrates server machines as well as laptops and has automation for powering them on and off, serving the test package, and collecting output via serial consoles; a simplified sketch of such a run is shown at the end of this section.

While the test packages are built so that they are easy to run locally at the desk, our developers can also submit them to a wider variety of platforms and let them run in the background while they focus on other tasks. Multiple identical machines are used to speed up the process and support a growing organization with more developers working on the product.

Our goal is to get everyone at the company into the mindset that no one-line change is too small to go through intensive testing, and we are constantly improving our tooling and infrastructure to achieve this. The same tests also serve as quality gates in the code review process, enabling us to only ship code that has gone through extensive testing, without any manual intervention.
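To make the orchestration more concrete, the following minimal Python sketch shows what one run on such a test facility could look like. It is not our actual tooling: the power-ctl command, the TFTP directory layout, the serial port parameters, and the "TEST PASSED"/"TEST FAILED" log markers are illustrative assumptions. The sketch only captures the general control flow: serve the test package to the target, power-cycle it, and watch its serial console until the test reports a verdict or a timeout expires.

    """Minimal sketch of a hardware test orchestrator (illustrative assumptions only)."""

    import shutil
    import subprocess
    import time
    from pathlib import Path

    import serial  # pyserial, used to read the target's serial console

    TFTP_ROOT = Path("/srv/tftp")      # assumed network-boot directory
    PASS_MARKER = b"TEST PASSED"       # assumed markers emitted by the test image
    FAIL_MARKER = b"TEST FAILED"


    def serve_test_package(package: Path, machine: str) -> None:
        """Make the test package available to the target, e.g. via network boot."""
        shutil.copy(package, TFTP_ROOT / machine / package.name)


    def set_power(machine: str, state: str) -> None:
        """Toggle machine power via a hypothetical out-of-band power controller."""
        subprocess.run(["power-ctl", machine, state], check=True)


    def run_test(machine: str, serial_port: str, package: Path,
                 timeout_s: int = 1800) -> bool:
        """Run one test package on one machine; return True if the test passed."""
        serve_test_package(package, machine)
        set_power(machine, "off")
        set_power(machine, "on")

        passed = False
        log = bytearray()
        deadline = time.monotonic() + timeout_s
        try:
            with serial.Serial(serial_port, 115200, timeout=5) as console:
                while time.monotonic() < deadline:
                    line = console.readline()   # returns b"" when the read times out
                    log += line
                    if PASS_MARKER in line:
                        passed = True
                        break
                    if FAIL_MARKER in line:
                        break
        finally:
            set_power(machine, "off")
            # Keep the serial log around for troubleshooting failed runs.
            Path(f"{machine}.log").write_bytes(bytes(log))
        return passed

A real facility would run this per platform and per test package, in parallel across the pool of identical machines, and feed the collected logs and verdicts back to the submitting developer and the code review system.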