Validating Ansible playbooks with Vagrant in open source projects

At AdoptOpenJDK we do everything we can in the open. The source code is in the open, all of the testing we run is in the open, and the machine configuration is all performed via Ansible. We let anyone contribute to any part of this. While this is a great way to work, it can cause some issues. Notably, we have had situations where changes have had undesirable side effects: a change made for one platform (e.g. one particular Linux distribution) may cause unintended side effects on others.

To mitigate these risks we have implemented a Vagrant-based solution for validating playbooks on Linux and Windows. If you're not familiar with it, Vagrant is a command-line tool which uses virtualisation software on your machine (VirtualBox, KVM, Parallels, Hyper-V, VMware) to create and manage VMs. We use this to create virtual machines of varying distributions, run the playbooks on them, then run a Java build and a basic Java test using our test harnesses to validate that the machine is configured appropriately. The flow is as follows:

  1. Create a virtual machine with the appropriate operating system
  2. Run the ansible playbooks against the virtual machine
  3. Run a build of Java against the virtual machine to validate that the playbooks have configured the machine correctly
  4. Run a basic test to ensure that the machine is configured in a way that allows the test harnesses to work

By doing this across multiple operating systems in the virtual machines, we can validate that the playbooks have not got into an unsuitable state. We are currently running this through a Jenkins infrastructure, where we can point the code at other forks of the GitHub repository which hosts our Ansible scripts. This allows PR reviewers to check for any potential problems prior to integrating.
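
The four-step flow above can be sketched roughly as follows. This is only an illustration, not the project's actual automation: the `vagrant/<os>` directory layout, `inventory.ini`, `playbooks/main.yml`, `build_jdk.sh` and `run_basic_test.sh` are all hypothetical names made up for the example. It defaults to a dry run that just prints the commands.

```python
import subprocess
from pathlib import Path


def plan_validation(os_name, playbook="playbooks/main.yml"):
    """Return (command, working directory) pairs for one validation run.

    All paths and script names are placeholders for illustration.
    """
    vm_dir = Path("vagrant") / os_name  # assumed: one Vagrantfile per OS
    return [
        # 1. Create the virtual machine
        (["vagrant", "up"], vm_dir),
        # 2. Run the playbooks against it (hypothetical inventory file)
        (["ansible-playbook", "-i", "inventory.ini", playbook], Path(".")),
        # 3. Build Java inside the VM (hypothetical script)
        (["vagrant", "ssh", "-c", "./build_jdk.sh"], vm_dir),
        # 4. Run a basic test inside the VM (hypothetical script)
        (["vagrant", "ssh", "-c", "./run_basic_test.sh"], vm_dir),
    ]


def run_validation(os_name, dry_run=True):
    """Print each step; execute it too unless dry_run is set."""
    for command, cwd in plan_validation(os_name):
        print(f"[{os_name}] {' '.join(command)}")
        if not dry_run:
            # check=True stops the run at the first failing step
            subprocess.run(command, cwd=cwd, check=True)
```

Calling `run_validation("ubuntu")` prints the four commands in order; passing `dry_run=False` would execute them and stop at the first failure, which is the signal a reviewer cares about.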

In the past we have had problems that did not show up when running the playbooks on an existing pre-configured machine (because the changes were subtle and did not cause those parts of the playbooks to be re-executed) and only appeared on clean machines. By using Vagrant we protect ourselves against that, because the machines are always configured from a relatively clean state without having had the playbooks run against them. It can take a bit of time: about an hour for the whole lot on a Linux machine, and longer on Windows because we have larger prerequisites there. We can almost certainly hook this into some automation later so that these checks run automatically when a new PR is submitted.
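
To see why a pre-configured machine can hide a broken change, here is a toy model. It is not how Ansible decides what to run; it simply assumes each task is skipped when a marker for it already exists on the machine, and the task names and the error are invented for the example.

```python
def run_playbook(machine, tasks):
    """Run tasks in order, skipping any whose marker is already present.

    Returns the names of the tasks that actually executed.
    """
    executed = []
    for task in tasks:
        if task["marker"] in machine:
            continue  # already configured: the task body never runs
        task["action"]()  # raises if the task is broken
        machine.add(task["marker"])
        executed.append(task["name"])
    return executed


def broken_install():
    # Stands in for a task that a recent change has broken
    raise RuntimeError("package name changed in the latest edit")


tasks = [
    {"name": "install compiler", "marker": "compiler", "action": lambda: None},
    {"name": "install build deps", "marker": "build-deps", "action": broken_install},
]

existing_machine = {"compiler", "build-deps"}  # configured by an earlier run
clean_machine = set()                          # a freshly created VM

# On the existing machine every task is skipped, so the breakage is invisible
print(run_playbook(existing_machine, tasks))

# On the clean machine the broken task really runs and fails
try:
    run_playbook(clean_machine, tasks)
except RuntimeError as error:
    print("clean machine failed:", error)
```

The existing machine reports success only because nothing ran, which is exactly the situation a fresh VM avoids.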

For now, this helps keep the Ansible playbooks we use as stable as possible. You can look at the automation we have around this at https://github.com/AdoptOpenJDK/openjdk-infrastructure/tree/master/ansible/pbTestScripts. The "testScript.sh" in there is the one that runs a test given an OS to run against.
