Ansible: Performance Impact of the Python version

Until recently, I was not really paying attention to the version of Python I was using with Ansible, this as soon as it was Python3. The default version was always good enough for Ansible.

During the last weeks, I spent the majority of my time working on the performance the community.kubernetes collection. The modules of these collection depend on a large library (OpenShift SDK) and Python needs to reload it before every task execution. The goal was to benefit from what is already in place with vmware.vmware_rest: See: my AnsibleFest presentation.

And while working on this, I realized that my metrics were not consistent, I was not able to reproduce some test-cases that I did 2 months ago. After a quick investigation, the Python version matters much more than expected.

To compare the different Python versions, I decided to run some tests.

The target host is a t2.medium instance (2 vCPUS, 4GiB) running on AWS. And the Operating system is Fedora 33, which is really handy for this because it ships all the Python versions from 3.6 to 3.10!

I use the last stable version of Ansible (2.10.3) that I install with pip in a Python virtual environment. The list of the dependencies present in the virtualenvs.

Finally, I deploy Kubernetes on Podman with Kubernetes Kind.

For the first test, I use a Python one-liner to evaluate the time Python takes to load the OpenShift SDK. This is one of the operations that I want to optimize for my work and so it matters a lot to me.

for i in $(seq 100); do
echo ${i}
venv-${python}/bin/python -c 'from openshift.dynamic import DynamicClient' >> /dev/null 2>&1
view raw hosted with ❤ by GitHub

Here the loading is done 100 times in a row.

The result shows a steady improvement of the performance since Python 3.6.

time (sec)48.40145.08841.75140.92440.385

With this test, the loading of the SDK is 16.5% faster with Python 3.10.

The next test does the same thing, but this time through Ansible. My test uses the following playbook:

hosts: localhost
gather_facts: false
kind: Pod
with_sequence: count=100
view raw gistfile1.yaml hosted with ❤ by GitHub

It runs the k8s_info module 100 times in a row. In addition, I also use an ansible.cfg with the following content. This way, ansible-playbook returns a nice output of the task execution duration:

callback_whitelist = ansible.posix.profile_tasks
view raw ansible.cfg hosted with ❤ by GitHub
time (sec)85.580.575.3575.0571.19

It’s a 16.76% boost between Python 3.6 and Python 3.10. I was not expecting such tight correlation between the two tests.

While Python is obviously not the faster technology out there, it’s great to see how its performance are getting better release after release. Python 3.10 is not even released yet and looks promising.

If your playbooks use some modules with dependency on large Python library, it may be interesting to give a try to the lastest Python versions.

And for those who are still running Python 2.7, I get a 49.2% the performance boost between 2.7 and 3.10.

How to start minishift on Fedora-33

update: minishift is bascially dead and won’t support OpenShift 4. You probably want to use crc instead.

I just spent to much time trying to start minishift with my user account. After 3h fighting with permission issues between libvirt and the ~/.minishift directory, I’ve finally decided to stay sane and use sudo… This is how I start minishift on my Fedora-33.

Install and start libvirt

sudo dnf install -y libvirt qemu-kvm
sudo systemctl start libvirtd

Fetch and install minishift and the kvm driver

sudo dnf install -y origin-clients
curl -L|sudo tar -C /usr/local/bin -xvz minishift-1.34.3-linux-amd64/minishift  --strip-com
sudo curl -L -o /usr/local/bin/docker-machine-driver-kvm
sudo chmod +x /usr/local/bin/docker-machine-driver-kvm

Start minishift (with sudo…)

sudo minishift

And configure the local user

sudo cp -r /root/.kube /home/goneri
sudo chown -R goneri:goneri /home/goneri/.kube