Tag: python

  • Ansible: Performance Impact of the Python version

    Until recently, I was not really paying attention to the version of Python I was using with Ansible, as long as it was Python 3. The default version was always good enough for Ansible.

    Over the last few weeks, I spent the majority of my time working on the performance of the community.kubernetes collection. The modules of this collection depend on a large library (the OpenShift SDK), and Python needs to reload it before every task execution. The goal was to benefit from what is already in place with vmware.vmware_rest (see my AnsibleFest presentation).

    While working on this, I realized that my metrics were not consistent: I was not able to reproduce some test cases that I ran two months ago. After a quick investigation, it turned out that the Python version matters much more than I expected.

    To compare the different Python versions, I decided to run some tests.

    The target host is a t2.medium instance (2 vCPUs, 4 GiB) running on AWS. The operating system is Fedora 33, which is really handy for this because it ships all the Python versions from 3.6 to 3.10!

    I use the latest stable version of Ansible (2.10.3), which I install with pip in a Python virtual environment. The list of the dependencies present in the virtualenvs.
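
    The per-version setup could be scripted roughly as follows. This is only a sketch: the `venv-pyX.Y` directory names and the exact package list (`ansible` pinned to 2.10.3 plus `openshift`) are my assumptions for illustration, not the actual dependency list mentioned above.

```python
import subprocess

# Python versions shipped by Fedora 33, as listed in the post.
PYTHON_VERSIONS = ["3.6", "3.7", "3.8", "3.9", "3.10"]


def venv_commands(version, ansible_version="2.10.3"):
    """Return the commands that create one virtualenv for a Python version.

    The directory name and the package list are hypothetical; adjust them
    to match the real environment.
    """
    venv_dir = f"venv-py{version}"  # hypothetical directory name
    return [
        [f"python{version}", "-m", "venv", venv_dir],
        [f"{venv_dir}/bin/pip", "install",
         f"ansible=={ansible_version}", "openshift"],
    ]


def build_all(versions=PYTHON_VERSIONS):
    """Create every virtualenv; requires the interpreters to be installed."""
    for version in versions:
        for command in venv_commands(version):
            subprocess.run(command, check=True)
```

    Calling `build_all()` creates one isolated environment per interpreter, so each test run only differs by the Python version.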

    Finally, I deploy Kubernetes on Podman with Kubernetes Kind.

    For the first test, I use a Python one-liner to evaluate the time Python takes to load the OpenShift SDK. This is one of the operations that I want to optimize for my work and so it matters a lot to me.

    https://gist.github.com/goneri/c4f8ec63d0c51f7e6236173b2c60db66

    Here the loading is done 100 times in a row.
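
    For readers who cannot open the gist, a measurement in the same spirit might look like the sketch below. It is not the content of the gist: the function name `time_cold_imports` is hypothetical, and I assume that each load happens in a fresh interpreter so that nothing is cached between runs.

```python
import subprocess
import sys
import time


def time_cold_imports(module, runs=100, python=sys.executable):
    """Return the seconds needed to import `module` in `runs` fresh interpreters.

    A new process is started for every run, so the module is loaded from
    scratch each time, as happens before each Ansible task.
    """
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run([python, "-c", f"import {module}"], check=True)
    return time.perf_counter() - start


# Example (requires the openshift package in the chosen interpreter):
#   print(time_cold_imports("openshift", python="venv-py3.9/bin/python"))
```

    The measured time includes the interpreter startup itself, which is also part of what every Ansible task pays.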

    The result shows a steady improvement of the performance since Python 3.6.

                 Python 3.6   Python 3.7   Python 3.8   Python 3.9   Python 3.10
    time (sec)   48.401       45.088       41.751       40.924       40.385

    With this test, loading the SDK is about 16.5% faster with Python 3.10 than with Python 3.6.
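
    The percentage can be checked with a couple of lines. I assume here that "faster" means the relative reduction in wall-clock time compared to the Python 3.6 run; the helper name is hypothetical.

```python
def percent_time_saved(baseline_seconds, new_seconds):
    """Relative reduction in run time, as a percentage of the baseline."""
    return (baseline_seconds - new_seconds) / baseline_seconds * 100


# Timings from the first test: Python 3.6 versus Python 3.10.
saved = percent_time_saved(48.401, 40.385)
print(f"{saved:.2f}% less time")  # prints "16.56% less time"
```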

    The next test does the same thing, but this time through Ansible. My test uses the following playbook:

    https://gist.github.com/goneri/ad252e30d48cfea99aaeb2e18736303e

    It runs the k8s_info module 100 times in a row. In addition, I also use an ansible.cfg with the following content. This way, ansible-playbook returns a nice output of the task execution duration:

    https://gist.github.com/goneri/e364c17d6344fd4cd11c1ed2e0ba12ce

                 Python 3.6   Python 3.7   Python 3.8   Python 3.9   Python 3.10
    time (sec)   85.5         80.5         75.35        75.05        71.19

    It’s a 16.76% boost between Python 3.6 and Python 3.10. I was not expecting such a tight correlation between the two tests.

    While Python is obviously not the fastest technology out there, it’s great to see how its performance is getting better release after release. Python 3.10 is not even released yet, and it already looks promising.

    If your playbooks use modules that depend on a large Python library, it may be worth giving the latest Python versions a try.

    And for those who are still running Python 2.7, I get a 49.2% performance boost between 2.7 and 3.10.