How to master environment variables

If you have been using Linux for a while, you might have encountered the term “environment variables” a few times. You might even have run the command export FOO=bar occasionally. But what are environment variables really, and how can you master them?

In this post I will go through how you can manipulate environment variables, both permanently and temporarily. Lastly, I will round off with some tips on how to properly use environment variables in Ansible.

Check your environment

So what is your environment? You can inspect it by running env on the command line and searching it with a simple grep:

$ env
COLORTERM=truecolor
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
DESKTOP_SESSION=gnome
DISPLAY=:1
GDMSESSION=gnome
GDM_LANG=en_US.UTF-8
GJS_DEBUG_OUTPUT=stderr
GJS_DEBUG_TOPICS=JS ERROR;JS LOG

... snip ...

$ env | grep -i path
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
OMF_PATH=/home/ephracis/.local/share/omf
PATH=/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin
WINDOWPATH=2

So where are all these variables coming from, and how can we change them or add more, both permanently and temporarily?

Know your session

Before we can talk about how the environment is created and populated, we need to understand how sessions work. Different kinds of sessions read different files to populate their environment.

Login shells

Login shells are created when you SSH to the server or log in at the physical terminal. These are easy to spot since you need to actually log in (hence the name) to the server in order to create the session. You can also identify these sessions by the small dash in front of the shell name when you run ps -f:

$ ps -f
UID        PID  PPID  C STIME TTY          TIME CMD
ephracis 23382 23375  0 10:59 pts/0    00:00:00 -fish
ephracis 23957 23382  0 11:06 pts/0    00:00:00 ps -f
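
If you are unsure, Bash can also tell you directly whether the current shell is a login shell (a minimal check using Bash's login_shell shell option):

# exits with status 0 only if bash was started as a login shell
$ shopt -q login_shell && echo login || echo non-login
login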

Interactive shells

Interactive shells are the ones that read your input. This covers most sessions that you, the human, are working with. For example, every tab in your graphical terminal app is an interactive shell. Note that this means that the session created when you log in to your server, either over SSH or from the physical terminal, is both an interactive and a login shell.

Non-interactive shells

Apart from interactive shells (of which login shells are a subset) we have non-interactive shells. These are the ones created by various scripts and tools that do not attach anything to stdin and thus cannot provide interactive input to the session.
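
In Bash you can check for this with the special $- variable, which contains an i for interactive shells (a minimal sketch to put in a script or rc file):

# $- holds the current shell flags; "i" means interactive
case $- in
  *i*) echo "interactive" ;;
  *)   echo "non-interactive" ;;
esac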

Know your environment files

Now that we know about the different types of sessions that can be created, we can start to talk about how the environment of these sessions is populated with variables. Most systems use Bash, since that is the default shell on virtually all distributions, but you might have changed this to some other shell like Zsh or Fish, especially on your workstation where you spend most of your time. Which shell you use determines which files are used to populate the environment.

Bash

Bash will look for the following files:

  • /etc/profile
    Run for login shells.
  • ~/.bash_profile
    Run for login shells.
  • /etc/bashrc
    Run for interactive, non-login shells.
  • ~/.bashrc
    Run for interactive, non-login shells.

That part about non-login is important, and it is the reason many users and distributions configure bash_profile to read bashrc, so that it is applied in all sessions, like so:

[[ -r ~/.bashrc ]] && . ~/.bashrc

Zsh

Zsh will look for a few more files than Bash does:

  • /etc/zshenv
    Run for every zsh shell.
  • ~/.zshenv
    Run for every zsh shell.
  • /etc/zprofile
    Run for login shells.
  • ~/.zprofile
    Run for login shells.
  • /etc/zshrc
    Run for interactive shells.
  • ~/.zshrc
    Run for interactive shells.
  • /etc/zlogin
    Run for login shells.
  • ~/.zlogin
    Run for login shells.

Fish

Fish will read the following files on start up:

  • /etc/fish/config.fish
    Run for every fish shell.
  • /etc/fish/conf.d/*.fish
    Run for every fish shell.
  • ~/.config/fish/config.fish
    Run for every fish shell.
  • ~/.config/fish/conf.d/*.fish
    Run for every fish shell.

As you can see, Fish does not distinguish between login shells and interactive shells when it reads its startup files. If you need to run something only in login or interactive shells, you can use if status --is-login or if status --is-interactive inside your scripts, as sketched below.
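
A minimal sketch of what that could look like in ~/.config/fish/config.fish (the PATH entry and the alias are just placeholders):

if status --is-login
    # runs only in login shells, e.g. to extend PATH
    set -gx PATH $PATH ~/bin
end

if status --is-interactive
    # runs only in interactive shells, e.g. to define an alias
    alias ll 'ls -l'
end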

Manipulate the environment

So that’s a bit complicated, but hopefully things are clearer now. The next step is to start manipulating the environment. First of all, you can obviously edit those files and wait until the next session is created, or load the newly edited file into your current session using either source /path/to/file or the shorthand . /path/to/file. That is the way to make permanent changes to your environment.
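
For example, to make a change permanent in Bash you could append an export to your rc file and load it into the current session (a minimal sketch; FOO is just a placeholder):

$ echo 'export FOO=bar' >> ~/.bashrc
$ source ~/.bashrc
$ echo $FOO
bar

But sometimes you only want to change things temporarily.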

To apply variables to a single command, you prepend them to the command like so:

# for bash or zsh
$ FOO=one BAR=two my_cool_command ...

# for fish
$ env FOO=one BAR=two my_cool_command ...

This makes the variables available to the command, and they go away as soon as the command finishes.

If you want to keep the variable and have it available to all future commands in your session, you run the assignment as a standalone command like so:

# for bash or zsh
$ FOO=one

# for fish
$ set FOO one

# then use it later in your session
$ echo $FOO
one

As you can see, the variable is available to the echo command run later in the session. The variable will not be available to other sessions, and will disappear when the current session ends.
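
It is also worth noting that a plain assignment does not reach child processes; a quick way to see the difference (assuming Bash; FOO is just a placeholder):

$ FOO=one
$ echo $FOO
one
$ bash -c 'echo "child sees: $FOO"'
child sees: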

Finally, you can export the variable to make it available to subprocesses that are spawned from the session:

# for bash or zsh
[parent] $ export FOO=one

# for fish
[parent] $ set --export FOO one

# then spawn a subsession and access the variable
[parent] $ bash
[child] $ echo $FOO
one

What about Ansible

If you are using Ansible to orchestrate your servers, you might ask yourself what kind of session that is, and which files you should change to manipulate the environment Ansible uses on the target servers. While you could go down that road, a much simpler approach is to use the environment keyword in Ansible:

- name: Manipulating environment in Ansible
  hosts: my_hosts

  # play level environment
  environment:
    FOO: one
    BAR: two

  tasks:

    # task level environment
    - name: My task
      environment:
        FOO: uno
      some_module: ...

This can be combined with variables, as in environment: "{{ my_environment }}", allowing you to use group vars or host vars to adapt the environment for different servers and scenarios.
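
As a sketch of that pattern (the file names and variable contents are just examples):

# group_vars/my_hosts.yml
my_environment:
  FOO: one
  BAR: two

# playbook.yml
- name: Use a per-group environment
  hosts: my_hosts
  environment: "{{ my_environment }}"

  tasks:
    - name: My task
      some_module: ...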

Conclusion

The environment in Linux is a complex beast, but armed with the knowledge above you should be able to tame it and harness its powers for your own benefit. The environment is populated from different files depending on the kind of session and shell used. You can set a variable temporarily for a one-shot command, or for the remaining duration of the session. To make subshells inherit a variable, use the export keyword/flag.

Lastly, if you are using Ansible you should really look into the environment keyword before you start to experiment with the different profile and rc-files on the target system.

First impressions of moving from Docker to Podman

It’s been on the horizon for a while but when I decided to port some stuff over to RHEL 8 I was more or less forced to remove my dependency on Docker and use something else instead.

When it comes to the beef between Red Hat and Docker, I have been on the side of Red Hat, for both technical and philosophical reasons. Docker is a big fat daemon, which I really do not need just to pull a container image from a URL or to build a container image and save it to disk. Add to that the fact that Docker is very closed-minded about accepting code changes, and that they once considered merely verifying the existence of a checksum to be proper image validation during pull.

But even though I knew I wanted to move away from Docker at some point, I also knew it would come with a bunch of work that I would much rather spend on adding features and fixing bugs.

Anyway, now I am porting stuff to RHEL 8 and this means I need to add support for Podman. So here I will lay out some of my experiences moving from Docker to Podman.

Background

So, just to give you a little context on what I do: I develop and package IT systems. The system is installed and configured using Ansible, and most services are packaged as containers. While we try to use container images from vendors, we sometimes have to resort to creating our own containers. So the main focus here is on adapting our Ansible roles so they start and configure the containers using podman instead of docker.

Here are the services that I decided to port:

  • AWX (upstream for Ansible Tower)
  • Foreman (upstream for Red Hat Satellite)
  • Sonatype Nexus
  • HAProxy
  • NodePKI

Installation

This was the easy part. Podman has to be installed on the target and to do this I just added the following:

package: name=podman state=present

Ansible modules

One of the biggest issues is that there are no Podman equivalents to the Ansible modules docker_network and docker_container. There is a podman_image module, though, and podman_container was just merged into Ansible core. However, I cannot wait for Ansible 2.9 and need a solution today. We use these modules extensively to manage our containers with Ansible, and having to resort to the command or shell modules really feels like a step back.

Luckily I actually found a way to make the transition much easier, using systemd services.

Cheating with Systemd services

So before I started the port to podman, I decided to adjust all my roles to set up the docker containers so they are managed by systemd. This is quite simple:

Create a sysconfig file:

# {{ ansible_managed }}
C_VOLUMES="{% for x in container_volumes %}--volume {{ x }} {% endfor %}"
C_ENV="{% for k,v in container_env.items() %}--env {{ k }}='{{ v }}' {% endfor %}"
C_PORTS="{% for x in container_ports %}--publish {{ x }} {% endfor %}"
C_IMAGE="{{ container_image }}"
C_COMMAND="{{ container_cmd }}"
C_ARGS="{{ container_args }}"

Create a service unit file:

[Unit]
Description=My container
Wants=syslog.service

[Service]
Restart=always
EnvironmentFile=-/etc/sysconfig/my_service
ExecStartPre=-{{ container_mgr }} stop {{ container_name }}
ExecStartPre=-{{ container_mgr }} rm {{ container_name }}
ExecStart={{ container_mgr }} run --rm --name "{{ container_name }}" \
  $C_VOLUMES $C_ENV $C_PORTS $C_ARGS $C_IMAGE $C_COMMAND
ExecStop={{ container_mgr }} stop -t 10 {{ container_name }}

[Install]
WantedBy=multi-user.target

Start the service:

- service: name=my_service state=started

Thanks to the fact that podman is CLI-compatible with the Docker client, moving to podman is now as easy as setting container_mgr to /usr/bin/podman instead of /usr/bin/docker.
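
In other words, the only thing that has to change is a single variable (using the container_mgr name from the templates above):

# e.g. in group_vars or role defaults
container_mgr: /usr/bin/podman  # previously /usr/bin/docker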

Creating networks

Unfortunately, Podman has no podman network create to create a private network where I can put a set of containers. This is really a shame. Docker networks make it easy to create a private namespace for containers to communicate: they let me expose ports only to other containers (keeping them unexposed on the host) and provide name resolution so containers can find each other easily.

One alternative that was suggested to me on the Podman mailing list was to use a pod. But containers in a pod share localhost, which means I run the risk of port collisions if two containers use the same port. It also adds complexity, since I need to create and start/stop a new entity (the pod), which I never got working under systemd (systemd just killed the pod directly after starting it).

I also cannot use the built-in CNI network, or create additional ones, since they do not provide name resolution and I have no way of knowing the IP of a given container.

My only solution here was to skip networks altogether and use host networking (see the sketch after the list). It comes with some issues:

  • I still have the risk of port collision between containers
  • All ports are published and accessible from outside the host (unless blocked by a firewall)
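
In practice this just means passing host networking to the run command in the unit file above; a minimal sketch (the names are hypothetical):

$ podman run --rm --name my_service --network host my_image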

Working on Mac

Another big thing missing from Podman is a client for macOS. While I use RHEL on all the servers (and Fedora at home), my workstation is a MacBook, which means I cannot use Podman to build containers locally or to troubleshoot podman commands locally. Luckily, I have a really streamlined development environment that makes it a breeze to quickly bring up a virtual machine running CentOS where I can play around. I do miss the ability to build containers on my Mac using Podman, but since Docker and Podman are both OCI-compatible, I can build container images using Docker on my laptop and then manage and run them on RHEL using Podman without problems.

InSpec

My InSpec tests use some docker resources, but I decided to use the service resource instead to verify that the systemd services are running properly, and of course I have a bunch of tests that access the actual software that runs inside the containers.

Summary

So after moving to systemd services it was really easy to port from Docker to Podman. My wishlist for Podman would be the following:

  • Podman modules for Ansible to replace the Docker modules
  • Ability to manage CNI networks using podman network ...
  • Name resolution inside Podman networks
  • Support for macOS

Luckily none of these were showstoppers for me and after figuring it all out it took about a day to convert five Ansible roles from Docker to Podman without loss of end user functionality.

Ansible 2.8 has a bunch of cool new stuff

So Ansible 2.8.0 was just released and it comes with a few really nice new features. I haven’t had time to use it much, since I just upgraded like 10 minutes ago, but reading through the Release Notes I found some really cool new things that I know I’ll enjoy in 2.8.

Automatic detection of Python path

This is a really nice feature. It used to be that Ansible always looked for /usr/bin/python on the target system, and if you wanted to use anything else you needed to adjust ansible_python_interpreter. No more! Now Ansible does a much smarter lookup: it will not only look for Python in several locations before giving up, it will also adapt to the system it is executing on. For example, on Ubuntu we always had to explicitly tell Ansible to use /usr/bin/python3, since there is no /usr/bin/python by default. Now Ansible knows this out of the box.
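
If you want to steer that lookup yourself, 2.8 also adds an interpreter_python configuration option; a minimal sketch (check the porting guide for the exact accepted values and defaults):

# ansible.cfg
[defaults]
interpreter_python = auto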

Better SSH on macOS

Ansible moved away from the Paramiko library in favor of OpenSSH a long time ago, except when executed on macOS. With 2.8, those of us using a MacBook will finally get some of those sweet performance improvements that OpenSSH has over Paramiko, which will mean a lot since the biggest downside to Ansible is its slow execution.

Accessing undefined variables is fine

So when you had a large structure with nested objects and wanted to access a value, giving it a default if it, or any parent, was undefined, you needed to do this:

{{ ((foo | default({})).bar | default({})).baz | default('DEFAULT') }}

or

{{ foo.bar.baz if (foo is defined and foo.bar is defined and foo.bar.baz is defined) else 'DEFAULT' }}

Ansible 2.8 will no longer throw an error if you try to access an attribute of an undefined variable, but will instead just give you undefined back. So now you can just do this:

{{ foo.bar.baz | default('DEFAULT') }}

A lot more elegant!

Tons of new modules

Of course, as with any new release of Ansible, there is also a long list of new modules. The ones that I am currently most interested in are the Foreman modules. Ansible comes with just a single module for Foreman / Satellite, but I have been using foreman-ansible-modules for a while now, and 2.8 deprecates the old foreman plugin in favor of this collection. Hopefully they will soon be incorporated into Ansible core so I don’t have to fetch them from GitHub and put them inside my role.

There are also a ton of fact-gathering modules for Docker, such as docker_volume_info, docker_network_info, docker_container_info and docker_host_info, that will be great when checking and manipulating Docker objects. Although, with RHEL 8 we will hopefully be moving away from Docker, so these may come a little too late to the party, to be honest.
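
Still, as a sketch of how one of these fact-gathering modules is used (the container name is hypothetical):

- docker_container_info:
    name: my_container
  register: result

- debug:
    msg: "Container exists: {{ result.exists }}"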

There’s a bunch of new KubeVirt modules which may be really cool once we move over to OpenShift 4 and run some virtual machines in it.

Other noteworthy modules are:

  • OpenSSL fact gathering for certificates, keys and CSRs
  • A whole bunch of VMware modules
  • A few Ansible Tower modules
  • A bunch of Windows modules