Twitter joins the rest of the world – moves to Kubernetes

Twitter Inc / The Linux Foundation

Zhang Lei, Senior Technical Expert at Alibaba, reports that David McLaughlin, Product and Technical Head of Twitter Computing Platform, has announced that Twitter is switching from Apache Mesos to Kubernetes.

You would be forgiven for thinking that Twitter was already using Kubernetes to manage all its services, given that it’s used by Netflix, Google and Facebook among many others. But Twitter has actually been using Apache Mesos, a competitor to Kubernetes.

The biggest difference between Mesos and Kubernetes is that Mesos is much more ambitious and complex, which makes it harder to get started with than Kubernetes. This isn’t likely to affect Twitter, as they already have a ton of experience with Mesos (they have been heavily involved in developing it). The bigger problem, which also affects Twitter, is the difference in size between the open source communities around the two projects. The Mesos community is much smaller than the Kubernetes one, meaning there are fewer developers working on it, fewer companies using it and sharing their experiences, fewer experts ready to answer questions, and so on. This is a big deal and probably the main reason why Twitter is making the switch.

Hopefully this means that Twitter will bring some of their expertise and skills to the Kubernetes community and help develop the project even further.

If you’re using netstat you’re doing it wrong – an ss tutorial for oldies

Become a modern master with some serious ss skills

If you are still using netstat you are doing it wrong. Netstat was replaced by ss many moons ago and it’s long overdue to throw out the old and learn how to get the same results in a whole new way. Because we all love to learn stuff just for the fun of it, right?

But seriously, ss is way better than netstat because it talks to the kernel directly via Netlink and can thus give you much more info than the old netstat ever could. So to help old folks like me transition from netstat to ss I’ll give you a translation table to port you over. But first, in case there are some newcomers who aren’t encumbered with old baggage, I’ll quickly describe a few common tasks you can do with ss.

Check open ports that someone is listening on

One of my most common use cases is to see if my process is up and running and listening for connections, or, if there’s something listening on a port, to find out who it is. To do this use the flags --listening to get sockets in the LISTEN state, --tcp to limit the output to TCP, --processes to show the process that is listening, and --numeric to skip the name lookups, since I never remember that sunrpc means port 111:

$ ss --listening --tcp --numeric --processes
State     Recv-Q  Send-Q  Local Address:Port    Peer Address:Port                                                                                    
LISTEN    0       128     0.0.0.0:111           0.0.0.0:*                                                                                       
LISTEN    0       128     127.0.0.1:27060       0.0.0.0:*        users:(("steam",pid=29811,fd=45))                                              
LISTEN    0       10      0.0.0.0:57621         0.0.0.0:*        users:(("spotify",pid=11223,fd=106))                                           
LISTEN    0       32      192.168.122.1:53      0.0.0.0:*                                                                                       
LISTEN    0       128     0.0.0.0:22            0.0.0.0:*                                                                                       
LISTEN    0       5       127.0.0.1:631         0.0.0.0:*                                                                                       
LISTEN    0       128     0.0.0.0:17500         0.0.0.0:*        users:(("dropbox",pid=13706,fd=98))                                            
LISTEN    0       128     0.0.0.0:27036         0.0.0.0:*        users:(("steam",pid=29811,fd=82))                                              
LISTEN    0       128     127.0.0.1:57343       0.0.0.0:*        users:(("steam",pid=29811,fd=39))
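
If you only care about which ports are open, a little awk on top of that output does the trick. This is a minimal sketch and not part of ss itself: listening_ports is a made-up helper name, and it assumes the column layout shown above, where the local address is the fourth column.

```shell
# listening_ports: read `ss --listening --tcp --numeric` output on stdin and
# print every distinct listening port, one per line, in numeric order.
# Assumes the local address is the 4th column, as in the output above.
listening_ports() {
    awk 'NR > 1 && $1 == "LISTEN" {
        n = split($4, parts, ":")   # the port is whatever follows the last colon
        print parts[n]
    }' | sort -n -u
}

# Usage:
#   ss --listening --tcp --numeric | listening_ports
```

Splitting on the last colon means IPv6 listeners such as [::]:22 are handled the same way as IPv4 ones.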

Check active connections

Checking just active sessions is easy: just type ss. If you want to filter and show only TCP connections use the --tcp flag like so:

$ ss --tcp
State        Recv-Q   Send-Q   Local Address:Port     Peer Address:Port     
ESTAB        0        0        192.168.1.102:57044    162.125.18.133:https    
ESTAB        0        0        192.168.1.102:34008    104.16.3.35:https    
CLOSE-WAIT   32       0        192.168.1.102:52008    162.125.70.7:https

The same goes for UDP and the --udp flag.
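
When the list gets long it can help to see how many connections sit in each state. ss can filter on a state itself (for example ss --tcp state established), but for a quick tally the sketch below pipes the output through awk. count_states is a made-up helper name, and it assumes the state is the first column, as in the output above.

```shell
# count_states: read `ss --tcp` output on stdin and print how many
# connections are in each state, sorted by state name.
count_states() {
    awk 'NR > 1 { count[$1]++ }
         END    { for (state in count) print count[state], state }' | sort -k 2
}

# Usage:
#   ss --tcp --numeric | count_states
```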

Get a summary

Instead of listing individual sessions you can also get a nice summary of all sessions by using the --summary flag:

$ ss --summary
Total: 1625
TCP:   77 (estab 40, closed 12, orphaned 0, timewait 6)

Transport Total     IP        IPv6
RAW       0         0         0        
UDP       33        29        4        
TCP       65        59        6        
INET      98        88        10       
FRAG      0         0         0

Translation table going from netstat to ss

Lastly, as promised here is a nice table to help you transition. Believe me, it’s quite easy to remember.

netstat                   Replacement
netstat -a                ss
netstat -au               ss -u
netstat -ap | grep ssh    ss -p | grep ssh
netstat -l                ss -l
netstat -lpn              ss -lpn
netstat -r                ip route
netstat -g                ip maddr

Performance analysis between RHEL 7.6 and RHEL 8.0

Apart from all the new cool features in the freshly released Red Hat Enterprise Linux 8 one thing that is just as important is the improvements in performance. The team over at Red Hat has performed a bunch of benchmark tests on both RHEL 7.6 and RHEL 8.0 and the results show some really nice improvements.

Overall the performance looks good. The chart below shows around 5% improvement in CPU, 20% less memory usage, 15% increased disk I/O, and around 20-30% improved network performance.

Figure: a candlestick chart which combines multiple tests (Photo: Red Hat)

Looking at more specific metrics we see a 40% increase in disk throughput on the XFS file system as shown in the chart below.

Figure: RHEL 7.6 vs RHEL 8 AIM7 shared throughput - XFS (Photo: Red Hat)

If you are running OpenStack the network control plane will also see a large improvement when moving to RHEL 8. Read the full article at redhat.com for more details.

Quick overview of the new features in Kubernetes 1.15

Kubernetes 1.15 has been released and it comes with a lot of new stuff that will improve the way you deploy and manage services on the platform. The biggest highlights are quota for custom resources and improved monitoring.

Quota for custom resources

We have had quota for native resources for a while now but this new release allows us to create quotas for custom resources as well. This means that we can control Operators running on Kubernetes using quotas. For example you could create a quota saying each developer gets to deploy 2 Elasticsearch clusters and 10 PostgreSQL clusters.
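
As a rough sketch of what that could look like: object-count quotas use the count/<resource>.<group> syntax, and a ResourceQuota applies to a namespace, so this assumes each developer gets their own namespace. The quota name, the namespace in the usage line and the two custom resource names (elasticsearches.example.com and postgresqls.example.com) are made up for illustration; use the plural names and API groups of the CRDs your Operators actually install.

```shell
# quota_manifest: print a ResourceQuota for the namespace given as $1.
# The count/... keys are hypothetical custom resources, not real CRD names.
quota_manifest() {
    cat <<EOF
apiVersion: v1
kind: ResourceQuota
metadata:
  name: operator-quota
  namespace: $1
spec:
  hard:
    count/elasticsearches.example.com: "2"
    count/postgresqls.example.com: "10"
EOF
}

# Usage (needs kubectl access to a 1.15+ cluster):
#   quota_manifest dev-alice | kubectl apply -f -
```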

Improved monitoring

Whether you run a production cluster, or a lab where you test stuff out, it is important to have proper monitoring so you can detect issues before they become problems. Kubernetes 1.15 comes with support for third-party vendors to supply device metrics without having to modify the code of Kubernetes. This means that your cluster can use hardware-specific metrics, such as GPU metrics, without needing explicit support in Kubernetes for that specific device.

The metrics for storage have also improved, with support for monitoring volumes from custom storage providers.

Lastly, the monitoring performance has improved since only the core metrics are collected by kubelet.

How to get it

Most users consume Kubernetes as part of a distribution such as OpenShift. They will have to wait until that distribution upgrades to Kubernetes 1.15. The latest version of OpenShift, version 4.1, comes with Kubernetes 1.13 and I would expect Kubernetes 1.15 to be available in OpenShift 4.3 which should arrive in the beginning of 2020.

Principles of container-based application design

“Principles of software design:

  • Keep it simple, stupid (KISS)
  • Don’t repeat yourself (DRY)
  • You aren’t gonna need it (YAGNI)
  • Separation of concerns (SoC)

Red Hat approach to cloud-native containers:

  • Single concern principle (SCP)
  • High observability principle (HOP)
  • Life-cycle conformance principle (LCP)
  • Image immutability principle (IIP)
  • Process disposability principle (PDP)
  • Self-containment principle (S-CP)
  • Runtime confinement principle (RCP)”

After the move to Infrastructure-as-Code and containerization it is only natural that we start applying some of the lessons we learned in software development to building our infrastructure.

Read more at redhat.com.

What’s new in Linux kernel 5.2

A new version of the Linux kernel has just been released. Here’s a short summary of the new stuff that might be interesting for end users.

  • Logitech
    • Support for MX5500
    • Support for S510 
    • Support for Unifying receiver
    • Viewing battery status
  • Realtek
    • Support for RTL8822BE
    • Support for RTL8822CE
  • SoC
    • Support for Nvidia Jetson Nano
    • Support for Orange Pi RK3399
    • Support for Orange Pi 3
  • Nvidia
    • Nouveau supports GeForce GTX 1650
  • Intel
    • Support for Intel Comet Lake
    • Support for Intel Icelake graphics
    • Support for Intel Elkhart Lake graphics
    • Hibernation in Cherrytrail and Baytrail
    • Support for Thunderbolt on older Apple hardware
  • AMD
    • Improved support for Ryzen
    • Improved support for Radeon X1000
    • Support for upcoming EPYC
  • ARM
    • Spectre mitigation
    • Support for ARM Mali
  • Other
    • Improved support for DisplayPort over USB-C

First impressions of moving from Docker to Podman

It’s been on the horizon for a while but when I decided to port some stuff over to RHEL 8 I was more or less forced to remove my dependency on Docker and use something else instead.

When it comes to the beef between Red Hat and Docker I’ve been on the side of Red Hat, both for technical and philosophical reasons. Docker is a big fat daemon which I really don’t need to pull a container file from a URL, or to build a container image and save it to disk. Add to that the fact that Docker is very closed-minded about accepting code changes and that they once treated just verifying the existence of a checksum as proper image validation during pull.

But even though I knew I wanted to move away from Docker at some point, I also knew it would come with a bunch of work, taking time that I’d much rather spend adding features and fixing bugs.

Anyway, now I am porting stuff to RHEL 8 and this means I need to add support for Podman. So here I will lay out some of my experiences moving from Docker to Podman.

Background

So just to give you a little context on what I do: I develop and package IT systems. The system is installed and configured using Ansible and most services are packaged as containers. While we try to use container images from vendors, we sometimes have to resort to creating our own containers. So the main focus here is on adapting our Ansible roles so they start and configure the containers using podman instead of docker.

Here are the services that I decided to port:

  • AWX (upstream for Ansible Tower)
  • Foreman (upstream for Red Hat Satellite)
  • Sonatype Nexus
  • HAProxy
  • NodePKI

Installation

This was the easy part. Podman has to be installed on the target and to do this I just added the following:

package: name=podman state=present

Ansible modules

One of the biggest issues is that there are no Podman equivalents to the Ansible modules docker_network and docker_container. There is a podman_image module though, and podman_container was just merged into Ansible core. However, I cannot wait for Ansible 2.9 and need a solution today. We use these modules extensively to manage our containers with Ansible, and having to resort to the command or shell modules really feels like a step back.

Luckily I actually found a way to make the transition much easier, using systemd services.

Cheating with Systemd services

So before I started the port to podman I decided to adjust all my roles to set up the docker containers so they are managed by systemd. This is quite simple:

Create a sysconfig file:

# {{ ansible_managed }}
C_VOLUMES="{% for x in container_volumes %}--volume {{ x }} {% endfor %}"
C_ENV="{% for k,v in container_env.items() %}--env {{ k }}='{{ v }}' {% endfor %}"
C_PORTS="{% for x in container_ports %}--publish {{ x }} {% endfor %}"
C_IMAGE="{{ container_image }}"
C_COMMAND="{{ container_cmd }}"
C_ARGS="{{ container_args }}"

Create a service unit file:

[Unit]
Description=My container
Wants=syslog.service

[Service]
Restart=always
EnvironmentFile=-/etc/sysconfig/my_service
ExecStartPre=-{{ container_mgr }} stop {{ container_name }}
ExecStartPre=-{{ container_mgr }} rm {{ container_name }}
ExecStart={{ container_mgr }} run --rm --name "{{ container_name }}" \
  $C_VOLUMES $C_ENV $C_PORTS $C_ARGS $C_IMAGE $C_COMMAND
ExecStop={{ container_mgr }} stop -t 10 {{ container_name }}

[Install]
WantedBy=multi-user.target

Start the service:

- service: name=my_service state=started

Thanks to the fact that podman is CLI-compatible with the Docker client, moving to podman is now as easy as setting container_mgr to /usr/bin/podman instead of /usr/bin/docker.

Creating networks

Unfortunately Podman has no podman network create command to create a private network where I can put a set of containers. This is really a shame. Docker networks make it easy to create a private namespace for containers to communicate. They allow me to expose ports only to other containers (keeping them unexposed to the host) and they provide name resolution so containers can find each other easily.

One alternative that was suggested to me on the Podman mailing list was to use a pod. But containers in pods share localhost which means that I run the risk of port collision if two containers use the same port. This also adds more complexity as I need to create and start/stop a new entity (the pod) which I never got working using systemd (systemd just killed the pod directly after starting it).

I also cannot use the built-in CNI network, or create additional ones, since they don’t provide name resolution and I have no way of knowing the IP for a given container.

My only solution here was to skip networks altogether and use host networking. It comes with some issues:

  • I still have the risk of port collision between containers
  • All ports are published and accessible from outside the host (unless blocked by a firewall)
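
The port collision risk can at least be checked before a container is started. The sketch below asks ss whether anything already listens on a TCP port. port_in_use is a hypothetical helper, not something Podman provides, the port number in the usage lines is just an example, and the helper assumes an ss new enough to know --no-header.

```shell
# port_in_use: succeed (exit 0) when at least one TCP socket is already
# listening on the port given as $1, fail otherwise.
port_in_use() {
    [ -n "$(ss --listening --tcp --numeric --no-header "sport = :$1")" ]
}

# Usage: refuse to start a host-networked container on a taken port.
#   if port_in_use 8081; then
#       echo "port 8081 is already taken" >&2
#   fi
```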

Working on Mac

Another big thing missing from Podman is a client for macOS. While I use RHEL on all the servers (and Fedora at home) my workstation is a MacBook, which means I cannot use Podman when I build containers locally, or troubleshoot podman commands locally. Luckily, I have a really streamlined development environment that makes it a breeze to quickly bring up a virtual machine running CentOS where I can play around. I do miss the ability to build containers on my Mac using Podman, but since Docker and Podman are both OCI compatible I can build container images using Docker on my laptop and then manage and run them on RHEL using Podman without problems.
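
Moving an image from the laptop to a server does not strictly need a registry either. A minimal sketch, assuming SSH access to the server and that both tools are on the PATH: docker save writes the image as a tar stream to stdout and podman load reads one from stdin. ship_image and the image and host names in the usage line are made up.

```shell
# ship_image: stream the local Docker image $1 over SSH to host $2,
# where Podman loads it into its own image storage.
ship_image() {
    docker save "$1" | ssh "$2" podman load
}

# Usage:
#   ship_image myapp:1.0 rhel8-server
```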

InSpec

My InSpec tests used some docker resources, but I decided to use the service resource instead to verify that the systemd services are running properly, and of course I have a bunch of tests that access the actual software that runs inside the containers.

Summary

So after moving to systemd services it was really easy to port from Docker to Podman. My wishlist for Podman would be the following:

  • Podman modules for Ansible to replace the Docker modules
  • Ability to manage CNI networks using podman network ...
  • Name resolution inside Podman networks
  • Support for macOS

Luckily none of these were showstoppers for me and after figuring it all out it took about a day to convert five Ansible roles from Docker to Podman without loss of end user functionality.