Kafka can be used as the backbone for integrating microservices using event sourcing and CQRS. It’s a proven workhorse for pub/sub messaging, streaming data, and processing events in real time.
If you’re not familiar with CQRS, it is an architecture pattern that describes using one model for writing data while using another for reading data. One of the most common examples is having specific view models of data tailored to the screens of a system. I’ll let the renowned Martin Fowler explain the more intricate details: https://martinfowler.com/bliki/CQRS.html
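To make the idea concrete, here is a minimal Python sketch of CQRS, assuming a toy order system. The class names, fields, and sample data are all hypothetical and exist only for illustration: one model accepts writes, and a separate model holds a view shaped for a single screen.

```python
from dataclasses import dataclass, field


@dataclass
class OrderWriteModel:
    """Write side: accepts commands and stores the full order record."""
    orders: dict = field(default_factory=dict)

    def place_order(self, order_id: str, customer: str, items: list) -> None:
        # items is a list of (name, price) tuples
        self.orders[order_id] = {"customer": customer, "items": list(items)}


@dataclass
class OrderSummaryReadModel:
    """Read side: a denormalized view tailored to one screen."""
    summaries: dict = field(default_factory=dict)

    def project(self, order_id: str, order: dict) -> None:
        # Precompute exactly what the screen needs, nothing more.
        total = sum(price for _, price in order["items"])
        count = len(order["items"])
        self.summaries[order_id] = f"{order['customer']}: {count} item(s), total {total:.2f}"

    def summary_for(self, order_id: str) -> str:
        return self.summaries[order_id]


write_side = OrderWriteModel()
read_side = OrderSummaryReadModel()

write_side.place_order("o-1", "alice", [("keyboard", 49.99), ("mouse", 19.99)])
read_side.project("o-1", write_side.orders["o-1"])
print(read_side.summary_for("o-1"))
```

In a real system the projection step would typically be triggered by a message from the write side rather than called directly, which is where a broker such as Kafka comes in.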
Event Sourcing is another architecture pattern that describes storing the state of a system as a chronologically ordered series of messages rather than the traditional database that (in most circumstances) has no concept of the passing of time and cannot provide the past state of a system, only the present state.
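The sketch below illustrates the pattern in Python under simple assumptions: a hypothetical bank account whose state is never stored directly, only rebuilt by replaying an ordered event log. The `Event` type and the event kinds are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    sequence: int  # position in the log (chronological order)
    kind: str      # "deposited" or "withdrew"
    amount: int    # amount in cents


def balance_at(events, up_to_sequence=None):
    """Replay events in order to rebuild the balance.

    If up_to_sequence is given, stop there to recover a past state.
    """
    balance = 0
    for event in sorted(events, key=lambda e: e.sequence):
        if up_to_sequence is not None and event.sequence > up_to_sequence:
            break
        if event.kind == "deposited":
            balance += event.amount
        elif event.kind == "withdrew":
            balance -= event.amount
    return balance


log = [
    Event(1, "deposited", 10_000),
    Event(2, "withdrew", 2_500),
    Event(3, "deposited", 500),
]

print(balance_at(log))     # present state, after all three events
print(balance_at(log, 2))  # past state, as of event 2
```

Because the log is append-only, the past state is always recoverable, which is the property a traditional "current row only" table lacks.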
It is event sourcing that is of interest to me because it solves many of the most complicated problems systems-at-scale encounter: auditing, inter-process/inter-system communications, messaging and guaranteed delivery, maintaining an authoritative data source, syncing data across multiple distributed systems while keeping coupling low, and ensuring consistent, atomic transactions. For some uses, event sourcing may not be ideal because it guarantees only eventual consistency, not immediate consistency.
My Development Setup
I prefer to avoid developing directly on my physical machine, so I do all of my experimenting on VirtualBox VMs. To get started, I did the following:
- Create a VirtualBox VM with the following:
  - Ubuntu Server (headless)
  - 20GB disk space
  - At least 2 CPU cores
  - At least 2GB RAM
  - 2 Network Adapters
    - NAT
    - Host-only Adapter
- Install OpenSSH server and client
- Install Docker
I prefer to SSH into the box from another VBox VM that has Ubuntu Desktop running, but you could certainly SSH into it from your host machine or through VirtualBox.
And now the fun begins! Here’s a brief run-down of what will follow:
- Install Docker
- Docker Post-installation Steps
- Initialize your instance of Docker as a swarm manager
- Install Portainer
Install Docker
First of all, learn from my experience and save yourself some pain: do not install Docker from snap or from Ubuntu’s stock apt packages. Just don’t do it. In my experience, these packages install applications through a series of soft links, and Docker does not like that.
Instead, follow the instructions on Docker’s site: Get Docker Engine – Community for Ubuntu
Docker Post-installation Steps
Unless you install Docker as root and plan on using Docker as root, follow Docker’s post-installation steps: Post-installation steps for Linux.
I only needed to implement the first two sections of these post-installation steps: Manage Docker as a non-root user and Configure Docker to start on boot.
Here are a couple of notes to keep in mind when you’re working through those two sections:
- When you get to step 3 of the Manage Docker as a non-root user section, just restart your machine. I tried logging out and logging back in. I tried running the newgrp docker command. Nothing worked except a restart.
- When you get to the Configure Docker to start on boot section, if you’re using Ubuntu you’ll want to follow the
Initialize Docker as a Swarm Manager
The next step is to activate Docker Swarm. Swarm is Docker’s native container orchestration platform (similar to Kubernetes). We need to activate it to enable Docker’s stack feature that allows you to spin up a group of related containers using a single YAML configuration file. This works nearly identically to Docker Compose, but without having to install Docker Compose.
Here’s the command:
$ docker swarm init
Note that if you have more than one network interface attached to your machine (e.g. a bridge connection and a host-only connection), you need to add an option to the command above to specify which IP address you want your Docker engine to advertise for Swarm traffic. Of course, replace my dummy IP address with your own.
$ docker swarm init --advertise-addr 192.168.1.102
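If you script your setup, choosing the address can be automated. The following is only a sketch: the helper names, the 10.0.2.15 address, and the 192.168.1.0/24 network are assumptions made for illustration, and the code only builds the command rather than running it.

```python
import ipaddress


def pick_advertise_addr(addresses, preferred_network):
    """Return the first address that falls inside preferred_network, or None."""
    network = ipaddress.ip_network(preferred_network)
    for address in addresses:
        if ipaddress.ip_address(address) in network:
            return address
    return None


def swarm_init_command(advertise_addr=None):
    """Build the argument list for `docker swarm init`."""
    command = ["docker", "swarm", "init"]
    if advertise_addr is not None:
        command += ["--advertise-addr", advertise_addr]
    return command


# Hypothetical addresses: a NAT-style address plus the dummy address used above.
vm_addresses = ["10.0.2.15", "192.168.1.102"]
chosen = pick_advertise_addr(vm_addresses, "192.168.1.0/24")
print(" ".join(swarm_init_command(chosen)))
```

When no address matches, `swarm_init_command(None)` falls back to the plain `docker swarm init` form.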
Install Portainer
Portainer is a powerful, yet simple, management tool for Docker (including Docker Swarms and Stacks). It is installed as a Docker container and accessed via your web browser.
I chose to install Portainer because it allows me to monitor the status of my Kafka stack (that we’ll install in the next section of this post).
One thing to note about Portainer is its stack management only recognizes Docker Compose file format version 2. The Docker Compose file below is version 3, so we will use the Docker CLI to install it. Once installed, you can monitor it through Portainer.
Without further ado, let’s install Portainer. First, either SSH into your Ubuntu Server or access it directly through VirtualBox.
Now, run the following command:
$ docker run --name portainer --restart unless-stopped -d -p 8000:8000 -p 9000:9000 -v /var/run/docker.sock:/var/run/docker.sock -v portainer_data:/data portainer/portainer
This command does the following:
- Create a new container named “portainer” from the image downloaded from the “portainer/portainer” project at https://hub.docker.com
- If the container dies for some reason, restart it every time unless you explicitly stop it
- Run it as a background daemon
- Bind ports 8000 and 9000 to the host machine’s ports 8000 and 9000
- Map the host machine’s docker socket to Portainer’s so it can control the host machine’s Docker engine rather than the container’s
- Map the Docker volume “portainer_data” to the “/data” directory of the container so that Portainer’s configurations are stored on the host and can be reused by other Portainer containers if the current one is lost
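The same breakdown can be written as an annotated argument list. This Python sketch only assembles and prints the command from the flags listed above; it does not run Docker, and the variable name is arbitrary.

```python
import shlex

portainer_run = [
    "docker", "run",
    "--name", "portainer",           # name the new container "portainer"
    "--restart", "unless-stopped",   # restart it unless explicitly stopped
    "-d",                            # run in the background (detached)
    "-p", "8000:8000",               # host port 8000 -> container port 8000
    "-p", "9000:9000",               # host port 9000 -> container port 9000
    "-v", "/var/run/docker.sock:/var/run/docker.sock",  # share the host's Docker socket
    "-v", "portainer_data:/data",    # named volume for Portainer's configuration
    "portainer/portainer",           # image to run
]

# Join the arguments back into a single shell-safe command line.
print(shlex.join(portainer_run))
```

Keeping the arguments in a list like this makes it easy to pass them to `subprocess.run` later without worrying about shell quoting.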
This command should print the new container’s ID (a long hexadecimal string) to the screen.
To confirm that Portainer started successfully, enter the following command:
$ docker ps
This prints all the running containers to your screen (similar to Linux’s ps command). You’ll see output similar to this:
CONTAINER ID   IMAGE                 COMMAND        CREATED         STATUS        PORTS                                            NAMES
055e18b050e6   portainer/portainer   "/portainer"   3 seconds ago   Up 1 second   0.0.0.0:8000->8000/tcp, 0.0.0.0:9000->9000/tcp   portainer
The output above tells us that the portainer container is running and has made ports 8000 and 9000 publicly available to incoming connections on the Ubuntu Server hosting Docker. This means that you can access Portainer via port 9000 from another Docker container or from your physical host machine (in my case from Windows).
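If you would rather check this from a script than by eye, the sketch below is one way to do it. It assumes the Docker CLI's `--format` option with the `.Names` and `.Status` fields. The helper names are made up for this example, and `docker_ps()` is defined but deliberately not called, so the snippet runs without Docker.

```python
import subprocess

# Ask `docker ps` for just the name and status, separated by a tab.
PS_FORMAT = "{{.Names}}\t{{.Status}}"


def docker_ps() -> str:
    """Return the raw output of `docker ps` (requires the Docker CLI)."""
    result = subprocess.run(
        ["docker", "ps", "--format", PS_FORMAT],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_ps(output: str) -> dict:
    """Map container name -> status string."""
    statuses = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        name, _, status = line.partition("\t")
        statuses[name] = status
    return statuses


def is_running(statuses: dict, name: str) -> bool:
    """A running container's status starts with "Up"."""
    return statuses.get(name, "").startswith("Up")


# Sample line matching the output shown above.
sample = "portainer\tUp 1 second\n"
print(is_running(parse_ps(sample), "portainer"))
```

On the Docker host itself you would call `is_running(parse_ps(docker_ps()), "portainer")` instead of using the sample string.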
You need to find out what the Ubuntu Server’s IP address is to access Portainer. In your terminal, enter the command below:
$ ip addr
If, like me, you added both a NAT and a Host-only Adapter to your Ubuntu Server VirtualBox VM, you’ll have 2 possible IP addresses you can use: a 10.0.#.# address and a 192.168.#.# address. The screenshot below shows the relevant output from my terminal:
As you can see, their names will most likely begin with enp0s and be followed by an integer. You can locate your IP address(es) by looking for the word inet, which should be followed by an IP address (you can ignore the /24 CIDR suffix).
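The same lookup can be done programmatically. This is a rough sketch that scans `ip addr` text for inet lines. The sample output is abbreviated and illustrative (the interface names and the 10.0.2.15 address are assumptions), and it relies on the interface name being the last word of each inet line.

```python
def ipv4_addresses(ip_addr_output: str) -> dict:
    """Map interface name -> IPv4 address, skipping the loopback interface."""
    found = {}
    for line in ip_addr_output.splitlines():
        parts = line.split()
        # IPv4 lines look like: inet 192.168.56.15/24 brd ... scope global enp0s8
        if len(parts) >= 2 and parts[0] == "inet":
            address = parts[1].split("/")[0]  # drop the /24 CIDR suffix
            interface = parts[-1]
            if interface != "lo":
                found[interface] = address
    return found


# Abbreviated, illustrative sample of `ip addr` output.
sample = """\
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536
    inet 127.0.0.1/8 scope host lo
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic enp0s3
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
    inet 192.168.56.15/24 brd 192.168.56.255 scope global enp0s8
"""

for interface, address in ipv4_addresses(sample).items():
    print(interface, address)
```

Lines beginning with inet6 are ignored because only an exact first word of inet is matched.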
In my case, I can open a browser on my Windows physical host machine and access Portainer by keying in 192.168.56.15:9000. Docker connects to one IP or the other when it first starts. Additional IPs can be added later, if necessary. If the 192 IP doesn’t work for you, try the 10 IP.
Ultimately, you should see a screen like the following:
Ensure you choose “Local.” When you click “Connect,” you’ll be prompted for your admin credentials:
Finally, you’ll see the Portainer main screen where you can click “Containers” in the nav bar and you’ll see your Portainer instance listed.
Installing the Kafka Stack
To install the Kafka Stack, you can either manually install and configure each container required, or you can use my pre-built Docker Compose file. I suggest the latter, as it will save you a lot of time and heartache.
You may download the file here: docker-compose.yml
And here are the file contents:
version: '3'
services:
  zookeeper:
    image: 'bitnami/zookeeper'
    ports:
      - '2181:2181'
    volumes:
      - 'zookeeper_data:/bitnami'
    environment:
      - ALLOW_ANONYMOUS_LOGIN=yes
    networks:
      - kafka-net
    deploy:
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 120s
  zookeeper-navigator:
    image: 'elkozmon/zoonavigator'
    ports:
      - '9090:9090'
    environment:
      - HTTP_PORT=9090
    depends_on:
      - zookeeper
    networks:
      - kafka-net
    deploy:
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 120s
  kafka:
    image: 'bitnami/kafka'
    volumes:
      - 'kafka_data:/bitnami'
    ports:
      - '9092:9092'
      - '29092:29092'
    environment:
      - KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
      - ALLOW_PLAINTEXT_LISTENER=yes
      - KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,PLAINTEXT_HOST://:29092
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092
    depends_on:
      - zookeeper
    networks:
      - kafka-net
    deploy:
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 120s
  kafdrop:
    image: 'obsidiandynamics/kafdrop'
    ports:
      - '9094:9094'
    environment:
      - KAFKA_BROKERCONNECT=kafka:9092
      - SERVER_PORT=9094
    depends_on:
      - kafka
    networks:
      - kafka-net
    deploy:
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 120s
volumes:
  zookeeper_data:
    driver: local
  kafka_data:
    driver: local
networks:
  kafka-net:
    driver: overlay
    ipam:
      config:
        - subnet: 220.127.116.11/16
You’re welcome to read through that if you want. Here’s the TL;DR version of it:
- Create 4 containers (Zookeeper, Zookeeper Navigator, Kafka, and Kafdrop)
- Create 2 volumes on the Docker host for Zookeeper and Kafka
- Create an overlay network for all of these containers to communicate on
In case you’re wondering, Zookeeper Navigator and Kafdrop are web-based UI administration tools for Zookeeper and Kafka. I will let you do the research on how to use them.
Move the docker-compose.yml File onto Ubuntu Server
Your first task is to get the file provided above onto your Ubuntu Server VM. My preferred method is to use SCP.
If you use SCP, first cd into the directory where you downloaded the file above. Then you can use my command below as a starting point. You must replace the email@example.com placeholder with the username and IP address of your Ubuntu Server (in the form username@ip-address). The file will be copied into that user’s home directory on your Ubuntu Server.
$ scp docker-compose.yml email@example.com:~/docker-compose.yml
Create a new Swarm Stack Using docker-compose.yml
To use the file we copied over to your server, you need to SSH into that server or access it directly through VirtualBox. Next, navigate to the directory where you copied the docker-compose.yml file to (if you used my command above, it will be in your user’s home directory).
To deploy the Kafka stack, run the command below:
$ docker stack deploy -c docker-compose.yml kafka-stack
This command will deploy the 4 containers, network, and 2 volumes described in the docker-compose.yml file as a single Docker stack named kafka-stack.
This command may take several minutes to complete. It has to download the images for all 4 containers, create the network and volumes, and start each container in order.
Finally, to verify everything went swimmingly, log into your Portainer console, and choose “Stacks” from the nav bar. Select “kafka-stack.” Expand all the nodes in the “Services” table.
Do not be alarmed if you see several red entries that indicate a failure to start. This is normal. Eventually, you should see a single green entry for each of the services. If not, you may need to remove the stack and re-deploy it. I will leave that as an exercise for you.
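The same convergence check can be done from the command line instead of Portainer. The sketch below parses output in the shape produced by `docker stack services kafka-stack --format "{{.Name}} {{.Replicas}}"`. The sample text is illustrative rather than captured from a real run, and the helper names are hypothetical.

```python
def parse_replicas(output: str) -> dict:
    """Map service name -> (running, desired) from 'name running/desired' lines."""
    status = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        name, replicas = parts[0], parts[1]
        running, _, desired = replicas.partition("/")
        status[name] = (int(running), int(desired))
    return status


def not_ready(status: dict) -> list:
    """Names of services that have fewer running replicas than desired."""
    return sorted(name for name, (running, desired) in status.items() if running < desired)


# Illustrative sample: Kafka is still starting, everything else is up.
sample = """\
kafka-stack_zookeeper 1/1
kafka-stack_zookeeper-navigator 1/1
kafka-stack_kafka 0/1
kafka-stack_kafdrop 1/1
"""

print(not_ready(parse_replicas(sample)))
```

An empty list from `not_ready` corresponds to every service showing a green entry in Portainer.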
At this point, you should have a Kafka stack running complete with Apache Zookeeper and web UI tools for managing both Zookeeper and Kafka.
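As a quick sanity check, you can confirm that each published port accepts TCP connections. This sketch uses only Python's standard library. The port numbers come from the compose file above, while the dictionary keys and function names are just labels chosen for this example. It only shows that something is listening, not that Kafka itself is healthy.

```python
import socket

# Ports published by the compose file above.
STACK_PORTS = {
    "zookeeper": 2181,
    "zookeeper-navigator": 9090,
    "kafka": 9092,
    "kafdrop": 9094,
}


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_stack(host: str) -> dict:
    """Map each service label to whether its published port is reachable."""
    return {name: port_open(host, port) for name, port in STACK_PORTS.items()}
```

For example, `check_stack("192.168.56.15")` would probe my VM's host-only address; substitute your own server's IP.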
In my next post, I will talk about using some simple Python libraries to interact with Kafka.
Clinton is a full-time Software Developer currently working for CGI Federal, Inc. He spends most of his days building Java web applications using tools like Spring MVC, Java Server Faces, and VueJS. In his free time, he likes to dabble in Golang, Hadoop, and other cool technologies. Clinton has been married to his wonderful wife Ashley for 8 years. Together, they have a super-handsome, unbelievably cute (no bias here folks) 6-year-old son, Andrew.