Kubernetes will manage your application containers, so before containerizing your application or writing Pod/Deployment/Service configuration files, you should make application-level changes that maximize your app's portability and observability in Kubernetes.
Kubernetes can automatically deploy and restart failing application containers, so it's important to build in application logic that communicates with the orchestrator and allows it to automatically scale your application as needed.
Configuration data for your application container consists of variables and options such as database credentials and third-party service settings. For example, your application will typically have at least two environments (staging and production), each with its own database credentials. Hard-coding configuration values increases security risk, because these are sensitive pieces of information.
Container software like Docker and cluster software like Kubernetes are designed to help you extract configuration values and store them in a separate location, so your app can be deployed in any environment. For example, in Docker Compose you can pass multiple environment variables from an external file through to a service's containers with the env_file option, and in Kubernetes, Secret objects let you store and manage sensitive information. I wrote a blog post about how I manage my applications' sensitive information in Kubernetes.
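As a rough sketch of the Secret approach (the names app-db-credentials, DATABASE_USER, and the image are illustrative, not from a real project), you define the credentials once and reference them from the container:

```yaml
# Hypothetical Secret holding staging database credentials
apiVersion: v1
kind: Secret
metadata:
  name: app-db-credentials
type: Opaque
stringData:
  DATABASE_USER: app_staging
  DATABASE_PASSWORD: change-me
---
# The container reads the Secret as environment variables, so
# credentials never live in the image or the manifest itself
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: my-registry/app:1.0.0
      envFrom:
        - secretRef:
            name: app-db-credentials
```

The same manifest then works in staging and production; only the Secret contents change per environment.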
Your application in Kubernetes or any container orchestrator can be load balanced across multiple replicas and scaled up or down with minimal or no disruption of service for clients. To enable this horizontal scaling, applications must be designed to be stateless.
Application state should not be stored locally, so that if the running application crashes, is destroyed, or is restarted at any time, critical data is not lost.
In practice, we usually set up our applications on cloud providers such as AWS, GCP, or Azure, whose managed services help us build stateless applications. For example, a customer of mine had a ticketing system: I set up a Kubernetes cluster using AWS EKS, ran their application on Kubernetes, and stored their data in an RDS Postgres database and ElastiCache Redis.
But what if, for some reason, you need a stateful application? Kubernetes has built-in features for attaching persistent block storage volumes to containers and Pods. To ensure that a Pod can maintain state and access the same persistent volume after a restart, use the StatefulSet workload.
For example, back to the ticketing system: it had two environments (staging and production), and in the staging environment I used a StatefulSet for database storage. StatefulSets are well suited to deploying databases.
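A minimal StatefulSet sketch (the image, labels, and storage size are illustrative, not the actual staging configuration): each replica gets its own PersistentVolumeClaim, so the same volume is reattached after a restart.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:13-alpine
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  # One PersistentVolumeClaim is created per replica; the data
  # survives Pod restarts and rescheduling
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```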
How does Kubernetes know when to restart your application container? The answer is health checks.
By default, when your application runs in Kubernetes, Kubernetes sees the container as "healthy". If, for whatever reason, your application deadlocks, crashes, or stops performing work, Kubernetes will see the container as "not healthy" and automatically trigger an action, e.g. restarting the container.
There are two kinds of health checks in Kubernetes: the readiness probe and the liveness probe.
Basically, the readiness probe lets Kubernetes know when your application is ready to receive traffic, and the liveness probe lets Kubernetes know whether your application is healthy and running.
For example, I had an API backend application: the liveness probe checked whether my application was running, and the readiness probe checked whether the API was ready to accept connections, by making a simple GET request.
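A probe configuration for such an API might look like the sketch below (the endpoints /healthz and /ready, the port, and the timings are assumptions; your application has to actually serve those endpoints):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: api-backend
spec:
  containers:
    - name: api
      image: my-registry/api:1.0.0
      ports:
        - containerPort: 8080
      # If this fails repeatedly, Kubernetes restarts the container
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 15
      # Until this passes, the Pod receives no Service traffic
      readinessProbe:
        httpGet:
          path: /ready
          port: 8080
        periodSeconds: 5
```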
Log data lets you monitor and debug your application's performance. Building in features to publish performance metrics like response duration and error rates will help you monitor your application and alert you when it is unhealthy.
At the application level, you should plan how your application will log in a distributed, cluster-based environment. Ideally, remove hardcoded configuration references to local log files and log directories, and instead log directly to stdout and stderr. The output streams are then captured by the container enveloping your application, from which they can be forwarded to a logging layer like the EFK (Elasticsearch, Fluentd, and Kibana) stack.
For example, in a Ruby on Rails application, you can enable logging to stdout via an environment variable called RAILS_LOG_TO_STDOUT by setting it to true.
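The environment variable works because the Rails-generated production config switches the logger when it is set; the generated config/environments/production.rb contains an excerpt roughly like this (check the file your Rails version generated):

```ruby
# config/environments/production.rb (excerpt)
if ENV["RAILS_LOG_TO_STDOUT"].present?
  logger           = ActiveSupport::Logger.new(STDOUT)
  logger.formatter = config.log_formatter
  config.logger    = ActiveSupport::TaggedLogging.new(logger)
end
```

With RAILS_LOG_TO_STDOUT=true set in the container environment, logs go to stdout instead of log/production.log, so the container runtime can capture them.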
Before creating a Dockerfile, the first step is to know which dependencies your application needs to run correctly. Avoid the latest tag and unversioned packages as much as possible, because one day a bug or security issue may ship in a latest version that you can't control.
When deploying and pulling container images, large images can significantly slow things down and add to your bandwidth costs. By applying the multi-stage build pattern when building container images, you can reduce image sizes and speed up image builds.
For example, I had an application built with the Elixir Phoenix framework that needed to be compiled before running. I split the Dockerfile into multiple stages (one for build time and one for run time): the first stage uses the elixir:alpine image, sets up the needed dependencies, and compiles the application project; the second stage copies the compiled application into an alpine image before running it.
You should also use slim base images as much as possible and clean up unnecessary files and artifacts after installing software.
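A simplified sketch of such a multi-stage Dockerfile (the app name my_app, the version tags, and the release path are illustrative; a real Phoenix release also needs asset compilation and runtime configuration steps):

```dockerfile
# --- Stage 1: build ---
FROM elixir:1.14-alpine AS build
WORKDIR /app
RUN mix local.hex --force && mix local.rebar --force
COPY mix.exs mix.lock ./
RUN mix deps.get --only prod
COPY . .
RUN MIX_ENV=prod mix release

# --- Stage 2: runtime ---
FROM alpine:3.18
# Shared libraries the compiled BEAM release needs at runtime
RUN apk add --no-cache libstdc++ ncurses-libs openssl
WORKDIR /app
# Copy only the compiled release; compilers and build tools
# stay behind in the build stage, keeping the image small
COPY --from=build /app/_build/prod/rel/my_app ./
CMD ["bin/my_app", "start"]
```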
Once you’ve built your application images, to make them available to Kubernetes you should upload them to a container image registry. There are many registries, such as Docker Hub, AWS ECR, GCP Container Registry, GitHub Container Registry, and Quay.io.
Personally, I usually push container images to AWS ECR. In the next few days, I’ll write another post to show how I do it.
To manage builds and continuously publish containers containing your latest code changes to your image registry, you should use a build pipeline.
Right now, I’m using CircleCI with a concrete build pipeline. For example, when code is pushed to the develop branch, CircleCI triggers several processes: running the application unit tests and the Helm chart unit tests, and once these pass, building, tagging, and pushing the container image to AWS ECR.
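A sketch of that workflow in .circleci/config.yml (job names, commands, and the ECR_REPO variable are illustrative, and ECR authentication steps are omitted for brevity):

```yaml
version: 2.1
workflows:
  build-and-publish:
    jobs:
      - unit-test
      - helm-chart-test
      - build-and-push:
          requires:        # only runs after both test jobs pass
            - unit-test
            - helm-chart-test
          filters:
            branches:
              only: develop
jobs:
  unit-test:
    docker:
      - image: cimg/base:stable
    steps:
      - checkout
      - run: make test              # application unit tests
  helm-chart-test:
    docker:
      - image: cimg/base:stable
    steps:
      - checkout
      - run: helm unittest ./chart  # Helm chart unit tests
  build-and-push:
    docker:
      - image: cimg/base:stable
    steps:
      - checkout
      - setup_remote_docker
      - run: docker build -t "$ECR_REPO:$CIRCLE_SHA1" .
      - run: docker push "$ECR_REPO:$CIRCLE_SHA1"
```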
Alright, those are all the application-level changes you need to prepare before writing configuration files to run your application container on Kubernetes.
This post is based on a post from DigitalOcean; I read the DigitalOcean blog and added more examples from my real projects.
In the next few days, I will share the way I write configuration files for application containers.
See you soon!