Docker Basics: A Quick Look at Core Concepts
Why Use Docker?
To run multiple applications on a single server, one approach is to deploy multiple VMs and allocate one VM to each application. Docker instead packages and manages each application in lighter units called Containers.
What is a Container?
VM vs Container
A VM setup runs multiple Guest OSes on top of a single Host OS, with each application mapped to its own Guest OS:
[ Host OS - [ VM: Guest OS - Libs - App ] ]
A Container is a lighter unit than a VM, capable of directly running multiple applications on a single Host OS. Port forwarding or filesystem (directory) synchronization between the Host OS and a Container can be configured through image settings, which will be discussed later.
[ Host OS - [ Container: Libs - App ] ]
While VMs manage physical resources via a Hypervisor, Containers manage resources logically via Docker.
- VM: Hardware virtualization by Hypervisor
- Container: Host OS virtualization by Docker
Docker was reportedly first implemented on top of LXC (Linux Containers), which I had used during my undergraduate studies to run a multi-node Hadoop demo. Docker later transitioned to its own container technology.
Image and Container
When I first encountered Docker, there were two concepts I couldn’t clearly distinguish: ‘Image’ and ‘Container’. Since the Image concept is similar to that of a VM image, that comparison may make it easier to understand.
- Image is a static configuration that bundles the filesystem and all necessary settings for container execution.
- Container can be seen as a dynamic instance that is actually running (runtime) based on the Image.
Why Use Containers?
Application-Unit Management
By enabling packaging at the application unit level, we can separate roles and responsibilities (R&R) during development. When developing a web service, a single server instance often contains various roles, which can be separated into independent Containers.
- nginx: Provides static pages and serves as an SPA frontend.
- tomcat: API server to be provided to the frontend.
- logstash (Log Collection): Transmits logs generated by nginx and tomcat to a log storage server.
- prometheus (Metric Collection): Transmits error logs from nginx and tomcat, along with resource status like CPU and memory, to a status management server.
- pinpoint (Performance Measurement): Transmits API call counts and latency from tomcat to a performance management server.
Thus, as in the example above, a total of five Containers can operate on a single server instance.
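The five-container layout above could be sketched in a Docker Compose file (covered later in this post) roughly as follows. This is only a sketch: the tags and the pinpoint image name are hypothetical, and each service would need real configuration.

```yaml
# docker-compose.yml - hypothetical sketch of the five-container example
version: "3"
services:
  nginx:                       # static pages / SPA frontend
    image: nginx:1.8.0
    ports:
      - "80:80"
  tomcat:                      # API server for the frontend
    image: tomcat:8.0
  logstash:                    # log collection
    image: logstash:7.0
  prometheus:                  # metric collection
    image: prom/prometheus:latest
  pinpoint:                    # performance measurement (hypothetical image name)
    image: pinpoint-agent:latest
```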
If you want to add an API server that can be directly called from outside, in addition to the API server provided to the frontend, you can add another tomcat container, allowing you to use a total of two tomcats on a single server instance. You could also replace a Java-based tomcat with a Python-based Django. In this scenario, the nginx server providing the frontend remains unchanged, while only the API server is replaced.
Managing each application like Lego blocks also offers significant advantages in deployment. If you only want to update one container version, you simply pull the updated image for that container and redeploy it. This allows for separate version management for each container.
While VMs can also be managed like Lego blocks, Containers are preferred because their higher level of virtualization makes them lightweight (think of a Container as a lightweight VM), and, as explained above, version and deployment management are handled through images. This separation of ① image configuration and ② deployment simplifies automation. From a performance perspective, Containers also offer faster I/O and network processing between one another. VMs, with their lower level of virtualization, are said to provide stronger isolation, and thus better encapsulation in terms of security, than Containers, though I wonder how significant that difference is with current technology.
In this way, Docker configures the environment in which an application will run and the image to be run. The application’s own settings can be configured within the project, separate from Docker. This represents a separation of responsibilities.
Docker Terminology
- Registry = Image storage
- A central repository for storing images.
- Typically, when configuring a deployment pipeline, tomcat/nginx images generated by the Docker Engine from the latest source code are uploaded to the Registry, and then the final server deployment proceeds using those images.
- You can use the public Docker Hub server, or set up and run a company/personal registry server of your own,
- Or you can use Amazon ECR (Elastic Container Registry) provided by AWS.
- Image
- As explained earlier, it’s a static configuration that bundles the filesystem and necessary settings for container operation.
- An Image is a collection of RO (Read-Only) filesystems.
- For a more detailed look at the layered filesystem structure, refer to the links at the end of this post.
- Container
- A dynamic instance actually running (runtime) based on the Image mentioned above.
- Application / Service = Containers on One host
- For this, Docker Compose is used to manage Containers on a single host machine.
- Orchestration = Containers on Multiple hosts (Systems, MSA)
- For this, Docker Swarm is used to manage Containers across multiple host machines.
Docker Engine
It is the engine responsible for both ① image creation and ② container execution, and it is composed of:
- The Docker CLI, which receives user commands for creating images and containers.
- The Docker Daemon (dockerd), which actually builds the images and executes the containers; the CLI communicates with it over a REST API.
Image Creation
Since containers operate based on images, you must first create the desired image before running the desired container. The process from image creation to final container execution consists of three steps:
- Dockerfile - Writing a Dockerfile
You write settings (creation rules) for generating the desired image in a Dockerfile using various commands. Based on these settings, an image is created, and this created image will later be used to run a container. Below is a collection of simple commands:
FROM: Defines the base image. Specify the repository (and tag) of the image to pull.
ENV: Sets environment variables within the image. Think of export SET_VALUE=3 in a Linux terminal; the variable is also available inside the running container.
RUN: Specifies a Shell command to execute. This command is performed at the time of image build.
CMD: Specifies a Shell command to execute. This command is performed when the container is successfully run after image build completion.
EXPOSE: Documents the container port you intend to expose externally. Note that EXPOSE alone does not publish the port; the actual connection to a Host port happens at run time (e.g., via the -p option of docker run).
WORKDIR: Sets the directory in which subsequent RUN/CMD commands are executed. ENTRYPOINT: Specifies the executable the container runs; a CMD value, if present, is passed to it as arguments.
ADD, COPY: Commits a directory or file from the host to the image.
VOLUME: Declares a mount point in the container so a host directory or file can be attached at run time, without committing its contents to the image.
… Refer to the official Docker documentation for more commands and detailed explanations.
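Putting a few of these commands together, a minimal Dockerfile might look like the following sketch; the base image, file names, and port are hypothetical placeholders, not the examples discussed later in this post.

```dockerfile
# Base image (FROM): an Alpine Linux image pulled from the registry
FROM alpine:3.10
# Build-time command (RUN): executed while the image is being built
RUN apk --no-cache add curl
# Environment variable (ENV): available at build time and inside the container
ENV GREETING hello
# Working directory (WORKDIR) for the commands that follow
WORKDIR /app
# Copy a file from the host's build context into the image (COPY)
COPY entry.sh /app/entry.sh
# Document the port the container listens on (EXPOSE)
EXPOSE 8080
# Run-time command (CMD): executed when the container starts
CMD ["/app/entry.sh"]
```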
- Build (docker build) - Image Creation
When you execute the docker build command, the Dockerfile you’ve written is first sent to the Docker Daemon. Then, for each command within the Dockerfile script, a container is started for execution. If the command succeeds, an image is created from its snapshot. Observing the docker build execution log shown in the example below, you can see that Docker returns both the ID of the container where each Dockerfile command is executed and, upon completion, the image ID generated from the snapshot of the completed container.
Therefore, the snapshot taken after the last line of the Dockerfile has successfully executed becomes the final image we create. If a command fails during the build, you can launch a shell from the image produced by the last successful step (using the intermediate IDs returned in the log) to inspect the state, which makes docker build debugging possible.
- The build starts by submitting the Dockerfile to the Docker Daemon.
The Docker Daemon retrieves the base image that will serve as the foundation for the new image, as specified in the FROM command within the Dockerfile.
$ docker build .
Sending build context to Docker daemon 10240 bytes
Step 1/3 : FROM base-image:1.7.2
Pulling repository base-image:1.7.2
---> e9aa60c60128/1.000 MB (100%) endpoint: https://my-own.docker-registry.com/v1/
The base-image:1.7.2 image was pulled from the personal Docker Registry at https://my-own.docker-registry.com/v1/. The last line, e9aa60c60128, is the ID Docker assigned to the downloaded base image. The next command to be executed will create an intermediate image based on this image.
- The next command launches the previously created intermediate image as a container, executes the commands, and then returns a snapshot as a new image.
Step 2/3 : WORKDIR /instance
---> Running in 9c9e81692ae9
Removing intermediate container 9c9e81692ae9
---> b35f4035db3f
As a result of the previously executed FROM command, the intermediate image e9aa60c60128 was created. A new container 9c9e81692ae9 was launched using this image, and after executing the WORKDIR /instance command inside it, the completed container was removed, and its snapshot was returned as the b35f4035db3f image.
- The snapshot image generated after completing all steps in the Dockerfile becomes the final image we obtain.
Step 3/3 : CMD echo Hello world
---> Running in 02071fceb21b
Removing intermediate container 02071fceb21b
---> f52f38b7823e
Successfully built f52f38b7823e
If you want to give the final image f52f38b7823e a name instead of referring to it by ID, you can do so with the tag option. For example, since the new image was built from base-image:1.7.2, you might name it custom-image:1.7.2.
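For example, the -t option assigns the name at build time, and docker tag can name an already-built image by its ID (the image name here is the hypothetical one from the paragraph above):

```shell
# Name the image while building it
docker build -t custom-image:1.7.2 .

# Or tag an already-built image ID after the fact
docker tag f52f38b7823e custom-image:1.7.2
```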
- Push (docker push) - Finally, this command stores the created image in the Docker Registry.
Container Execution
The final Image created is used to run a Container on the Docker Daemon.
- Pull (docker pull) - Retrieves a stored image in order to run a container.
- Execute (docker run) - Runs a container using the retrieved image.
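A sketch of this pull-and-run flow, with hypothetical names and paths; note that the port mapping (-p) and the host directory binding (-v) discussed earlier are specified here, at run time:

```shell
# Fetch the image from the registry
docker pull custom-image:1.7.2

# Run it in the background: map host port 12345 to container port 8080,
# and bind a host log directory onto a VOLUME declared in the image
docker run -d \
  -p 12345:8080 \
  -v /var/log/myapp:/instance/logs/tomcat \
  custom-image:1.7.2
```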
Docker Image Configuration Examples
To provide a service for storing/retrieving product information, we aim to offer the service with an nginx frontend server (React.js) and a tomcat backend server (Java). We will run these two applications as separate containers, totaling two containers, on a single AWS EC2 server instance.
Dockerfile Example for nginx
First, let’s look at the nginx image configuration. Running nginx involves executing a shell script. Our goal is to inject a custom replace-hosts-and-run.sh script into the image and execute it with appropriate environment variables to finally launch the nginx server.
# 1. Get the base image. Pull the default nginx image for the frontend server.
FROM docker-hub.aaronryu.com/nginx:1.8.0
# 2. Install gettext, whose envsubst utility the shell script below can use to substitute environment variables into the nginx config.
RUN apk --no-cache add gettext
# 3. Add/copy the current project's files/, build/ directories, and shell script to the specified directory within the image.
ADD files/ /instance/program/nginx/conf
ADD build/ /instance/service/webroot/ui
ADD replace-hosts-and-run.sh /instance/program/nginx/replace-hosts-and-run.sh
# 4. Set the hostname environment variable to be used in the above shell script (replace-hosts-and-run.sh).
ENV NGINX_HOST aaronryu.frontend.com
# 5. Connect the following directory within the nginx container to a host directory for logging, etc.
# (Operations performed by the Container on the directory below will be reflected in the actual host directory.)
VOLUME ["/instance/logs/nginx"]
# 6. After 'image completion', execute (CMD) the shell script copied earlier, along with the environment variable.
CMD /instance/program/nginx/replace-hosts-and-run.sh
Dockerfile Example for tomcat
To query and save data from the nginx server’s SPA static pages, a corresponding API is required. We will launch a tomcat server to provide these APIs. Since it’s a Java server, we’ll add JVM settings and open port 12345 to allow external status checks of this server.
# 1. Get the base image. Pull the default tomcat image for the backend server.
FROM docker-hub.aaronryu.com/tomcat:8.0.0-jdk8
# 2. The tomcat implementation is with Spring Boot. We will provide the 'production' profile option during execution.
ENV SPRING_PROFILE production
# 3. Since tomcat is a Java-based server, add JVM memory options.
ENV JVM_MEMORY -Xms2g -Xmx2g -XX:PermSize=512m -XX:MaxPermSize=512m
# 4. Add/copy the setenv.sh stored in the current project directory to the tomcat execution shell file within the image.
ADD setenv.sh ${CATALINA_HOME}/bin/setenv.sh
# 5. After the current project build is complete, add/copy all generated .war files to tomcat's webapps directory.
COPY build/libs/*.war ${CATALINA_HOME}/webapps/ROOT.war
# 6. Document tomcat's server port 8080; the actual mapping to host port 12345 (for external status checks) is done at run time, e.g., docker run -p 12345:8080.
EXPOSE 8080
# 7. Connect the following directories within the tomcat container to host directories for logging, etc.
# (Operations performed by the Container on the directories below will be reflected in the actual host directories.)
VOLUME ["/instance/logs/tomcat", "/instance/logs/tomcat/catalina_log", "/instance/logs/tomcat/gc"]
Each Dockerfile examined in the examples above will be located within its respective nginx and tomcat projects. To launch these two containers simultaneously on a single instance, you would bundle and specify each container’s image using Docker Compose configuration (e.g., a .yml file).
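A minimal sketch of such a Compose file follows, assuming the two images above were pushed to the registry under the hypothetical names frontend-nginx and backend-tomcat; the host log paths are also assumptions.

```yaml
# docker-compose.yml - hypothetical sketch bundling the two containers
version: "3"
services:
  nginx:
    image: docker-hub.aaronryu.com/frontend-nginx:1.0.0   # hypothetical tag
    ports:
      - "80:80"                                           # serve the SPA frontend
    volumes:
      - /instance/logs/nginx:/instance/logs/nginx         # host log directory
  tomcat:
    image: docker-hub.aaronryu.com/backend-tomcat:1.0.0   # hypothetical tag
    ports:
      - "12345:8080"                                      # external status checks
    volumes:
      - /instance/logs/tomcat:/instance/logs/tomcat       # host log directory
```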
- https://medium.com/@darkrasid/docker%EC%99%80-vm-d95d60e56fdd
- https://docs.docker.com/storage/storagedriver/#images-and-layers
- https://rampart81.github.io/post/docker_image/
- https://www.quora.com/What-is-the-difference-between-the-Docker-Engine-and-Docker-Daemon
- https://www.joyfulbikeshedding.com/blog/2019-08-27-debugging-docker-builds.html