Docker crash course (1)

Understand the major concepts of Docker in 10 minutes

Johnny Lai
Analytics Vidhya


Photo: Neuschwanstein Castle, Germany

What is Docker?

Docker is a tool for managing containers. A container is an isolated environment that holds a piece of code or software together with its dependencies, so it runs the same way and produces the same result regardless of the host it runs on.

Can't a virtual machine achieve the same result?

Yes, a virtual machine can achieve a similar result, but there are big differences between the two:

  1. Resource consumption
    A virtual machine has its own OS; it is a computer inside a computer. Docker containers share the kernel of the host OS instead. As a result, a virtual machine requires more resources than a container.
  2. Infrastructure as code
    A Docker image is built from a Dockerfile, a file of human-readable instructions. The same instructions and inputs give the same image, so a development team can review, version and collaborate on it just like any other code. In contrast, different virtual machines are like different computers: it is not easy to compare them or to troubleshoot the differences.
  3. Shareable
    Docker has built-in support for Docker Hub, so you can easily upload your image to a repository and share it with your team or the public. Of course, you can also share just the Dockerfile, and the other side can build an equivalent image from it.
    However, if you want to share a virtual machine, you have to export it and copy the files over. Because the export includes the OS, the files are usually very large.
  4. Community support
    Docker Hub offers many base images, such as Java, MongoDB and Node.js. Many of them are official images, and we can build our own on top of them, which saves a lot of time and effort.

Some basic concepts and terms

Image
An image is the blueprint (template) for containers. It contains a piece of code or software and its dependencies. Once an image is built, it is read-only; if anything changes, it has to be rebuilt.

Layer based architecture
Images are built according to the Dockerfile, and each instruction in it adds a new layer on top of the previous ones. Layers are cached, but if any instruction is modified, that layer and all the layers after it cannot reuse the cache when the image is rebuilt.

Container
A container is a running instance of an image. It adds one extra layer on top of the image and, unlike the image layers, this layer allows both reads and writes. Whenever our code runs in the container, any changes it makes are stored in this layer.
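To make the image/container distinction concrete, here is a minimal shell sketch. It assumes Docker is installed and the daemon is running (otherwise it just prints a message). The "alpine" image and the container names "demo-a" and "demo-b" are chosen only for illustration and do not come from this article.

```shell
#!/bin/sh
# Sketch: two containers started from one image each get their own writable layer.
# Assumptions: "alpine" as a small base image; demo-a / demo-b are made-up names.
IMAGE=alpine
C1=demo-a
C2=demo-b

if docker info >/dev/null 2>&1; then
  # Start two long-running containers from the same read-only image.
  docker run -d --name "$C1" "$IMAGE" sleep 300
  docker run -d --name "$C2" "$IMAGE" sleep 300

  # Write a file inside the first container only.
  docker exec "$C1" sh -c 'echo hello > /tmp/note.txt'

  # The file is visible in the first container...
  docker exec "$C1" cat /tmp/note.txt
  # ...but the second container has its own layer, so /tmp there is unchanged.
  docker exec "$C2" ls /tmp

  # Show what the first container changed relative to the image.
  docker diff "$C1"

  # Clean up: removing the containers discards their writable layers.
  docker rm -f "$C1" "$C2"
else
  echo "Docker daemon not reachable; skipping the demo."
fi
```

The image itself is never modified here; only the per-container layer changes, which is why the note disappears once the container is removed.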

Docker storage

Normally, data produced by a running container is lost once we remove the container. So Docker introduced the "volume", which allows a container to read and write a folder on the host machine.

  1. Anonymous volume
    Storage space on the host machine that is managed by Docker and linked to a specific container. We do not normally browse it directly on the host. It is removed together with the container when the container was started with "--rm" or is removed with "docker rm -v"; otherwise it stays behind as an unused volume.
  2. Named volume
    Similar to an anonymous volume and also managed by Docker, but it is not removed when the linked container is removed, and it can be shared among different containers.
  3. Bind mount
    Binds any folder or file on the host machine to a path in the container. Because these are ordinary folders and files on the host, the data exists until we explicitly delete them on the host.
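The three storage options can be sketched with the "-v" flag of "docker run". This is an illustration under assumptions, not the only way to do it: the "alpine" image, the volume name "demo-data" and the temporary host folder are all made up for the example, and the script does nothing useful unless a Docker daemon is running.

```shell
#!/bin/sh
# Sketch of the three storage options using "docker run -v".
# Assumptions: "alpine" image, volume name "demo-data", and a temporary host folder.
IMAGE=alpine
VOLUME_NAME=demo-data
HOST_DIR=$(mktemp -d)

if docker info >/dev/null 2>&1; then
  # 1. Anonymous volume: only a container path is given, Docker picks the name.
  #    With --rm, the container and its anonymous volume are removed on exit.
  docker run --rm -v /data "$IMAGE" sh -c 'echo temp > /data/file.txt'

  # 2. Named volume: "name:container-path". The data outlives the container,
  #    so a second container can read what the first one wrote.
  docker run --rm -v "$VOLUME_NAME":/data "$IMAGE" sh -c 'echo kept > /data/file.txt'
  docker run --rm -v "$VOLUME_NAME":/data "$IMAGE" cat /data/file.txt

  # 3. Bind mount: "absolute-host-path:container-path". The file written in
  #    the container shows up in the host folder.
  docker run --rm -v "$HOST_DIR":/data "$IMAGE" sh -c 'echo shared > /data/file.txt'
  cat "$HOST_DIR/file.txt"

  # Named volumes stay until they are removed explicitly.
  docker volume rm "$VOLUME_NAME"
else
  echo "Docker daemon not reachable; skipping the demo."
fi
```

"docker volume ls" lists the volumes Docker manages, which is a handy way to check which ones are still around after the containers are gone.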

Docker networking

By default, Docker assigns an IP address to each container. You can find a container's IP address with the "docker inspect" command.

  1. Connect to the web
    Just provide the IP address or hostname and the port number, the same as you would on the host machine.
  2. Connect to the host
    Use the special domain name "host.docker.internal" and Docker will resolve it to the host for you. This works out of the box on Docker Desktop; on Linux it usually has to be mapped explicitly, for example with "--add-host=host.docker.internal:host-gateway".
  3. Connect to another container
    Of course we can use the IP address of each container, but in most situations we don't want to hard-code IP addresses.
    Instead, we can create a Docker network and attach our containers to it. Within that network, containers can communicate with each other by container name.
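The networking options above can be tried out with a few commands. This is a rough sketch: the "alpine" image, the network name "demo-net" and the container names "web" and "client" are assumptions made for the example, and the "host-gateway" mapping assumes a reasonably recent Docker version.

```shell
#!/bin/sh
# Sketch: look up a container IP, then let containers find each other by name.
# Assumptions: "alpine" image; "demo-net", "web" and "client" are made-up names.
IMAGE=alpine
NETWORK=demo-net

if docker info >/dev/null 2>&1; then
  # Create a user-defined network and start a container attached to it.
  docker network create "$NETWORK"
  docker run -d --name web --network "$NETWORK" "$IMAGE" sleep 300

  # Print the IP address Docker assigned to the container.
  docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' web

  # A second container on the same network reaches the first one by name.
  docker run --rm --name client --network "$NETWORK" "$IMAGE" ping -c 1 web

  # Map host.docker.internal to the host explicitly (usually needed on Linux)
  # and show the resulting entry in the container's hosts file.
  docker run --rm --add-host=host.docker.internal:host-gateway "$IMAGE" \
    grep host.docker.internal /etc/hosts

  # Clean up the container and the network.
  docker rm -f web
  docker network rm "$NETWORK"
else
  echo "Docker daemon not reachable; skipping the demo."
fi
```

Name-based lookup is the reason to prefer a Docker network here: the IP printed by "docker inspect" can change between runs, while the container name stays the same.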

Sample Dockerfile

FROM java:8
# Use the Java 8 base image. Docker looks for it locally first and, if it is not found, pulls it from Docker Hub. Note: the "java" image has been deprecated on Docker Hub, so a maintained Java 8 image may need to be substituted here.
COPY . /app/java
# Copy the files from the host folder to /app/java in the container file system. "." means the build context, usually the folder that contains this Dockerfile.
WORKDIR /app/java
# Change the working folder (in the container) for the following instructions.
RUN javac Hello.java
# Run the Java compiler. This happens while the image is being built, so the result is included in the image.
CMD ["java", "Hello"]
# Default command of the container. It can be overridden when we run the container.
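To see the sample in action, the sketch below writes a tiny Hello.java and a copy of the Dockerfile into a temporary folder, builds the image and runs it. Several things here are assumptions rather than part of the article: the image tag "hello-java", the contents of Hello.java, and the base image "eclipse-temurin:8-jdk", which is swapped in because the "java" image is deprecated on Docker Hub. The Docker steps are skipped if no daemon is reachable.

```shell
#!/bin/sh
# Sketch: build the sample image and run it.
# Assumptions: tag "hello-java"; eclipse-temurin:8-jdk swapped in for java:8;
# Hello.java is a made-up one-class program.
IMAGE_TAG=hello-java
BUILD_DIR=$(mktemp -d)

# A minimal Hello.java to compile inside the image.
cat > "$BUILD_DIR/Hello.java" <<'EOF'
public class Hello {
    public static void main(String[] args) {
        System.out.println("Hello from a container");
    }
}
EOF

# The sample Dockerfile, with only the base image changed.
cat > "$BUILD_DIR/Dockerfile" <<'EOF'
FROM eclipse-temurin:8-jdk
COPY . /app/java
WORKDIR /app/java
RUN javac Hello.java
CMD ["java", "Hello"]
EOF

if docker info >/dev/null 2>&1; then
  # Build an image from the Dockerfile in BUILD_DIR and give it a tag.
  docker build -t "$IMAGE_TAG" "$BUILD_DIR"

  # Run a container with the default CMD; --rm removes it afterwards.
  docker run --rm "$IMAGE_TAG"

  # Override the default CMD by appending a different command.
  docker run --rm "$IMAGE_TAG" java -version

  # Remove the image again.
  docker image rm "$IMAGE_TAG"
else
  echo "Docker daemon not reachable; skipping build and run."
fi
```

Anything placed after the image name in "docker run" replaces the CMD from the Dockerfile, which is what the last "docker run" line relies on.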

About the caching mechanism

As mentioned, each instruction in the Dockerfile becomes a new layer, and layers are cached to speed up rebuilding the image. However, if any instruction is edited, that instruction and all the instructions after it cannot use the cache during the rebuild.
For example, if "WORKDIR /app/java" is changed to "WORKDIR /exe/java",
then "WORKDIR /exe/java",
"RUN javac Hello.java",
"CMD ["java", "Hello"]"
will be rebuilt without the cache, which means the build takes longer.
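The cache behaviour described above can be observed with a throwaway Dockerfile. This is a small experiment under assumptions: the "alpine" base image, the tag "cache-demo" and the two RUN steps are invented for the example, and the exact wording of the build output depends on the Docker version, so watch for steps marked as cached rather than for specific text.

```shell
#!/bin/sh
# Sketch: watch the build cache being reused and then invalidated.
# Assumptions: "alpine" base image, tag "cache-demo", made-up RUN steps.
IMAGE_TAG=cache-demo
BUILD_DIR=$(mktemp -d)

cat > "$BUILD_DIR/Dockerfile" <<'EOF'
FROM alpine
RUN echo "step one" > /one.txt
RUN echo "step two" > /two.txt
CMD ["cat", "/one.txt", "/two.txt"]
EOF

if docker info >/dev/null 2>&1; then
  # First build: every instruction is executed.
  docker build -t "$IMAGE_TAG" "$BUILD_DIR"

  # Second build with nothing changed: the RUN steps come from the cache.
  docker build -t "$IMAGE_TAG" "$BUILD_DIR"

  # Edit the first RUN instruction only (via a temp file, to stay portable).
  sed 's/step one/step 1/' "$BUILD_DIR/Dockerfile" > "$BUILD_DIR/Dockerfile.new"
  mv "$BUILD_DIR/Dockerfile.new" "$BUILD_DIR/Dockerfile"

  # Third build: the edited RUN and the unchanged RUN after it both run again.
  docker build -t "$IMAGE_TAG" "$BUILD_DIR"

  # Remove the demo image.
  docker image rm "$IMAGE_TAG"
else
  echo "Docker daemon not reachable; skipping the cache demo."
fi
```

A practical consequence is to put instructions that change rarely near the top of the Dockerfile and the ones that change often near the bottom, so that an edit invalidates as few layers as possible.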

Docker command cheat sheet

https://www.docker.com/sites/default/files/d8/2019-09/docker-cheat-sheet.pdf
