Monday, June 12, 2017

Writing your own container in Go

This artice is about creating a tool which is like Docker, but much more simplified.

First of all, why do we need containers?

I like to think about containers as separate little planets who live in a universe. There is a planet for database management, other one is for an operating system, etc.. They don't care about each other, however you can create communication between them via e.g. radio waves. This could be  the interface between them.
In my world, what is good about these planets is that you can copy the exact same environment which it had, but who is going to live on these planets is up to that specific instance. (E.g. you can achieve that with quantum entanglements - this is getting a bit too fantasy-like haha.)
Let's get back to our containers. With these you can assure that if it works on your computer, it is going to work on any of them in the world - in theory. Saying that you can easily automate your processes for deployment, make development easier and more productive.
So - long story short - containers are really important for modern development environments.

Ok. How can I create one?

It's not that complex! Really!
First of all, I would like to shout out to Liz Rice, whose talk I saw at Craft Conference in Budapest few months ago. (similar video - twitter)
First, you have to understand two core features: control groups (cgroups) and namespaces.
So what are namespaces?
Well, in it's fundamental, namespace is what a user can see of it's environment: process id-s, file system, users, networking, hostname, etc.. And it's all yours, noone else's!
Ok now, let's see what cgroups are!
If we are sticking to the analogy before, cgroups are what you can use, like CPU, memory, disk I/O, etc.. So basically speaking about resources.
And now let's jump into coding!

As you can see it's not that big of a source code. Only 56 lines!
The basic idea is that we are going to run a system call inside a system call.  At the main function it is going to jump into the run function since first we passed the "run" parameter. (go run main.go run)
Basically what os.exec does is wrapping external commands so they can run in their own namespace. Uhm, excuse me? I have heard about this word before.
Well, no surprise, with this line of code we are going to achieve our own namespace, and we are going to have our own process id-s. Great!
You can see here that we are passing the new argument "child" which means it is going to call the child function.
The next interesting part is the 33rd line where we are passing a bunch of flags for the newly started process, these are:

  • CLONE_NEWUTS is for namespace
  • CLONE_NEWPID is for new process id-s
  • CLONE_NEWNS - this means unshare the mount namespace, so that the calling process has a private copy of its namespace which is not shared with any other process.
Neat! So now we now how we want to run our child process which has it's own namespace and cgroup, but it's still not working in it's own "world". We have to give it a filesystem which behaves like an ubuntu filesystem would work.
So that's where one another trick comes in: we have to have a linux root filesystem which will be the playground of our container. You can learn about the restrictions for the root filesystem here: http://www.tldp.org/LDP/sag/html/root-fs.html .
With this in our toolbelt we can change the root of our child execution to that particular root filesystem directory (46th line), and with the 48th line we are mounting /proc because it's a special kind of directory, and THAT'S IT, if you run go main.go run /bin/bash you will have your own little planet which is completely separate (well, now you know, it's not that separate, but it's acting like it is ;) ).
Enjoy and feel free to play around if you like these kinds of little aspects of modern software development / containerization!