How I Isolated Practice Terminal
The primary reason for providing Practice Terminal in MeeTTY is to let newcomers to Linux practice the commands shown by the Presenter during a meeting without worrying about having Linux installed on their system. But as we all know, giving out a shell on the web is the stupidest way to ruin your server. So the shell we provide in Practice Terminal should be isolated as much as possible. In Linux, we have user namespaces 2 to isolate a process without root permission, cgroups 3 to control a few aspects of the isolated process, and fuse-overlayfs 4 to create an overlaid root filesystem without root permission.
There are various tools that use namespaces 5 to run containers, such as runc 6, which is used by docker 7 and podman 8; lxc 9, which is used by lxd 10; and systemd-nspawn 11, which is used by machinectl 12. These container tools focus heavily on providing a better way to run containers 1, but I needed a simple tool that works like the traditional chroot 13, without root permission, to run the Practice Terminal shell. The nearest tool I found was unshare 14. There are others, like bwrap 15 (used by flatpak 16) and firejail 17, which provide a simple way to run a process in an isolated environment using namespaces.
Before we chroot, we need to manually mount pseudo filesystems like proc 18, devtmpfs 19, and sysfs 20. With unshare, too, we have to do this manual mount step before running it. I didn't want to do this by hand; it should happen automatically, so I looked for a tool that takes care of it. The nearest one I found was arch-chroot 21, which performs the mount step automatically. Both bwrap and firejail can do the mount step as well, but we have to specify which pseudo filesystems to mount each time.
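To make the manual mount step concrete, here is a minimal sketch of the commands that would typically be run as root before a traditional chroot. The root path `/srv/sandbox` and the helper name `mount_commands` are hypothetical; the sketch only builds the argument lists and does not run them.

```python
from pathlib import Path

# Pseudo filesystems named in the text:
# (filesystem type, source label, mount point relative to the new root)
PSEUDO_FILESYSTEMS = [
    ("proc", "proc", "proc"),
    ("sysfs", "sys", "sys"),
    ("devtmpfs", "dev", "dev"),
]


def mount_commands(new_root):
    """Return one mount(8) argument list per pseudo filesystem."""
    root = Path(new_root)
    return [
        ["mount", "-t", fstype, source, str(root / target)]
        for fstype, source, target in PSEUDO_FILESYSTEMS
    ]


if __name__ == "__main__":
    # /srv/sandbox is a made-up location for the chroot directory.
    for argv in mount_commands("/srv/sandbox"):
        print(" ".join(argv))
```

Each list could be handed to `subprocess.run` by a privileged wrapper; this is the chore that arch-chroot automates.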
I would have used arch-chroot, but it requires root privilege, and I'm not willing to run anything with root privilege. I also need the sandboxed root filesystem to serve as a base for creating identical instances of the root filesystem for each Practice Terminal shell. This way, changes made through Practice Terminal will not affect the actual sandboxed root filesystem. That calls for the overlayfs 22 filesystem, but overlayfs needs root privilege; thankfully, fuse-overlayfs is there to rescue me. So I ended up writing my own guest 23 command-line tool, which uses fuse-overlayfs to implement my own isolation mechanism. I also wrote the limit 24 command-line tool to manage resources for Practice Terminal through cgroup.
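As an illustration of the base-plus-instance idea, the sketch below builds a fuse-overlayfs command line in which the sandboxed root is the read-only lower layer and each Practice Terminal instance gets its own upper, work and merged directories. The directory layout and the helper name `overlay_command` are assumptions made for this example, not the layout guest actually uses.

```python
from pathlib import Path


def overlay_command(base_root, instance_dir):
    """Build a fuse-overlayfs argument list for one instance.

    base_root    -- shared sandboxed root filesystem (lower layer, never written)
    instance_dir -- per-instance directory (hypothetical layout) that holds
                    the writable upper layer, the work dir and the mount point
    """
    instance = Path(instance_dir)
    upper = instance / "upper"
    work = instance / "work"
    merged = instance / "merged"
    options = f"lowerdir={base_root},upperdir={upper},workdir={work}"
    return ["fuse-overlayfs", "-o", options, str(merged)]


if __name__ == "__main__":
    # Both paths are made up for the example.
    print(" ".join(overlay_command("/srv/sandbox", "/tmp/instance1")))
```

Everything written inside the merged directory lands in the instance's upper layer, so discarding that directory resets the instance without touching the base.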
The guest commandline tool
The guest tool was heavily inspired by unshare because of its simplicity. It uses fuse-overlayfs to create a root filesystem, automatically mounts the proc, devtmpfs and sysfs pseudo filesystems, then isolates itself and creates the shell that appears in Practice Terminal. It also communicates with the limit process through the /run/meetty.sock unix socket 25 to initialize limits for the Practice Terminal shell.
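The handoff over the unix socket could look roughly like the sketch below. The wire format (the pid as a decimal number followed by a newline) is an assumption for illustration only, since the post does not describe the actual protocol, and the demo binds a socket in a temporary directory rather than /run/meetty.sock so it can run without root.

```python
import os
import socket
import tempfile
import threading


def send_pid(socket_path, pid):
    """Guest side: connect and send the pid as one decimal line (assumed format)."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(socket_path)
        client.sendall(f"{pid}\n".encode("ascii"))


def receive_pid(server):
    """Limit side: accept one connection and parse the pid it sends."""
    conn, _ = server.accept()
    with conn:
        data = b""
        while not data.endswith(b"\n"):
            chunk = conn.recv(64)
            if not chunk:  # peer closed the connection
                break
            data += chunk
    return int(data.strip())


def demo():
    """Run both sides in one process against a temporary socket path."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "meetty.sock")
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)
            server.listen(1)
            sender = threading.Thread(target=send_pid, args=(path, os.getpid()))
            sender.start()
            pid = receive_pid(server)
            sender.join()
    return pid


if __name__ == "__main__":
    print("limit side received pid", demo())
```

A real listener would not trust a pid sent by the client; on Linux it can ask the kernel for the peer's credentials with the `SO_PEERCRED` socket option instead.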
The limit commandline tool
The limit tool runs as root and listens on the /run/meetty.sock unix socket. When the guest tool starts, it provides its pid to the limit process. Once limit gets the pid, it creates a new network namespace 26, adds the pid to this namespace, creates a veth 27 pair, attaches one end of the pair to the new network namespace, and attaches the other end to a bridge 28 in the default network namespace. It also limits Practice Terminal to a maximum of 50 processes, 100 MB of memory, and 1024 open file descriptors. All these limits are enforced through cgroup and rlimit 29.
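The three limits can be sketched as follows, assuming the cgroup v2 interface files (`pids.max`, `memory.max`, `cgroup.procs`) and reading 100 MB as 100 MiB; the post does not say which cgroup version or unit limit actually uses. The function names and the cgroup directory are hypothetical, and the demo writes into a temporary directory instead of a real cgroup.

```python
import resource
import tempfile
from pathlib import Path

# Limits stated in the text. 100 MB is taken as 100 MiB here (an assumption).
MAX_PROCESSES = 50
MAX_MEMORY_BYTES = 100 * 1024 * 1024
MAX_OPEN_FILES = 1024


def apply_cgroup_limits(cgroup_dir, pid):
    """Write cgroup v2 style limit files, then move pid into the group.

    On a real system cgroup_dir would live under /sys/fs/cgroup and the
    pids and memory controllers would have to be enabled in its parent.
    """
    cgroup = Path(cgroup_dir)
    cgroup.mkdir(parents=True, exist_ok=True)
    (cgroup / "pids.max").write_text(f"{MAX_PROCESSES}\n")
    (cgroup / "memory.max").write_text(f"{MAX_MEMORY_BYTES}\n")
    (cgroup / "cgroup.procs").write_text(f"{pid}\n")


def apply_open_file_limit(pid):
    """Cap open file descriptors of another process via prlimit (Linux only)."""
    resource.prlimit(pid, resource.RLIMIT_NOFILE, (MAX_OPEN_FILES, MAX_OPEN_FILES))


if __name__ == "__main__":
    # Dry run against a scratch directory; 12345 is a made-up pid.
    with tempfile.TemporaryDirectory() as tmp:
        apply_cgroup_limits(tmp, 12345)
        for name in ("pids.max", "memory.max", "cgroup.procs"):
            print(name, (Path(tmp) / name).read_text().strip())
```

The process and memory caps belong to the cgroup and therefore cover the shell together with everything it spawns, while the file descriptor cap is a per-process rlimit that child processes inherit.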
Writing the guest and limit tools taught me a lot about Linux namespaces and cgroups.
To be continued..