How I Isolated Practice Terminal

Updated: Aug 22, 2022

Practice Terminal

The primary reason for providing the Practice Terminal in MeeTTY is to let newcomers to Linux practice the commands shown by the Presenter during a meeting without having to install Linux on their own system. But as we all know, giving out a shell on the web is the stupidest way to ruin your server. So the shell we provide in the Practice Terminal should be isolated as much as possible. In Linux, we have user namespaces [2] to isolate a process without root permission, cgroups [3] to control a few aspects of the isolated process, and fuse-overlayfs [4] to create an overlaid root filesystem without root permission.

User Namespace

There are various tools that use namespaces [5] to run containers: runc [6], which is used by docker [7] and podman [8]; lxc [9], which is used by lxd [10]; systemd-nspawn [11], which is used by machinectl [12]; and so on. These container tools focus heavily on providing a better way to run containers [1], but I needed a simple tool that works like the traditional chroot [13], without root permission, to run the Practice Terminal shell. The nearest tool I found was unshare [14]. There are others, like bwrap [15] (used by flatpak [16]) and firejail [17], which provide a simple way to run a process in an isolated environment using namespaces.
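To make the unshare approach concrete, here is a minimal sketch that only builds the command line for running a shell in new user, mount and PID namespaces. The helper name build_unshare_argv is hypothetical, and this is not how the guest tool itself works; it just shows the util-linux unshare options that give a rootless, chroot-like starting point.

```python
import shlex


def build_unshare_argv(command):
    """Return an argv list that runs `command` in new user, mount and PID namespaces."""
    return [
        "unshare",
        "--user",           # new user namespace, no root permission needed
        "--map-root-user",  # map the calling user to uid 0 inside the namespace
        "--mount",          # private mount namespace
        "--pid",            # new PID namespace
        "--fork",           # fork so the child becomes PID 1 in the new PID namespace
        "--mount-proc",     # mount a fresh /proc that matches the new PID namespace
        *command,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try anywhere.
    print(shlex.join(build_unshare_argv(["/bin/sh"])))
```

The resulting list can be passed to subprocess.run on a system where unprivileged user namespaces are enabled.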

Before doing a chroot, we need to manually mount pseudo filesystems like proc [18], devtmpfs [19], sysfs [20], etc. The same manual mount step is needed before running unshare. I didn’t want to do this by hand; it should happen automatically, so I looked for a tool that does it. The nearest tool I found was arch-chroot [21], which performs the mount step automatically. Both bwrap and firejail can do the mount step too, but we have to specify which pseudo filesystems to mount each time.
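The manual mount step mentioned above can be sketched as a small table of pseudo filesystems and the mount commands derived from it. The names PSEUDO_FILESYSTEMS and mount_commands are made up for illustration, and the sketch only builds the commands rather than running them, since actually mounting needs the right privileges or namespace.

```python
import os

# (source, filesystem type, mount point relative to the new root)
PSEUDO_FILESYSTEMS = [
    ("proc", "proc", "proc"),
    ("sys", "sysfs", "sys"),
    ("dev", "devtmpfs", "dev"),
]


def mount_commands(new_root):
    """Build one `mount -t TYPE SOURCE TARGET` command per pseudo filesystem."""
    commands = []
    for source, fstype, relative_target in PSEUDO_FILESYSTEMS:
        target = os.path.join(new_root, relative_target)
        commands.append(["mount", "-t", fstype, source, target])
    return commands


if __name__ == "__main__":
    for command in mount_commands("/srv/rootfs"):  # example path, not from the article
        print(" ".join(command))
```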

I would have used arch-chroot, but it requires root privilege, and I’m not willing to run anything with root privilege. I also need the sandboxed root filesystem to serve as a base for creating identical instances of the root filesystem for each Practice Terminal shell, so that changes made through the Practice Terminal do not affect the actual sandboxed root filesystem. This calls for overlayfs [22], but overlayfs needs root privilege; thankfully, fuse-overlayfs is there to rescue me. So I ended up writing my own guest [23] commandline tool, which uses fuse-overlayfs to implement my own isolation mechanism. I also wrote the limit [24] commandline tool to manage resources for the Practice Terminal through cgroups.
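The "shared base, private changes" idea can be illustrated with a minimal sketch of one overlay per session. It assumes a layout with upper, work and merged directories under a per-session directory; that layout and the helper names are my own illustration, not taken from the guest tool. The fuse-overlayfs invocation uses the usual lowerdir/upperdir/workdir mount options.

```python
import os


def session_dirs(session_root):
    """Create and return the per-session directories the overlay needs."""
    dirs = {
        name: os.path.join(session_root, name)
        for name in ("upper", "work", "merged")
    }
    for path in dirs.values():
        os.makedirs(path, exist_ok=True)
    return dirs


def fuse_overlayfs_argv(base_rootfs, dirs):
    """Build the fuse-overlayfs command: read-only base below, session changes above."""
    options = (
        f"lowerdir={base_rootfs},"   # shared sandboxed root filesystem, never modified
        f"upperdir={dirs['upper']},"  # receives every change made in this session
        f"workdir={dirs['work']}"     # scratch space required by the overlay
    )
    return ["fuse-overlayfs", "-o", options, dirs["merged"]]
```

Every session gets its own upper directory, so throwing the session directory away discards the user's changes while the base root filesystem stays untouched.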

The guest commandline tool

The guest tool was heavily inspired by unshare because of its simplicity. It uses fuse-overlayfs to create a root filesystem, automatically mounts the proc, devtmpfs and sysfs pseudo filesystems, then isolates itself and starts the shell that appears in the Practice Terminal. It also communicates with the limit process through the /run/meetty.sock unix socket [25] to initialize limits for the Practice Terminal shell.
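The exchange over the unix socket could look roughly like the sketch below. The wire format (the pid as ASCII digits followed by a newline) and the function names send_pid and receive_pid are assumptions made for illustration; only the socket path comes from the article, and the real protocol between guest and limit may differ.

```python
import os
import socket

SOCKET_PATH = "/run/meetty.sock"  # socket path named in the article


def send_pid(path=SOCKET_PATH, pid=None):
    """Guest side: connect and send a pid as ASCII text (assumed wire format)."""
    pid = os.getpid() if pid is None else pid
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(path)
        client.sendall(f"{pid}\n".encode("ascii"))


def receive_pid(server):
    """Limit side: accept one connection on a listening socket and parse the pid."""
    connection, _ = server.accept()
    with connection:
        data = b""
        while not data.endswith(b"\n"):
            chunk = connection.recv(64)
            if not chunk:  # peer closed the connection
                break
            data += chunk
    return int(data.decode("ascii").strip())
```

A daemon running as root would probably not want to trust a pid sent as plain text; on Linux the SO_PEERCRED socket option lets the listener ask the kernel for the peer's pid instead.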

The limit commandline tool

The limit tool runs as root and listens on the /run/meetty.sock unix socket. When the guest commandline tool starts, it sends its pid to the limit process. Once limit has the pid, it creates a new network namespace [26], adds the pid to this namespace, creates a veth [27] pair, attaches one end of the pair to the new network namespace, and attaches the other end to a bridge [28] in the default network namespace. It also limits the Practice Terminal to a maximum of 50 processes, 100 MB of memory and 1024 open file descriptors. All these limits are enforced through cgroup and rlimit [29].
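The three limits can be sketched as follows, assuming cgroup v2 (the article does not say which cgroup version is used) and reading "100 MB" as 100 MiB. The function names and the cgroup directory are hypothetical; pids.max, memory.max and cgroup.procs are the standard cgroup v2 control files, and resource.prlimit sets an rlimit on another process.

```python
import os
import resource

MAX_PROCESSES = 50
MAX_MEMORY_BYTES = 100 * 1024 * 1024  # assumption: "100 MB" read as 100 MiB
MAX_OPEN_FILES = 1024


def cgroup_settings(pid):
    """Map cgroup v2 control files to the values to write, limits first."""
    return {
        "pids.max": str(MAX_PROCESSES),
        "memory.max": str(MAX_MEMORY_BYTES),
        "cgroup.procs": str(pid),  # moving the pid in last, after limits are set
    }


def apply_cgroup_settings(cgroup_dir, pid):
    """Write the settings into cgroup_dir, e.g. a directory under /sys/fs/cgroup.

    On a real system the pids and memory controllers must already be enabled
    for this cgroup through the parent's cgroup.subtree_control file.
    """
    os.makedirs(cgroup_dir, exist_ok=True)
    for name, value in cgroup_settings(pid).items():
        with open(os.path.join(cgroup_dir, name), "w") as control_file:
            control_file.write(value)


def limit_open_files(pid):
    """Cap the number of open file descriptors of `pid` via RLIMIT_NOFILE."""
    limits = (MAX_OPEN_FILES, MAX_OPEN_FILES)  # (soft, hard)
    resource.prlimit(pid, resource.RLIMIT_NOFILE, limits)
```

The cgroup limits cover the whole process tree of the shell, while the rlimit applies per process and is inherited by the children that the shell starts.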

The limit tool also disables packet routing in the Practice Terminal if there is no [30] process running on the host.
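The veth and bridge wiring that limit performs can be approximated with iproute2 commands, as in the sketch below. The interface names and the bridge name br0 are placeholders, and the sketch assumes the target process already lives in its own network namespace, which `ip link set ... netns PID` refers to by pid. How limit actually creates the namespace and disables routing is not shown in this article, so neither is covered here.

```python
def veth_commands(pid, host_if="veth-host", guest_if="veth-guest", bridge="br0"):
    """Build the ip(8) commands that connect the namespace of `pid` to a bridge."""
    return [
        # Create the veth pair in the default network namespace.
        ["ip", "link", "add", host_if, "type", "veth", "peer", "name", guest_if],
        # Move one end into the network namespace of the given process.
        ["ip", "link", "set", guest_if, "netns", str(pid)],
        # Attach the other end to the bridge and bring it up.
        ["ip", "link", "set", host_if, "master", bridge],
        ["ip", "link", "set", host_if, "up"],
    ]


if __name__ == "__main__":
    for command in veth_commands(1234):  # 1234 is an example pid
        print(" ".join(command))
```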

Writing the guest and limit tools taught me a lot about Linux namespaces and cgroups.

To be continued..