FreedomBoxBlog

Program Space

Rob van der Hoeven
Sun Dec 29 2013

This is going to be a long read, so let me give a quick overview of what you can expect. Let's start with a definition:

Program Space: A lightweight virtual environment for programs designed to restrict the programs running inside the Program Space to their own data and configuration, and to restrict configuration options and runtime behavior to values/behavior that the user finds acceptable.

The definition above is all about what a Program Space should be. It says nothing about how to create and use a Program Space. That's what this article is about. In this article I will introduce you to a program of mine called psc (Program Space Control), and I will show you how to use this program to create and manage the virtual environment mentioned in the definition.

As you will learn, working with Program Spaces is easy. In order to create a Program Space named “test” you can simply type:

psc test --create

You can run a program inside this Program Space by typing:

psc test --run [program]

Examples:

psc test --run bash
psc test --run ifconfig
psc test --run top

Psc lets you create and control a virtual environment but does not configure this environment. Configuration is done with scripts that contain a combination of psc commands and ordinary commands like mount, ip, iptables etc. It's the configuration that restricts the programs running inside the Program Space. So if you type:

psc test --create
psc test --run pstree

you would see no difference from typing the pstree command directly (you would see a tree containing dozens of programs). If, however, you run a script to set up a rootfs, like:

psc test --create
ps_rootfs test
psc test --run pstree

Then the output would be:

psd───pst───pstree

Apart from the pstree program, only two other programs seem to be running. All the other programs running on the system are invisible to the pstree program.

The script ps_rootfs mentioned above is part of the Program Space Construction Set (PSCS), which also contains the source code of the psc program. The PSCS contains a number of sample configuration scripts that show you how a Program Space can be configured. Besides the rootfs script it contains a network script and scripts to run a LAMP stack (WordPress blog) inside a Program Space.

I hope this short introduction is enough to make you want to read the long explanation :-). Further on in this article I will explain the technology behind Program Spaces: how the psc program works and what commands it has. To demonstrate how the psc program can be used to create and configure a Program Space I wrote a number of scripts which will be explained in the PSCS section.

Status

At this point I have to say something about the status of my Program Space project: it is experimental. The psc program and scripts have only been tested on my Debian Wheezy systems and are not intended to be used by inexperienced users or in a production environment.

If you are an experienced developer or sysadmin I would love to hear from you. Bug reports, ideas for improvements, or contributions to the PSCS are welcome!

Technology

Inside the Linux kernel all the important resources are capable of having multiple independent configurations. This technology is called kernel namespaces. Programs running in User Space use one of the possible configurations of a resource and are referred to as “running inside a namespace”.

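Namespaces are named after the resource they configure. The ones that matter for this article are PID (process IDs), UTS (host name), MNT (mount points), NET (network devices) and IPC (inter-process communication).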

In order to create a Program Space the psc program creates a small daemon named psd. This daemon starts with its own PID and UTS namespaces, and can later on have private MNT, NET and IPC namespaces. Programs running as children of this daemon inherit the namespaces of the daemon and are restricted by the configuration of the inherited namespaces.

Running programs inside Program Space involves running them as children of the psd daemon. How does that work?

First the daemon is created:

psc test --create

Then a command is sent to the daemon:

psc test --run pstree

The daemon process (psd) acts much like the ssh daemon, but without the encryption and login part. When the daemon receives the command to run a program it creates a terminal program (pst) which takes care of running the program. To illustrate the connections:

             User Space                           Program Space

         --> STDIN                                    STDIN  -->
terminal <-- STDOUT psc <-- socket connection --> pst STDOUT <-- pstree
         <-- STDERR                                   STDERR <--

Hope this gives you some idea about how things work.

The psc program (link to sourcecode)

Psc is a small C program which must be compiled first:

gcc -Wall -o psc psc.c

In order to control a Program Space the psc program has a very small set of commands with the following format:

psc [program space name] [psc command] [psc command parameters]

The first parameter is the name of the Program Space. This name must not exceed 60 characters and may not contain spaces. The following commands can be used:

psc [name] --create [logfilename]

This command creates a new Program Space daemon and writes the PID of the daemon to STDOUT. A PID of 0 (zero) indicates an error. A newly created daemon starts with private PID and UTS namespaces. To see the effect of the private PID namespace, the /proc directory must be re-mounted; this is typically done at a later time with the --chrootfs command. The UTS namespace is used to change the host name of the Program Space to ps_[name]. The logfilename parameter is optional; specify a full path if you want a logfile to be created. Creating a new Program Space requires root permissions.
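The naming rules and the "PID 0 means error" convention described above can be wrapped in a small shell helper. The following is only a sketch: ps_valid_name and ps_create_checked are hypothetical names that are not part of the PSCS, the PSC variable is an assumption that points at the psc binary, and rejecting an empty name is an extra assumption.

```shell
# Hypothetical helpers, not part of the PSCS.
# PSC is the psc binary; it is assumed to be in PATH.
PSC=${PSC:-psc}

# Succeeds if the name has no spaces and is at most 60 characters.
# Empty names are rejected as well (an assumption).
ps_valid_name() {
    case "$1" in
        '' | *' '*) return 1 ;;
    esac
    [ "${#1}" -le 60 ]
}

# Creates a Program Space and prints the PID of its daemon.
# $1 = name, $2 = optional full path of a logfile. Must be run as root.
ps_create_checked() {
    if ! ps_valid_name "$1"; then
        echo "invalid Program Space name: '$1'" >&2
        return 1
    fi
    ps_pid=$("$PSC" "$1" --create ${2:+"$2"})
    # psc reports errors as PID 0; an empty answer is treated as an error too.
    if [ -z "$ps_pid" ] || [ "$ps_pid" = 0 ]; then
        echo "could not create Program Space '$1'" >&2
        return 1
    fi
    echo "$ps_pid"
}
```

A script could then start with something like: ps_create_checked test /var/log/ps_test.log || exit 1 (the logfile path is just an example).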

psc [name] --kill

Kills the Program Space by sending SIGTERM to its daemon. This command requires root permissions.

psc [name] --net

Creates a new network namespace. The namespace starts without any usable network devices: it contains only a loopback device (lo), and that device is down. You have to create and configure all network interfaces yourself. This command can only be given one time per Program Space and only by the root user.

psc [name] --ipc

Creates a new IPC namespace. This command can only be given one time per Program Space and only by the root user.

psc [name] --chrootfs [path]

Creates a new MOUNT namespace and changes the rootfs of the Program Space daemon. After this command the working directory of the daemon is set to /. The directory specified by [path] must be a mount point (use mount --bind to create one). The command can only be given one time per Program Space and only by the root user.

psc [name] --cwd [path]

Changes the working directory of the Program Space daemon to the specified path. All new programs running in Program Space start with this working directory.

psc [name] --pid

Writes the PID of the Program Space daemon to STDOUT. A PID of zero indicates an error.

psc [name] --run [program path] [program params]

Runs the specified program inside the Program Space. The program runs under the user account that invoked the psc command. The exit code of the psc command is the exit code of the specified program, or 1 if psc encountered an internal error.
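Because psc passes on the exit code of the program, a calling script can react to it like it would to any other command. Note that an exit code of 1 is ambiguous: it can come from the program or from psc itself. The sketch below illustrates this; ps_run_report is a hypothetical helper name and the PSC variable is an assumption that points at the psc binary.

```shell
# Hypothetical helper, not part of the PSCS.
# PSC is the psc binary; it is assumed to be in PATH.
PSC=${PSC:-psc}

# Runs a program inside a Program Space and reports on STDERR how it ended.
# $1 = Program Space name, remaining arguments = program and its parameters.
# The return value is the exit code of psc.
ps_run_report() {
    ps_name=$1
    shift
    ps_status=0
    "$PSC" "$ps_name" --run "$@" || ps_status=$?
    case $ps_status in
        0) ;;
        1) echo "exit code 1: the program failed or psc had an internal error" >&2 ;;
        *) echo "the program exited with code $ps_status" >&2 ;;
    esac
    return "$ps_status"
}
```

For example, ps_run_report test /etc/init.d/mysql start || exit 1 would stop a start script as soon as a step fails.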

PSCS - Program Space Construction Set

Program Spaces are configured by using a mix of psc commands and ordinary system commands. To illustrate this I have created a number of example scripts. Together with the psc program these scripts form the PSCS, which you can download at the downloads page.

The PSCS contains the source code of the psc program and the example scripts, which are discussed in the sections below.

The PSCS files are best installed in a directory that is in the PATH environment variable. A good place would be /usr/local/bin.

Conventions and common code

Before discussing the details of each script, it's important to pay some attention to the things they have in common.

First, all scripts contain two types of variables: variables that refer to a Program Space, and variables that do not. Without proper naming things can become very confusing. I use the term User Space for everything that does not refer to a Program Space. Variables referring to User Space always have a us_ prefix. For Program Space variables I use the ps_ prefix.

Second, it's important to execute a script in the proper environment. A script that is designed to be run from User Space must not be able to run inside a Program Space. All scripts start with code to test the environment of the script.
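One possible way to test the environment is to look at the host name, since psc changes the host name of a Program Space to ps_[name]. The sketch below takes that approach. It is an assumption, not necessarily what the PSCS scripts do, and the function names is_ps_hostname and require_user_space are hypothetical. The test gives a false positive on a User Space system whose own host name starts with ps_.

```shell
# Hypothetical helpers, not taken from the PSCS scripts.

# Succeeds if the given host name looks like that of a Program Space,
# that is: ps_ followed by at least one character.
is_ps_hostname() {
    case "$1" in
        ps_?*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Aborts the calling script when it is started inside a Program Space.
require_user_space() {
    if is_ps_hostname "$(hostname)"; then
        echo "this script must be run from User Space" >&2
        exit 1
    fi
}
```

A script meant for User Space would call require_user_space as its first command.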

ps_rootfs (link to sourcecode)

The purpose of a rootfs script is to restrict the programs running inside a Program Space. Ideally these programs should only have access to their own data, and only have access to the shared libraries that are necessary to run the programs. Every change a program running inside a Program Space makes to its rootfs should not affect the User Space rootfs. For example, if a program running inside a Program Space installs a specific version of a shared library then that version should not be available to programs running in User Space.

The ps_rootfs script that comes with the PSCS is designed for a wide range of programs. This means that you can always come up with a better rootfs for a specific group of programs. It creates a rootfs with the following properties:

Directories from the User Space rootfs are mounted read-only inside the Program Space rootfs, so a program running inside a Program Space cannot change the User Space rootfs. Such a program can still modify the mounted directories (create/delete files, chown etc.), but these modifications end up in a data directory that is private to the Program Space.

To perform this magic the script makes use of AUFS, a union file system. The best way to illustrate how this works is with an example. In order to mount the /var directory from User Space, the script executes the following mount command:

mount -t aufs -o br=/programspace/test/data/var=rw:/var=ro none /programspace/test/rootfs/var

where:

/var                            is mounted read-only
/programspace/test/data/var     contains changes made in Program Space (not visible)
/programspace/test/rootfs/var   the directory a program in Program Space sees (read-write)

In this example “test” is the name of the Program Space.
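The mount command follows a fixed pattern, so it can be generated for any Program Space name and directory. The sketch below prints the commands instead of running them, because running them requires root and AUFS support. The function name ps_aufs_cmds and the PS_BASE variable are hypothetical, and the mkdir step is an assumption (the branch and mount point directories must exist before mounting).

```shell
# Hypothetical helper, not part of the PSCS.
# PS_BASE is the directory that holds all Program Spaces.
PS_BASE=${PS_BASE:-/programspace}

# Prints the commands that mount one User Space directory into the
# rootfs of a Program Space.
# $1 = Program Space name, $2 = absolute User Space directory (e.g. /var)
ps_aufs_cmds() {
    ps_data=$PS_BASE/$1/data$2      # receives the changes (read-write branch)
    ps_view=$PS_BASE/$1/rootfs$2    # what programs in the Program Space see
    echo "mkdir -p $ps_data $ps_view"
    echo "mount -t aufs -o br=$ps_data=rw:$2=ro none $ps_view"
}
```

Calling ps_aufs_cmds test /var prints a mkdir command followed by the mount command shown above.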

As you can see, it's pretty simple. The script mounts a number of User Space directories the same way. Then it creates special directories like /dev inside the Program Space rootfs and creates some necessary device nodes. Once the rootfs is fully configured, it changes the rootfs of the Program Space by executing:

psc test --chrootfs /programspace/test/rootfs

One of the interesting properties of the generated rootfs is that all filesystem actions end up in either the data or the rootfs directory of the Program Space. Creating a full backup of a Program Space is therefore easy: just tar the directory that contains the data and rootfs directories of the Program Space (note: the ps_backup script from the PSCS does this).

ps_backup (link to sourcecode)

If you use the ps_rootfs script to create a rootfs for a Program Space, then all data and configuration files of that Program Space are stored in a directory named /programspace/[name]. The ps_backup script makes a nice tar archive of this directory. Restoring the backup is as easy as running tar xf as root.

In most cases the data inside the backup does not depend on the computer that created the backup. So if you have another system with the same software installed, you can simply move or copy the Program Space to that system by unpacking the backup. (I had no problems moving the LAMP Program Space data from a 32-bit system to a 64-bit system.)
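The backup and restore steps described in this section boil down to two tar calls. The sketch below is not the ps_backup script itself: the function names and the PS_BASE variable are hypothetical, and it assumes that the Program Space is not running, so that nothing is mounted below its rootfs directory while tar reads it.

```shell
# Hypothetical helpers, not the actual ps_backup script.
# PS_BASE is the directory that holds all Program Spaces.
PS_BASE=${PS_BASE:-/programspace}

# Archives $PS_BASE/<name>. Paths in the archive are relative to PS_BASE.
# $1 = Program Space name, $2 = archive file to create
ps_backup_sketch() {
    if [ ! -d "$PS_BASE/$1" ]; then
        echo "no such Program Space directory: $PS_BASE/$1" >&2
        return 1
    fi
    tar -C "$PS_BASE" -cpf "$2" "$1"
}

# Unpacks an archive made by ps_backup_sketch below PS_BASE.
# Run as root so that file ownership is restored.
# $1 = archive file
ps_restore_sketch() {
    tar -C "$PS_BASE" -xpf "$1"
}
```

Moving a Program Space to another system then means copying the archive and calling ps_restore_sketch there.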

ps_network (link to sourcecode)

The ps_network script uses DHCP to automatically configure the network of a Program Space. In order to do this, your system must have a bridged network available (the script expects a bridged network with the name br0).

If you do not have a bridged network, you can create one with the following steps (Debian Wheezy):

apt-get install bridge-utils

modify /etc/network/interfaces so that it contains:

auto lo
iface lo inet loopback

auto br0
iface br0 inet dhcp
    bridge_ports eth0
    bridge_fd 0

restart the network:

/etc/init.d/networking restart

Before you run the ps_network script you must change the rootfs by running ps_rootfs. This is because the dhclient program that runs inside the Program Space needs some space under /var to store information about its current leases. Without the rootfs this info would end up in the User Space rootfs and could cause problems there.

ps_lamp_create (link to sourcecode)

This script creates a ready-to-run WordPress Program Space. The purpose of the script is to demonstrate how the psc program and the scripts mentioned earlier can be used together. It is probably not the best example of how to configure a LAMP stack ;-)

Unfortunately it is not possible to create a script that is independent of the Linux distribution. This demo is written for Debian Wheezy only.

ps_lamp (link to sourcecode)

The second part of the LAMP demo is a small script to start and stop the WordPress Program Space. Once the configuration is done, starting and stopping a Program Space is very easy. In order to start the lamp_demo Program Space the script runs the following commands:

    (note: checks and error handling removed, click sourcecode link for actual code)

    psc lamp_demo --create

    ps_rootfs lamp_demo
    ps_network lamp_demo
    ps_httpd_firewall lamp_demo

    psc lamp_demo --run /etc/init.d/mysql start
    psc lamp_demo --run /etc/init.d/apache2 start

As you can see, it's simple! If you issue a pstree command inside the Program Space, the result is:

psc lamp_demo --run pstree

psd─┬─apache2───10*[apache2]
    ├─dhclient
    ├─mysqld_safe───mysqld───17*[{mysqld}]
    └─pst───pstree

Final words

This has been a long article, and it is nice you are still reading :-) I hope you like this Program Space idea and will give it a try. Please feel free to contact me if you have questions or ideas for improvements.