Let’s write an `nbd` service module for NixOS. We’ll look at the shape of NixOS modules, define the service’s options, generate the configuration for the server, write tests, and see how upstreaming into `nixpkgs` works.
The full code for this post is in the nixpkgs#163009 pull request.
Network Block Devices (nbd)
While not the subject of this post, we first need to learn what `nbd` is, and how the server and client are used. To quote the `nbd` homepage:
With this compiled into your kernel, Linux can use a remote server as one of its block devices. Every time the client computer wants to read /dev/nbd0, it will send a request to the server via TCP, which will reply with the data requested.
Usage-wise, we first run `nbd-server 0.0.0.0:10809 /dev/sda1` to expose the disk over the network. We then run `nbd-client server.domain 10809 /dev/nbd0` to connect. Any reads and writes on the client to `/dev/nbd0` will be forwarded to `/dev/sda1` on the server.
Network Block Devices let us attach any disk to any machine without having to muck about with network file systems. This is particularly useful if we want access to the raw disk instead of the filesystem API. For instance, I use this as part of my Secure Remote Disk setup.
While calling `nbd-server` manually works, it would be much cleaner if it were wrapped in a systemd service, and for that, we need to write a NixOS module.
NixOS modules
NixOS configuration is split across modules, many of which live in `nixpkgs`. The main feature these bring is composability. The configuration generated by all the modules is ultimately merged into the single system configuration for a machine. Modules are what allow us to assign to `environment.systemPackages` in multiple places, and have all the lists automatically concatenated.
The NixOS manual has a section on writing modules, and it’s an excellent reference. We’ll cover similar content in this section, but in a more opinionated way, focusing on modules for system services. Without further ado, the basic shape of a module is:
{ config, lib, pkgs, ... }:
with lib;
let
  cfg = config.services.my-service;
  port = 12345;
in
{
  imports = [
    ./other-module.nix
  ];

  options = {
    services.my-service = {
      enable = mkEnableOption "my service";

      option1 = mkOption {
        type = types.str;
        description = "An option";
      };
    };
  };

  config = mkIf cfg.enable {
    systemd.services.my-service = {
      script = ''
        ${pkgs.my-package}/bin/server start --option ${cfg.option1} --port ${toString port}
      '';
    };

    other.opts.go.here = true;
  };
}
Taking it from the top, a NixOS module is actually a function that takes an environment, and returns an attribute set (attrset) with the `imports`, `options`, and `config` fields. If the module doesn’t have `options`, then the `config` part can just be inlined into the top-level set, but that just looks ugly in my opinion. The function’s sole argument is:
{ config, lib, pkgs, ... }:
This is an attrset with information about the environment. Its fields are:
- `config`: an attrset with all the values of configuration options. It’s recursively defined, so it will also contain what we declare in `options`.
- `pkgs`: all the packages defined in `nixpkgs`. For instance, `git` is `pkgs.git`.
- `lib`: the big utility library that comes with `nixpkgs`. Most modules end up needing something from here, so we import its contents to our namespace in the next line: `with lib;`
- There are other fields in this attrset, but they’re not usually needed by user modules. Search `nixpkgs` for `specialArgs` and `_module.args` to find them.
let
  cfg = config.services.my-service;
  port = 12345;
in
Next, we include some definitions in a `let`. We could put this anywhere in our module, but the top-level is the least indented spot, so it usually looks nicer there. It’s also customary to have `cfg` as an alias to the configuration of the current module.
imports = [
  ./other-module.nix
];
Next, we have a list of other modules to import. We only need to do this if the imported module isn’t already in `module-list.nix`. This might be the case if we’re writing several related modules, and only add one of them to the global module list. Importing like this is different from calling `import ./other-module.nix`. When using the `imports` field, the arguments to the current module (i.e. `config`, `lib`, `pkgs`) will also be passed to the sub-module. When using the `import` expression, we have to pass these arguments ourselves.
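As a minimal sketch of the difference (the file name `./other-module.nix` is the one from the example above; the commented-out variant is just there for contrast):

```nix
{ config, lib, pkgs, ... }:
{
  # Preferred: the module system passes config, lib, pkgs, etc. to the sub-module.
  imports = [ ./other-module.nix ];

  # Roughly equivalent, but we have to thread the arguments through ourselves:
  # imports = [ (import ./other-module.nix { inherit config lib pkgs; }) ];
}
```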
options = {
services.my-service = {
Next come the option declarations for our module. This is where we describe the possible configuration our module takes. We can use any hierarchy we want here, but we should try to be consistent with the existing options. Since we’re using `services.my-service.*` here, we’ll later access the option values (not declarations) with `config.services.my-service.*` (or the short-hand `cfg.*`).
enable = mkEnableOption "my service";
By convention, services have an `enable` option that controls whether they’re active or not. If inactive, the service module shouldn’t set any configuration.
option1 = mkOption {
type = types.str;
description = "An option";
};
Then, we have our options proper. The NixOS manual has a section on option declarations. For each one, we specify its `type`, a human-readable `description`, an optional `default` value, and an optional `example`. The `type` is where the magic happens: it describes how different values for this option should be merged together. This is what lets multiple modules assign to `environment.systemPackages`, and have all the lists concatenated in the final configuration.
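To see the merging in action, here is a self-contained sketch with two inline modules (the package choices are arbitrary):

```nix
{
  imports = [
    # Two modules, both defining the same list-typed option.
    ({ pkgs, ... }: { environment.systemPackages = [ pkgs.git ]; })
    ({ pkgs, ... }: { environment.systemPackages = [ pkgs.curl ]; })
  ];
  # The list type's merge function concatenates the definitions, so the
  # final configuration contains both git and curl.
}
```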
config = mkIf cfg.enable {
Next, we write the actual configuration generated by this module. This will be merged with the configs of all other modules. By convention, we guard this with the `enable` option.
systemd.services.my-service = {
  script = ''
    ${pkgs.my-package}/bin/server start --option ${cfg.option1} --port ${toString port}
  '';
};

other.opts.go.here = true;
Finally, we have the configuration proper. In this example, we add a systemd service, and set some other option. We access configuration values from paths prefixed with `config` (remember that `cfg = config.services.my-service`), but we assign to paths that do not have the prefix. So, all the assignments in this section look like `some.option = config.other.option`.
In addition to system configuration, some special attributes also go in `config`. Assertions can be used to express preconditions for using a module (e.g. `cfg.port != null`). Warnings can be used to display messages to users of the module (e.g. “this feature is deprecated and will be removed in the future”).
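A sketch of how that might look for our example module (the option names `cfg.port` and `cfg.legacyFormat` are hypothetical):

```nix
config = mkIf cfg.enable {
  assertions = [
    {
      assertion = cfg.port != null;
      message = "services.my-service.port must be set when the service is enabled";
    }
  ];
  warnings = optionals (cfg.legacyFormat != null)
    [ "services.my-service.legacyFormat is deprecated and will be removed in the future" ];
};
```

If an assertion fails, evaluation of the whole system configuration aborts with the given message; warnings are merely printed.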
nbd options
Now that we’ve seen what a NixOS module looks like in general, let’s focus on the one for the `nbd` service.
The full code for the module is here.
We start with the options:
options = {
  services.nbd = {
    server = {
      enable = mkEnableOption "the Network Block Device (nbd) server";
First, we define the `services.nbd.server.enable` option. This controls whether the server is enabled or not.
listenPort = mkOption {
  type = types.port;
  default = 10809;
  description = "Port to listen on. The port is NOT automatically opened in the firewall.";
};
Next, we define an option for the port with a default value.
extraOptions = mkOption {
  type = with types; attrsOf (oneOf [ bool int float str ]);
  default = {
    allowlist = false;
  };
  description = ''
    Extra options for the server. See
    <citerefentry><refentrytitle>nbd-server</refentrytitle>
    <manvolnum>5</manvolnum></citerefentry>.
  '';
};
Since we don’t want to manually encode every `nbd-server` option into nix, we define a passthrough: `extraOptions` is an attrset where the values are basic types. So, it’s something like `{ certfile = /path/to/cert; keyfile = /path/to/key; }`, where we have not defined `certfile` and `keyfile` anywhere in our module. We will write this into `nbd-server`’s configuration file without any processing.
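For example, a user of the module might write something like the following (the keys come from `nbd-server(5)`; the values here are illustrative):

```nix
services.nbd.server.extraOptions = {
  # Written verbatim into the [generic] section of nbd-server's config file.
  allowlist = true;
  maxconnections = 4;
};
```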
exports = mkOption {
  description = "Files or block devices to make available over the network.";
  default = { };
  type = with types; attrsOf
    (submodule {
Next, we define the exports. These are nested sections of configuration, so we make them an `attrsOf submodule`. In usage, this will look like `exports = { export1 = { ... }; export2 = { ... }; }`.
Export options: `path`, `allowAddresses`, and `extraOptions`
options = {
  path = mkOption {
    type = str;
    description = "File or block device to export.";
    example = "/dev/sdb1";
  };
  allowAddresses = mkOption {
    type = nullOr (listOf str);
    default = null;
    example = [ "10.10.0.0/24" "127.0.0.1" ];
    description = "IPs and subnets that are authorized to connect for this device. If not specified, the server will allow all connections.";
  };
  extraOptions = mkOption {
    type = attrsOf (oneOf [ bool int float str ]);
    default = {
      flush = true;
      fua = true;
    };
    description = ''
      Extra options for this export. See
      <citerefentry><refentrytitle>nbd-server</refentrytitle>
      <manvolnum>5</manvolnum></citerefentry>.
    '';
  };
};
As for the export options themselves, it’s more of the same. The only new thing is the `type` of `allowAddresses`: `nullOr (listOf str)`. This means the values could be any of the following: `null`, `[]`, `[ text1 text2 ]`. These will make their way into the `authfile` option of `nbd-server(5)`. We need a nullable option here because the server behaves differently if the `authfile` is omitted, defined but points to an empty file, or defined and points to a file with addresses.
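Putting the export options together, a hypothetical machine configuration that exports one device to a single subnet could look like:

```nix
services.nbd.server = {
  enable = true;
  exports = {
    backup = {
      path = "/dev/sdb1";
      allowAddresses = [ "10.10.0.0/24" ];
    };
  };
};
```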
Option listenAddress
listenAddress = mkOption {
  type = with types; nullOr str;
  description = "Address to listen on. If not specified, the server will listen on all interfaces.";
  default = null;
  example = "10.10.0.1";
};
Finally, we have one more option for the address that the server will bind on. For both `allowAddresses` and `listenAddress`, we specified the base type as `str`, even though it really should be a regular expression matching IPv4/IPv6 addresses and subnets as understood by `nbd-server`. This is mostly laziness, but also because such a regular expression is likely to be wrong.
nbd configuration
Now that we have the options defined in `config.services.nbd` (`cfg`), we need to generate the actual system configuration.
config = mkIf cfg.server.enable {
We start with the usual activation guard.
boot.kernelModules = [ "nbd" ];
Next, we load the `nbd` kernel module. This is always required when using `nbd-server` (although the server will try loading it itself if missing).
systemd.services.nbd-server = {
  after = [ "network-online.target" ];
  before = [ "multi-user.target" ];
  wantedBy = [ "multi-user.target" ];
  serviceConfig = {
    ExecStart = "${pkgs.nbd}/bin/nbd-server -C ${serverConfig}";
    Type = "forking";
At its core, the systemd service is fairly simple. Because `nbd-server` forks, it’s a `forking` service. The startup command is just the binary followed by the configuration file (see below). The `before`, `after`, and `wantedBy` lines say that the service should start after networking is available (`network-online.target`), and that it should be started as part of normal startup (`multi-user.target`). It gets a bit more complicated because we want to sandbox the server as much as possible:
Systemd service sandboxing options
DeviceAllow = map (path: "${path} rw") allowedDevices;
BindPaths = boundPaths;
CapabilityBoundingSet = "";
DevicePolicy = "closed";
LockPersonality = true;
MemoryDenyWriteExecute = true;
NoNewPrivileges = true;
PrivateDevices = false;
PrivateMounts = true;
PrivateTmp = true;
PrivateUsers = true;
ProcSubset = "pid";
ProtectClock = true;
ProtectControlGroups = true;
ProtectHome = true;
ProtectHostname = true;
ProtectKernelLogs = true;
ProtectKernelModules = true;
ProtectKernelTunables = true;
ProtectProc = "noaccess";
ProtectSystem = "strict";
RestrictAddressFamilies = "AF_INET AF_INET6";
RestrictNamespaces = true;
RestrictRealtime = true;
RestrictSUIDSGID = true;
UMask = "0077";
These options are the result of running `systemd-analyze security` on the service, and applying all the suggestions that didn’t break the server. They’re interesting in themselves, but aren’t relevant here.
All that’s left is to generate the `${serverConfig}` we used in the systemd service. The syntax for this file is documented in `nbd-server(5)`. Ultimately, we want to generate an ini file like the following:
[generic]
allowlist=false
user=root
group=root
listenaddr=0.0.0.0
port=10809
[export1]
authfile=/path/to/authfile1
exportname=/dev/disk/by-id/scsi-0HC_Volume_17235401
flush=true
fua=true
[export2]
authfile=/path/to/authfile2
exportname=/path/to/some/regular/file
flush=true
fua=true
nbd-server configuration file
We could create this with string templating, but `nixpkgs` has a handy nix→ini conversion function.
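As a standalone sketch of that function (section and key names made up), `pkgs.formats.ini` turns a nested attrset into an ini file in the nix store:

```nix
let
  iniFormat = pkgs.formats.ini { };
in
# Returns a store path to a file with an [example] section
# containing allowlist=false and port=10809.
iniFormat.generate "example-config" {
  example = {
    port = 10809;
    allowlist = false;
  };
}
```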
configFormat = pkgs.formats.ini { };
serverConfig = configFormat.generate "nbd-server-config"
  ({
    generic =
      (cfg.server.extraOptions // {
        user = "root";
        group = "root";
        port = cfg.server.listenPort;
      } // (optionalAttrs (cfg.server.listenAddress != null) {
        listenaddr = cfg.server.listenAddress;
      }));
  }
  // (mapAttrs
    (_: { path, allowAddresses, extraOptions }:
      extraOptions // {
        exportname = path;
      } // (optionalAttrs (allowAddresses != null) {
        authfile = pkgs.writeText "authfile" (concatStringsSep "\n" allowAddresses);
      }))
    cfg.server.exports)
  );
The code looks messy, but it’s mostly because the attrset merging operator `//` is used a lot. The interesting bit here is that we’re creating multiple files at once. The `configFormat.generate` call writes the ini file to the nix store and returns its path. The `pkgs.writeText "authfile"` calls also create files in the nix store, one for each of the server’s exports.
NixOS tests
We have our options defined, and we can generate configuration from them. The next step is actually trying this out with a NixOS test. The basic shape of a test is this:
import ./make-test-python.nix ({ pkgs, ... }: {
  name = "my-test";

  nodes = {
    machine1 = { config, pkgs, ... }: {
      services.my-service.enable = true;
      ...
    };
    machine2 = { config, pkgs, ... }: {
      ...
    };
  };

  testScript = ''
    start_all()
    machine1.succeed("echo this runs on machine1")
    machine2.succeed("echo this runs on machine2")
    ...
  '';
})
nixpkgs/nixos/tests/my-test.nix: The general shape of a NixOS test
This test can then be run with `nix-build nixos/tests/my-test.nix`. It creates VMs for `machine1` and `machine2`, and runs the python `testScript` against them. The process is very smooth, and when developing a module, it beats having to redeploy to a live machine to try out changes.
The full code for the test is here.
Our `nbd` test is going to set up two machines: `server` and `client`. The `client` machine has `nbd` installed, but otherwise it’s just vanilla NixOS. We’ll use it to run commands in the `testScript`.
client = { config, pkgs, ... }: {
  programs.nbd.enable = true;
};
`client` machine
The `server` has the more complicated setup:
- As soon as it boots, we create two files to use as our `nbd` disks. One of these is a regular file, and we turn the second into a `loop` device with `losetup(8)`,
- We open the server’s listen port in the firewall, and
- We start `nbd-server` exposing the file and the `loop` device. For the latter, we also configure `allowAddresses` to check that permissioning works.
server = { config, pkgs, ... }: {
  # Create some small files of zeros to use as the nbd disks
  ## `vault-pub.disk` is accessible from any IP
  systemd.services.create-pub-file =
    mkCreateSmallFileService { path = "/vault-pub.disk"; };
  ## `vault-priv.disk` is accessible only from localhost.
  ## It's also a loopback device to test exporting /dev/...
  systemd.services.create-priv-file =
    mkCreateSmallFileService { path = "/vault-priv.disk"; loop = true; };

  # Needed only for nbd-client used in the tests.
  environment.systemPackages = [ pkgs.nbd ];

  # Open the nbd port in the firewall
  networking.firewall.allowedTCPPorts = [ listenPort ];

  # Run the nbd server and expose the small file created above
  services.nbd.server = {
    enable = true;
    exports = {
      vault-pub = {
        path = "/vault-pub.disk";
      };
      vault-priv = {
        path = "/dev/loop0";
        allowAddresses = [ "127.0.0.1" "::1" ];
      };
    };
    listenAddress = "0.0.0.0";
    listenPort = listenPort;
  };
};
`server` machine
The test script is straightforward. We start the machines, and wait for the server to start listening on its port. Although this is a python script, we’re specifying it in a nix file, so we can use regular string interpolation to pass values in.
start_all()
server.wait_for_open_port(${toString listenPort})
We then connect with the client, write some data, disconnect, and check that the data was written to the backing file on the server.
# Client: Connect to the server, write a small string to the nbd disk, and cleanly disconnect
client.succeed("nbd-client server ${toString listenPort} /dev/nbd0 -name vault-pub -persist")
client.succeed(f"echo '{testString}' | dd of=/dev/nbd0 conv=notrunc")
client.succeed("nbd-client -d /dev/nbd0")

# Server: Check that the string written by the client is indeed in the file
foundString = server.succeed(f"dd status=none if=/vault-pub.disk count={len(testString)}")[: len(testString)]
if foundString != testString:
    raise Exception(f"Read the wrong string from nbd disk. Expected: '{testString}'. Found: '{foundString}'")

# Client: Fail to connect to the private disk
client.fail("nbd-client server ${toString listenPort} /dev/nbd0 -name vault-priv -persist")

# Server: Successfully connect to the private disk
server.succeed("nbd-client localhost ${toString listenPort} /dev/nbd0 -name vault-priv -persist")
server.succeed(f"echo '{testString}' | dd of=/dev/nbd0 conv=notrunc")
foundString = server.succeed(f"dd status=none if=/dev/loop0 count={len(testString)}")[: len(testString)]
if foundString != testString:
    raise Exception(f"Read the wrong string from nbd disk. Expected: '{testString}'. Found: '{foundString}'")
server.succeed("nbd-client -d /dev/nbd0")
We add the test to `nixos/tests/all-tests.nix`, and also to `passthru.tests` in the `nbd` package. We can now run the test with any of the following invocations. It’s important to ensure that all these work because they’re used by the `ofborg` build server in one situation or another.
$ nix-build nixos/tests/nbd.nix
$ nix-build -A nixosTests.nbd
$ nix-build -A nbd.passthru.tests.test
Miscellaneous tasks
We’re almost done with the code changes, but there are a few loose ends to tie up:
- We want our module to be automatically available to anyone using `nixpkgs`, so we add it to `nixos/modules/module-list.nix`. Without this, users would have to manually add our module to their `imports = [...]`.
- We want a second module for just `nbd`: `programs.nbd.enable`. If enabled, this adds both the `nbd` package and the `nbd` kernel module to the system configuration. So, servers would use `services.nbd.server.enable`, and clients would use `programs.nbd.enable`.
- The option docs are generated automatically from the modules once they’ve been added to `module-list.nix`, but we have to update the release notes manually. After making the changes, we follow the instructions on how to build the manual.
PR process
We have the module, we have tests, and we even have docs. We just need to upstream everything into `nixpkgs`. The code-related bits of the process are the usual GitHub pull request (PR) workflow: fork `nixpkgs`, push our changes to a branch, and click the buttons in the UI to open a PR against `master`.
The social bits of the process are finding reviewers for the changes, and finding somebody to merge the PR. If we don’t do this ourselves, the PR is likely to just never get looked at (at the time of writing, there are 3142 open PRs, mostly created by automated jobs).
For reviewers, I’ve had success with searching for other PRs to the same files, and asking the reviewers of those in a comment. Failing that, I’d try people who have previously edited the files. And failing that, I’d try asking in the #dev:nixos.org room in Matrix. The Matrix room has never failed me.
When it comes to the review itself, we should remember that none of the people involved work on `nixpkgs` as their job. So, it frequently happens that not every reviewer actually reviews every PR they’re assigned. It also happens that people are sometimes busy or forget to respond to comments. My rule of thumb is that I’m responsible for shepherding my PRs to completion, and that sometimes involves reminding people what’s pending on them every few days.
For merging, there are a few people who periodically go through PRs, and merge the ones that look ready. That said, I don’t know what rules they use to decide if a PR is good to merge or not. I’ve had success with asking in the Matrix room for somebody to merge.
After merging, if we wanted to backport our changes to the current stable NixOS release, we’d either have to redo the PR based on a branch like `release-21.11`, or we could ask somebody to add the `backport release-21.11` label to the original PR. This instructs `ofborg` to cherry-pick the changes, and open the backport PR (example backport PR).
Conclusion
So, that’s what it takes to write a NixOS service module, get it tested, and then have it merged. It’s not hard; it’s not rocket surgery, after all. However, unless you already know all the places that need to be changed, it can be a bit daunting to get started. I hope this post sheds some light on the process.