Let’s deploy our own peer-to-peer Dropbox-like system with Syncthing, Nginx, and Kubernetes.
Goal
The goal here is to have folders synchronized between multiple computers and phones. I’m not going to try to explain why this is useful to have, but I will say that the problem gets harder when phones are involved. For computers, we can easily copy files around with scp or rsync, or we can use a network file system. Unfortunately, these things are painful to do on phones (just try typing an rsync invocation in Termux while making sure that you have the right number of slashes in every path). Fortunately, Syncthing exists and handles all the hard problems like service discovery and copying files over unreliable networks.
Additionally, we want to serve the files over HTTPS. Some of them will be public, and some will require a password to access. This is useful when we want to get files onto machines that aren’t part of our Syncthing cluster. For instance, we’d need this if we wanted to download a PDF to an untrusted machine with a printer, or if we wanted to download a script to a cloud server accessible only via a web console.
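For example, once everything is set up, pulling files onto a machine outside the cluster is just a couple of curl invocations (the hostname, user, and file names here are placeholders):
$ curl -O https://HOSTNAME/pub/manual.pdf
$ curl -u scvalex -O https://HOSTNAME/priv/backup.sh
The second command prompts for the htpasswd password before downloading.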
Architecture
The full setup looks like this:
It looks a bit complicated, but it’s really just one pod with three containers, exposing three services, and depending on multiple ConfigMaps and PVCs.
Syncthing pod
The core of the system is a single pod. We define it as part of a StatefulSet because it binds a couple of PersistentVolumeClaims.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: syncthing
spec:
  selector:
    matchLabels:
      app: syncthing
  serviceName: syncthing
  replicas: 1
  template:
    metadata:
      labels:
        app: syncthing
    spec:
      terminationGracePeriodSeconds: 60
The pod has three containers:
- create-dirs is an initContainer that ensures that the Syncthing shared directories exist, and that they’re owned by the right user. To avoid writing complicated shell scripts in YAML, we store the init script in a ConfigMap.
- syncthing is the actual Syncthing component. There doesn’t seem to be an official container image, so we use the one provided by linuxserver.io. This container needs persistent volumes for its config, and for the shared folders. It exposes an admin web UI on port 8384, and the synchronization protocol on port 32222 (this is 22000 by default, but we’ll change it later).
- nginx is the webserver. The configuration is stored in a ConfigMap. It has access to the Syncthing data persistent volume because it serves the files there. Since we password-protect the private folder, we provide the container with an htpasswd file stored in a Secret. This container exposes port 80.
initContainers:
  - name: create-dirs
    image: registry.hub.docker.com/library/busybox:1.34.1
    command: ['sh', '/init-scripts/init.sh']
    volumeMounts:
      - name: data
        mountPath: /data
      - name: init-scripts
        mountPath: /init-scripts/
containers:
  - name: syncthing
    image: lscr.io/linuxserver/syncthing:1.18.5
    ports:
      - name: web-ui
        containerPort: 8384
      - name: syncthing-tcp
        containerPort: 32222
        protocol: TCP
      - name: syncthing-udp
        containerPort: 32222
        protocol: UDP
    volumeMounts:
      - name: syncthing-config
        mountPath: /config
      - name: data
        mountPath: /data
  - name: nginx
    image: registry.hub.docker.com/library/nginx:1.21.4
    ports:
      - containerPort: 80
        name: http
    volumeMounts:
      - name: nginx-config
        mountPath: /etc/nginx/conf.d
      - name: nginx-htpasswd
        mountPath: /etc/nginx/secret
      - name: data
        mountPath: /data
All that’s left is to list the volumes. There’s nothing special here, so it’s just boilerplate.
volumes:
  - name: init-scripts
    configMap:
      name: init-scripts-config
  - name: syncthing-config
    persistentVolumeClaim:
      claimName: syncthing-config-pv-claim
  - name: data
    persistentVolumeClaim:
      claimName: syncthing-data-pv-claim
  - name: nginx-config
    configMap:
      name: nginx-config
  - name: nginx-htpasswd
    secret:
      secretName: nginx-htpasswd
We could have used fewer volumes: Syncthing would work fine with one volume for both configs and data, Nginx could store the htpasswd and the config in the same ConfigMap, and we could inline the script for the create-dirs container. That said, it’s easy to separate the concerns here, so we might as well do so.
Persistent volumes
We need two persistent volumes for this setup, one for Syncthing’s configs, and one for the shared folders.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: syncthing-config-pv-claim
spec:
  storageClassName: longhorn
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1G
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: syncthing-data-pv-claim
spec:
  storageClassName: longhorn
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 10G
I think both data and configs would be annoying to replace, so I store them on replicated Longhorn volumes. That said, the data is already replicated to other Syncthing peers, and the configs are rebuildable, so we could just store these on a single node with HostPath or zfs-localpv volumes.
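Either way, once the claims are applied, it’s worth checking that they both get bound before moving on (this assumes the syncthing namespace created at the end of this post):
$ kubectl get pvc -n syncthing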
Configs
There are three configs to mention. The first is the script for the create-dirs container. This just ensures that the shared directories exist, and that they’re owned by Syncthing. Syncthing itself runs as the abc user defined in the docker-baseimage-alpine image from linuxserver.io.
apiVersion: v1
kind: ConfigMap
metadata:
  name: init-scripts-config
data:
  init.sh: |
    #!/bin/sh
    echo "Running init container"
    mkdir -p /data/pub /data/priv
    # Chown the directories to the syncthing user (which is called "abc")
    chown -R 911:1000 /data/pub /data/priv
    echo "Init container done"
create-dirs
Next we have the configuration for the nginx container. The interesting things here are the three Nginx location directives. The first hides all dot files such as Syncthing’s .stfolder. The second serves the /data/pub directory; the autoindex directive makes Nginx serve a directory listing as the index of folders. The third location directive is for the /data/priv private files. For these, we enable HTTP Basic Auth, and configure the htpasswd file mounted from the Secret defined next.
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-config
data:
  default.conf: |
    server {
        listen 80 default_server;
        listen [::]:80 default_server ipv6only=on;
        root /data;
        index index.html;
        server_name localhost;
        location ~ (^|/)\. {
            return 403;
        }
        location /pub {
            autoindex on;
            try_files $uri $uri/ =404;
        }
        location /priv {
            autoindex on;
            auth_basic "Private Area";
            auth_basic_user_file /etc/nginx/secret/htpasswd;
            try_files $uri $uri/ =404;
        }
    }
---
apiVersion: v1
kind: Secret
metadata:
  name: nginx-htpasswd
type: Opaque
stringData:
  htpasswd: |
    scvalex:PASSWORD-STRING-GENERATED-WITH-htpasswd-TOOL
---
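The PASSWORD-STRING-GENERATED-WITH-htpasswd-TOOL placeholder stands for a real password hash. One way to produce an entry, assuming the htpasswd tool from apache2-utils is installed, is to print it to stdout and paste it into the Secret:
$ htpasswd -nB scvalex
Alternatively, we could skip the YAML and create the Secret directly from a local htpasswd file:
$ kubectl create secret generic nginx-htpasswd -n syncthing --from-file=htpasswd=./htpasswd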
All that’s left to do is expose the container ports as services, and then do some manual configuration on Syncthing.
Services and ingresses
We have three services to expose: the Nginx public webserver, the Syncthing admin web UI, and the Syncthing protocol.
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  ports:
    - port: 80
      name: http
  selector:
    app: syncthing
---
apiVersion: v1
kind: Service
metadata:
  name: web-ui
spec:
  ports:
    - port: 8384
      name: web-ui
  selector:
    app: syncthing
There’s not much to say about the nginx and web-ui services. They expose ports, and that’s it.
The more interesting one is the syncthing-protocol service. According to the docs, Syncthing listens on port 22000 for incoming connections. In Kubernetes, we can just open a port on the node with a NodePort service. However, in my cluster, NodePorts are only allowed above 32000, so we expose 32222 instead, and we’ll change Syncthing’s config later.
apiVersion: v1
kind: Service
metadata:
  name: syncthing-protocol
spec:
  type: NodePort
  ports:
    - name: syncthing-tcp
      port: 32222
      protocol: TCP
      nodePort: 32222
    - name: syncthing-udp
      port: 32222
      protocol: UDP
      nodePort: 32222
  selector:
    app: syncthing
Once the NodePorts have been exposed in Kubernetes, we also need to open them up in the host firewall. For NixOS, this is as simple as adding the following to configuration.nix:
networking.firewall.allowedUDPPorts = [
  32222 # syncthing
];
networking.firewall.allowedTCPPorts = [
  32222 # syncthing
];
Mind you, even without exposing this port, Syncthing appears to work. The problems only show up hours later, when new devices have trouble connecting to the server. My suspicion is that whatever NAT hole punching Syncthing normally does also works on Kubernetes, but only for a while.
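A quick way to check that the port really is reachable from outside the cluster is to probe it from another machine (NODE-IP being a placeholder for one of the node addresses):
$ nc -vz NODE-IP 32222
Note that this only exercises the TCP listener, not the UDP one.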
We also expose the nginx server to the public Internet with an ingress. In my case, I’m using ingress-nginx with cert-manager for TLS certificates.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: dump
  namespace: syncthing
  annotations:
    cert-manager.io/cluster-issuer: 'letsencrypt'
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - HOSTNAME
      secretName: dump-certs
  rules:
    - host: HOSTNAME
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: nginx
                port:
                  number: 80
nginx-ingress.yaml
We purposely leave the web-ui service unexposed to the public Internet. In my case, I have a separate ingress controller for internal services, and I create an ingress for it there. More simply, we can connect to the service with a command like this:
$ kubectl port-forward -n syncthing service/web-ui 8384:8384
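The admin UI is then reachable at http://localhost:8384 for as long as the port-forward is running.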
Putting it all together
We put the entire deployment in its own namespace by listing all the files we’ve seen so far in a kustomization.yaml. See my previous post for details on how this works.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: syncthing
resources:
  - namespace.yaml
  - deployment.yaml # all the above YAMLs in a single file
  - nginx-ingress.yaml
kustomization.yaml
---
apiVersion: v1
kind: Namespace
metadata:
  name: syncthing
namespace.yaml
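With all these files in one directory, deploying (and later updating) the whole stack is a single command run from that directory:
$ kubectl apply -k .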
Manual configuration
With all the infrastructure deployed, we still need to do a bit of manual configuration through the Syncthing web UI:
- we need to add the /data/pub and /data/priv folders to Syncthing,
- we need to add the IDs of any “Remote Devices” we intend to synchronize with,
- we need to add the server’s ID to all the remote devices as well, and
- we need to set the “Sync Protocol Listen Addresses” in the Settings ➞ Connections menu to this:
  tcp://0.0.0.0:32222, quic://0.0.0.0:32222, dynamic+https://relays.syncthing.net/endpoint
We could write the configuration files by hand, and deploy them with Kubernetes, but that doesn’t seem worth the effort, especially since some options like Remote Devices are going to be fairly dynamic.
Conclusion
And that’s it. This is what it takes to set up a peer-to-peer Dropbox-like system with Kubernetes.
Looking back, it still surprises me how boilerplatey Kubernetes configuration can get, but at least we get a lot of bang for our buck. It would have been shorter to just deploy this to a single machine with something like NixOS, but then the service wouldn’t survive machine crashes.