path: root/_posts/2025-02-02-dn42-put-it-in-a-box-linux-network-namespace.md
Diffstat (limited to '_posts/2025-02-02-dn42-put-it-in-a-box-linux-network-namespace.md')
-rw-r--r-- _posts/2025-02-02-dn42-put-it-in-a-box-linux-network-namespace.md | 116
1 files changed, 116 insertions, 0 deletions
diff --git a/_posts/2025-02-02-dn42-put-it-in-a-box-linux-network-namespace.md b/_posts/2025-02-02-dn42-put-it-in-a-box-linux-network-namespace.md
new file mode 100644
index 0000000..b0b1efa
--- /dev/null
+++ b/_posts/2025-02-02-dn42-put-it-in-a-box-linux-network-namespace.md
@@ -0,0 +1,116 @@
+---
+layout: post
+title: 'DN42: Put it in a box (Linux network namespace)'
+date: 2025-02-02 17:10 +0100
+---
+
+What did I do the previous week? What robbed me of my sleep at night, keeping my
+thoughts churning? Let's look at "doing containers the hard way".
+
+For some time, I had CPU usage problems on my VPS. Looking at `htop`, I always
+saw two bird instances and dnsmasq as "offenders". (I run both my clearnet AS
+and my DN42 AS on this box.) [^1] I was originally going to write a blog article
+about that, but it's still in my drafts folder. Here's the gist:
+
+- All these processes receive "netlink messages".
+- Probably the IBGP session for my AS generates lots of these netlink messages.
+- The other bird process (for DN42) and dnsmasq suffer from it.
+
+One possible solution is to put the processes into a container (Docker, Podman,
+LXC, LXD, Incus, …). However, I'm on a budget: I have only 40 GB of storage on
+my Hetzner VPS and don't want to waste it on duplicate file systems. What I
+can do is use the *technology* these container runtimes/engines use: Linux
+namespaces.
+
+I won't cover the basics here. There are other blog articles from other people
+who are *far better* at explaining that (e.g. from
+[anracom](https://linux-blog.anracom.com/2017/10/30/fun-with-veth-devices-linux-bridges-and-vlans-in-unnamed-linux-network-namespaces-i/)).
+
+Suffice it to say, I can put DN42 in a box, which isolates its network. To
+a degree: I still need to communicate with the outside world. I dumped my setup
+[in a git repo](https://git.uvok.de/ansible/tree/roles/linux-ns/files).
+
+[jamesits](https://github.com/Jamesits/systemd-named-netns) provides some nice
+templates for setting up a network namespace. I used these templates, but
+tweaked some parts so they suit me better [^2]. Also, taking the Debian systemd
+unit files as a base, I manually created new ones and made sure the processes
+use the separate namespace. I admit to never having particularly liked systemd,
+but at least the service manager is pretty nice.
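+
+One way to make a service use the namespace is a drop-in with
+`NetworkNamespacePath=`. The following is only a sketch under assumptions, not
+necessarily what my unit files do: the namespace name `dn42`, the unit name
+`bird-dn42.service` and the `OUT_DIR` variable are made up for illustration, and
+the option needs systemd 242 or newer.
+
+```shell
+#!/usr/bin/env bash
+# Sketch: write a systemd drop-in that starts a service inside a named netns.
+# Assumptions (hypothetical): namespace "dn42", unit "bird-dn42.service".
+set -eu
+
+NETNS="${NETNS:-dn42}"
+UNIT="${UNIT:-bird-dn42.service}"
+# Writes to a temp dir by default; point OUT_DIR at /etc/systemd/system for real use.
+OUT_DIR="${OUT_DIR:-$(mktemp -d)}"
+
+write_dropin() {
+    local dir="$OUT_DIR/$UNIT.d"
+    mkdir -p "$dir"
+    # "ip netns add NAME" makes the namespace available at /run/netns/NAME.
+    cat > "$dir/netns.conf" <<EOF
+[Service]
+# Needs systemd 242 or newer.
+NetworkNamespacePath=/run/netns/$NETNS
+EOF
+    # Print the path of the file that was written.
+    echo "$dir/netns.conf"
+}
+
+write_dropin
+```
+
+After placing such a file under `/etc/systemd/system`, a `systemctl
+daemon-reload` and a restart of the service would be needed.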
+
+I also felt pretty clever coming up with [this setup
+script](https://git.uvok.de/ansible/tree/roles/linux-ns/files/usrlocalbin/dn42-route-namespace.sh)
+which gets called by systemd and then calls itself again, but in the newly created
+namespace.
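+
+The "calls itself again" trick can be sketched roughly like this. This is not
+the linked script, just a minimal illustration: the namespace name `dn42`, the
+`--inside` flag and the `DRY_RUN` switch are assumptions made up for the
+example. By default it only prints the commands it would run, so it can be
+tried without root.
+
+```shell
+#!/usr/bin/env bash
+# Sketch of a script that re-runs itself inside a network namespace.
+# Hypothetical names: NETNS value, --inside flag, DRY_RUN, run().
+set -eu
+
+NETNS="${NETNS:-dn42}"      # assumed namespace name
+DRY_RUN="${DRY_RUN:-1}"     # 1 = only print commands, 0 = execute them
+
+# Print the command in dry-run mode, execute it otherwise.
+run() {
+    if [ "$DRY_RUN" = "1" ]; then
+        echo "$*"
+    else
+        "$@"
+    fi
+}
+
+# Stage 2: runs inside the namespace.
+setup_inside() {
+    run ip link set lo up
+    # Bringing up tunnels, loading firewall rules etc. would go here.
+}
+
+# Stage 1: runs in the host namespace and re-enters the same script.
+setup_outside() {
+    local self="$1"
+    run ip netns exec "$NETNS" "$self" --inside
+}
+
+main() {
+    if [ "${1:-}" = "--inside" ]; then
+        setup_inside
+    else
+        setup_outside "$0"
+    fi
+}
+
+main "$@"
+```
+
+The nice part of this pattern is that host-side and namespace-side setup live
+in one file, and systemd only needs to know about a single entry point.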
+
+Inside the namespace run:
+
+- the WireGuard interfaces to my peers,
+- Tinc (which I only have because I want to have a broadcast-capable VPN),
+- PowerDNS (a separate instance from "the outer VPS one", serving my DN42 domain),
+- BIRD2,
+- and a looking glass.
+
+*Outside* the namespace run:
+
+- dnsmasq, which allows me to resolve my DN42 domains "from the clearnet"
+ (from within a WireGuard net),
+- Nginx, which serves my DN42 website.
+
+It took me a while and some internet searches to come up with the firewall
+rules. On the VPS itself I use ufw. For the network namespace I *could probably*
+make that work as well, but I decided to use "iptables", or rather the wrapper
+scripts which provide the same syntax but use nftables in the background. With
+the namespace separation, I don't have to worry about "cross-leaking" network
+traffic between DN42 and the clearnet, so the rules get a lot simpler.
+
+I was happy that both DNS and the website worked, when I realized that the
+default policy for FORWARD was ACCEPT. I thought it prudent to set it to DROP,
+and suddenly it didn't work anymore, because I was employing destination NAT
+and the rewritten packets have to pass the FORWARD chain:
+
+```
+*nat
+# Redirect HTTP/HTTPS for the DN42 website address to the webserver outside the namespace
+-A PREROUTING -d fd3e:bc05:2d6::80/128 -p tcp --dport 80 -j DNAT --to-destination fcee::1
+-A PREROUTING -d fd3e:bc05:2d6::80/128 -p tcp --dport 443 -j DNAT --to-destination fcee::1
+COMMIT
+```
+
+The DNAT is needed because the webserver is running on the "outer" VPS, not in
+the network namespace (hey, it took me long enough to finish this up as it is).
+Moving it into the namespace is probably a relatively small adjustment.
+
+With an additional forward rule, everything is happy again:
+
+```
+*filter
+# Allow traffic between DN42 (fd00::/8) and the webserver in both directions
+-A FORWARD -s fd00::/8 -d fcee::1/128 -j ACCEPT
+-A FORWARD -s fcee::1/128 -d fd00::/8 -j ACCEPT
+COMMIT
+```
+
+I also feel pretty clever for making sure I can access DN42 from my
+clearnet:
+
+```
+*mangle
+-A PREROUTING -i eth0 -j MARK --set-mark 0x4242
+COMMIT
+
+*nat
+-A POSTROUTING -d fd00::/8 -m mark --mark 0x4242 -j MASQUERADE
+COMMIT
+```
+
+I am actually not sure if the latter needs an "-i (all wg interfaces)" or "! -i
+(internal interfaces)" to avoid unnecessary NATting for when I access a service
+in the namespace. But nevertheless, I achieved a working state. Best of all, I
+can reuse some of the work if I ever want to put my clearnet AS in a namespace,
+too.
+
+[^1]:
+ An AS is an ["autonomous system"]({% post_url
+ 2023-08-18-networking-adventure-my-own-ipv6-prefix-and-as %}), [DN42]({% link
+ dn42.md %}) is a "lab environment" for networking.
+
+[^2]:
+ For example, I have no idea why they unmount and mount the netns path
+ manually. Probably this works around some bug in an old systemd version?
+