-rw-r--r-- | _posts/2025-02-02-dn42-put-it-in-a-box-linux-network-namespace.md | 116 |
1 files changed, 116 insertions, 0 deletions
diff --git a/_posts/2025-02-02-dn42-put-it-in-a-box-linux-network-namespace.md b/_posts/2025-02-02-dn42-put-it-in-a-box-linux-network-namespace.md
new file mode 100644
index 0000000..b0b1efa
--- /dev/null
+++ b/_posts/2025-02-02-dn42-put-it-in-a-box-linux-network-namespace.md
@@ -0,0 +1,116 @@
+---
+layout: post
+title: 'DN42: Put it in a box (Linux network namespace)'
+date: 2025-02-02 17:10 +0100
+---
+
+What did I do the previous week? What robbed me of my sleep at night, keeping my
+thoughts churning? Let's look at "doing containers the hard way".
+
+For some time, I had CPU usage problems on my VPS. Looking at `htop`, I always
+saw two bird instances and dnsmasq as "offenders". (I run both my clearnet AS and
+my DN42 AS on this box [^1].) I was originally going to write a blog article
+about that, but it's still in my drafts folder. Here's the gist:
+
+- All these processes receive "netlink messages".
+- Probably the IBGP session for my AS generates lots of these netlink messages.
+- The other bird process (for DN42) and dnsmasq suffer from it.
+
+One possible solution is to put the processes into a container (Docker, Podman,
+LXC, LXD, Incus, …). However, I'm on a budget. I have only 40 GB of storage on
+my Hetzner VPS and don't want to waste it on duplicate file systems. Still, I
+can use the *technology* these container runtimes/engines use: Linux
+namespaces.
+
+I won't cover the basics here. There are other blog articles from other people
+who are *far better* at explaining that (e.g. from
+[anracom](https://linux-blog.anracom.com/2017/10/30/fun-with-veth-devices-linux-bridges-and-vlans-in-unnamed-linux-network-namespaces-i/)).
+
+Suffice it to say, I can put DN42 in a box, which puts its network in isolation. To
+a degree. I still need to communicate with the outside world. I dumped my setup
+[in a git repo](https://git.uvok.de/ansible/tree/roles/linux-ns/files).
+
+[jamesits](https://github.com/Jamesits/systemd-named-netns) provides some nice
+templates for setting up a network namespace. I used those templates, but tweaked
+some parts so they suit me better [^2]. Also, based on the Debian systemd unit files, I
+manually created new ones and made sure the processes use the separate namespace.
+I admit to never having particularly liked systemd, but at least the service
+manager is pretty nice.
+
+I also felt pretty clever coming up with [this setup
+script](https://git.uvok.de/ansible/tree/roles/linux-ns/files/usrlocalbin/dn42-route-namespace.sh)
+which gets called by systemd and then calls itself again, but in the newly created
+namespace.
+
+Inside the namespace run:
+
+- The WireGuard interfaces to my peers,
+- Tinc (which I only have because I want to have a broadcast-capable VPN),
+- PowerDNS (a separate instance from "the outer VPS one", serving my DN42 domain),
+- BIRD2,
+- and a looking glass.
+
+*Outside* the namespace run:
+
+- dnsmasq, which allows me to resolve my DN42 domains "from the clearnet"
+  (from within a WireGuard net)
+- Nginx, which serves my DN42 website
+
+It took me a while and some internet searches to come up with the firewall
+rules. On my VPS itself I use ufw. For the network namespace, I *could probably*
+make this work as well, but I decided to use "iptables", or rather, the wrapper
+scripts which provide the same syntax but use nftables in the background. This
+is because with this namespace separation, I don't have to worry about
+"cross-leaking" network traffic between DN42 and the clearnet. So the rules get
+a lot simpler.
+
+I was happy that both DNS and the website worked, until I realized that the default
+policy for FORWARD was ACCEPT. I found it suitable to set it to DROP, and
+suddenly, it didn't work anymore, because I was employing destination NAT:
+
+```
+*nat
+-A PREROUTING -d fd3e:bc05:2d6::80/128 -p tcp --dport 80 -j DNAT --to-destination fcee::1
+-A PREROUTING -d fd3e:bc05:2d6::80/128 -p tcp --dport 443 -j DNAT --to-destination fcee::1
+```
+
+This is because the webserver is running on the "outer" VPS, not in the network
+namespace (hey, it took me long enough to finish this up as it is). Moving it is
+probably a relatively small adjustment.
+
+With two additional FORWARD rules, everything is happy again:
+
+```
+*filter
+-A FORWARD -s fd00::/8 -d fcee::1/128 -j ACCEPT
+-A FORWARD -s fcee::1/128 -d fd00::/8 -j ACCEPT
+```
+
+I also feel pretty clever for making sure I can access DN42 from my
+clearnet:
+
+```
+*mangle
+-A PREROUTING -i eth0 -j MARK --set-mark 0x4242
+COMMIT
+
+*nat
+-A POSTROUTING -d fd00::/8 -m mark --mark 0x4242 -j MASQUERADE
+COMMIT
+```
+
+I am actually not sure if the latter needs an "-i (all wg interfaces)" or "! -i
+(internal interfaces)" to avoid unnecessary NATting when I access a service
+in the namespace. But nevertheless, I achieved a working state. Best of all, I
+can reuse some of the work if I ever want to put my clearnet AS in a namespace,
+too.
+
+[^1]:
+    An AS is an ["autonomous system"]({% post_url
+    2023-08-18-networking-adventure-my-own-ipv6-prefix-and-as %}); [DN42]({% link
+    dn42.md %}) is a "lab environment" for networking.
+
+[^2]:
+    For example, I have no idea why they unmount and mount the netns path
+    manually. Probably this fixes some bug of an old systemd version?
+
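
On the open question in the post about narrowing the MASQUERADE rule: iptables only accepts an `-i` match in the PREROUTING, INPUT and FORWARD chains, so it cannot go on the POSTROUTING rule itself. The restriction would have to be either an `-i` match on the mangle PREROUTING rule (which the post already has) or an `-o` match on the POSTROUTING rule. Below is a minimal sketch of the second option. It assumes that all peer interfaces share a `wg` name prefix, which the post does not state, so treat the interface pattern as hypothetical.

```
*mangle
# Unchanged from the post: mark everything entering the namespace via eth0.
-A PREROUTING -i eth0 -j MARK --set-mark 0x4242
COMMIT

*nat
# Hypothetical variant: masquerade marked traffic only when it leaves through
# a peer interface; "wg+" matches every interface whose name starts with "wg".
-A POSTROUTING -o wg+ -d fd00::/8 -m mark --mark 0x4242 -j MASQUERADE
COMMIT
```

Traffic addressed to a service running inside the namespace itself is delivered locally and does not pass through POSTROUTING, so neither this variant nor the original rule should NAT it. The `-o` match would mainly matter for marked traffic that is forwarded out through something other than the peer tunnels.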