---
layout: post
title: 'DN42: Put it in a box (Linux network namespace)'
date: 2025-02-02 17:10 +0100
lang: "en"
categories: "tech"
---

What did I do the previous week? What robbed me of my sleep at night, keeping my
thoughts churning? Let's look at "doing containers the hard way".

For some time, I had CPU usage problems on my VPS. Looking at `htop`, I always
saw two bird instances and dnsmasq as the "offenders". (I run both my clearnet AS
and my DN42 AS on this box.)[^1] I was originally going to write a blog article
about that, but it's still in my drafts folder. Here's the gist:

- All these processes receive "netlink messages".
- The IBGP session for my AS probably generates lots of these netlink messages.
- The other bird process (for DN42) and dnsmasq suffer from it.
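
A rough way to see how chatty netlink is on such a box is to count route events for a few seconds. This is only a sketch: the function names are made up for illustration, and it assumes `ip` (iproute2) and `timeout` (coreutils) are installed. `ip monitor` prints roughly one line per event, so the line count is an approximation.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Count lines on stdin. `ip monitor` prints roughly one line per netlink
# event, so this approximates the number of events.
count_events() {
    wc -l | tr -d ' '
}

# Listen to IPv6 route events for N seconds (default: 10) and print the count.
sample_route_events() {
    timeout "${1:-10}" ip -6 monitor route | count_events
}
```

Running `sample_route_events 30` once while the IBGP session is up and once while it is down should hint at whether the session really is the source of the noise.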

One possible solution is to put the processes into a container (Docker, Podman,
LXC, LXD, Incus, …). However, I'm on a budget: I have only 40 GB of storage on
my Hetzner VPS and don't want to waste it on duplicate file systems. What I
can use, though, is the *technology* these container runtimes/engines use: Linux
namespaces.

I won't cover the basics here. There are other blog articles from other people
who are *far better* at explaining that (e.g. from
[anracom](https://linux-blog.anracom.com/2017/10/30/fun-with-veth-devices-linux-bridges-and-vlans-in-unnamed-linux-network-namespaces-i/)).

Suffice it to say, I can put DN42 in a box, which puts its network in isolation. To
a degree. I still need to communicate with the outside world. I dumped my setup
[in a git repo](https://git.uvok.de/ansible/tree/roles/linux-ns/files).

[jamesits](https://github.com/Jamesits/systemd-named-netns) provides some nice
templates for setting up a network namespace. I used those templates, but tweaked
some parts so they suit me better [^2]. Also, taking the Debian systemd unit files
as a starting point, I manually created new ones and made sure the processes use
the separate namespace. I admit to never having particularly liked systemd, but at
least the service manager is pretty nice.
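
One way to double-check that a service really ended up in the separate namespace is to compare the namespace links under `/proc`. A small sketch with helper names of my own choosing; note that reading another user's `/proc/PID/ns/net` usually needs root.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Print the network namespace identifier of a PID, e.g. "net:[4026531840]".
netns_id() {
    readlink "/proc/$1/ns/net"
}

# Succeed if both PIDs live in the same network namespace.
# Returns 2 if one of the namespace links cannot be read.
same_netns() {
    a=$(netns_id "$1") || return 2
    b=$(netns_id "$2") || return 2
    [ "$a" = "$b" ]
}
```

A process that runs inside the DN42 namespace should *not* share a namespace with PID 1, so `same_netns 1 <pid>` is expected to fail for it. `ip netns identify <pid>` answers the same question by namespace name.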

I also felt pretty clever coming up with [this setup
script](https://git.uvok.de/ansible/tree/roles/linux-ns/files/usrlocalbin/dn42-route-namespace.sh)
which gets called by systemd and then calls itself again, but in the newly created
namespace.
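
The "call yourself again" trick boils down to something like the following. This is a simplified sketch rather than the script linked above: the `--inside` flag, the `NETNS` and `RUN` variables and the loopback step are placeholders made up for illustration.

```shell
#!/bin/sh
# Simplified sketch; all names here are hypothetical.
NETNS="${NETNS:-dn42}"   # name of the namespace (assumed)
RUN="${RUN:-}"           # set RUN=echo to only print the commands

# Work that has to happen *inside* the namespace (placeholder step).
setup_inside() {
    $RUN ip link set lo up
}

main() {
    if [ "${1:-}" = "--inside" ]; then
        # Second invocation: we are already in the namespace.
        setup_inside
    else
        # First invocation (from systemd): run this same script again,
        # but inside the namespace.
        $RUN ip netns exec "$NETNS" "$0" --inside
    fi
}

# A real script would end with:  main "$@"
```

The nice part of this pattern is that everything lives in one file: systemd only needs to know a single path, and the namespace-specific steps stay next to the code that enters the namespace.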

Inside the namespace run:

- The WireGuard interfaces to my peers,
- Tinc (which I only have because I want a broadcast-capable VPN),
- PowerDNS (a separate instance from "the outer VPS one", serving my DN42 domain),
- BIRD2,
- and a looking glass.

*Outside* the namespace run:

- dnsmasq, which allows me to resolve my DN42 domains "from the clearnet"
  (from within a WireGuard net),
- Nginx, which serves my DN42 website.

It took me a while and some internet searches to come up with the firewall
rules. On the VPS itself I use ufw. For the network namespace, I *could probably*
make that work as well, but I decided to use "iptables", or rather, the wrapper
scripts which provide the same syntax but use nftables in the background. This
is because with this namespace separation, I don't have to worry about
"cross-leaking" network traffic between DN42 and the clearnet, so the rules get
a lot simpler.

I was happy that both DNS and the website worked, until I realized that the default
policy for FORWARD was ACCEPT. I found it suitable to set it to DROP, and
suddenly nothing worked anymore, because I was employing destination NAT:
```
*nat
-A PREROUTING -d fd3e:bc05:2d6::80/128 -p tcp --dport 80 -j DNAT --to-destination fcee::1
-A PREROUTING -d fd3e:bc05:2d6::80/128 -p tcp --dport 443 -j DNAT --to-destination fcee::1
```
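
I need the DNAT because the webserver is running on the "outer" VPS, not in the
network namespace (hey, it took me long enough to finish this up as it is). The
redirected packets therefore have to be forwarded out of the namespace, which is
exactly what the DROP policy now prevented. Moving the webserver into the
namespace is probably a relatively small adjustment.

With additional forward rules, everything is happy again: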
```
*filter
-A FORWARD -s fd00::/8 -d fcee::1/128 -j ACCEPT
-A FORWARD -s fcee::1/128 -d fd00::/8 -j ACCEPT
```
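
Since the default policy was what bit me here, it is worth checking it explicitly. `ip6tables -S FORWARD` prints the policy as a `-P FORWARD ...` line, and the helper below just extracts it. The function name and the namespace name `dn42` are assumptions for this sketch.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.

# Read `ip6tables -S` output on stdin and print the FORWARD policy.
forward_policy() {
    awk '$1 == "-P" && $2 == "FORWARD" { print $3 }'
}

# Usage (as root), assuming the namespace is called "dn42":
#   ip netns exec dn42 ip6tables -S FORWARD | forward_policy
```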

I also feel pretty clever for making sure I can access DN42 from my
clearnet:
```
*mangle
-A PREROUTING -i eth0 -j MARK --set-mark 0x4242
COMMIT
*nat
-A POSTROUTING -d fd00::/8 -m mark --mark 0x4242 -j MASQUERADE
COMMIT
```

I am actually not sure whether the latter needs an "-i (all wg interfaces)" or "! -i
(internal interfaces)" match to avoid unnecessary NATting when I access a service
in the namespace. Nevertheless, I achieved a working state. Best of all, I
can reuse some of the work if I ever want to put my clearnet AS in a namespace,
too.

[^1]:
    An AS is an ["autonomous system"]({% post_url
    2023-08-18-networking-adventure-my-own-ipv6-prefix-and-as %}); [DN42]({% link
    dn42.md %}) is a "lab environment" for networking.

[^2]:
    For example, I have no idea why they unmount and mount the netns path
    manually. Perhaps this works around a bug in an old systemd version?