A dead WireGuard tunnel stalled every music torrent
The music container in my torrent setup was stuck: 349 torrents, all in metaDL, all at 0 peers, all at 0 bytes. The container itself was running and qBittorrent's WebUI was responsive. Everything looked alive, except that no traffic was actually moving.
What was happening
The container binds qBittorrent exclusively to a wg0 interface. That's a WireGuard tunnel to a commercial VPN provider. If wg0 is up, torrent traffic goes through the VPN. If wg0 is down, qBittorrent has nothing to bind to and every operation that needs network silently fails.
wg show told the story:
$ wg show wg0
interface: wg0
  public key: ...
  private key: (hidden)
  listening port: 51820
peer: ...
  endpoint: 37.46.x.x:1637
  allowed ips: 0.0.0.0/0
  latest handshake: 2 days, 11 hours ago
  transfer: 4.5 MiB received, 12.3 MiB sent
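The latest handshake was almost two and a half days old. WireGuard renegotiates a handshake roughly every two minutes while traffic is flowing, and it stops accepting packets under a session's keys 180 seconds after the handshake that created them. By "two days" the tunnel was thoroughly inert.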
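To put a number on that staleness without parsing the human-readable output, the age can be computed from wg show wg0 latest-handshakes, which prints a public key and a Unix timestamp per peer. A minimal sketch: the function name handshake_age is my own, and it takes both timestamps as arguments so it can be tried without a live tunnel.

```shell
#!/bin/bash
# handshake_age NOW LAST: print the seconds elapsed since the last handshake.
# LAST is the Unix timestamp from `wg show wg0 latest-handshakes`;
# wg reports 0 for a peer that has never completed a handshake.
handshake_age() {
    local now=$1 last=$2
    if [ "$last" -eq 0 ]; then
        echo "never"
    else
        echo $(( now - last ))
    fi
}

# Live usage (needs root and an existing wg0):
#   wg show wg0 latest-handshakes | while read -r _key ts; do
#       handshake_age "$(date +%s)" "$ts"
#   done
```

For the output above, "2 days, 11 hours" works out to 212400 seconds, far past anything a healthy tunnel would show.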
What I found
The VPN provider had rotated the endpoint IP without telling me. WireGuard itself only ever works with an endpoint IP. A hostname in the config is resolved by the userspace tools (wg, wg-quick) when the config is applied, so in practice only at interface up time, and it is never re-resolved while the interface is running. My config had the old IP hard-coded, so it kept pointing at a dead address and the handshake retries went into the void.
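Because the running interface only holds an IP, a rotation like this can be spotted by comparing the endpoint the tunnel is using against what DNS currently returns. A rough sketch, assuming the provider publishes a hostname for the server: vpn.example.com is a placeholder and the function name endpoint_ips is mine. The parsing is split out so it can be exercised without a live tunnel.

```shell
#!/bin/bash
# endpoint_ips: read `wg show wg0 endpoints` output on stdin
# ("<public key> <ip>:<port>" per peer) and print only the IP of each peer.
# IPv6 endpoints arrive as "[addr]:port", so the brackets are stripped too.
endpoint_ips() {
    awk '{ ep = $2; sub(/:[0-9]+$/, "", ep); gsub(/^\[|\]$/, "", ep); print ep }'
}

# Live usage (hypothetical hostname; needs root and an existing wg0):
#   current=$(wg show wg0 endpoints | endpoint_ips | head -n 1)
#   resolved=$(getent ahostsv4 vpn.example.com | awk 'NR == 1 {print $1}')
#   [ "$current" != "$resolved" ] && echo "endpoint moved: $current -> $resolved"
```

If the two differ, wg set wg0 peer <public-key> endpoint vpn.example.com:1637 should repoint the running tunnel without a full bounce, since wg resolves the hostname at the moment the command runs.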
That's the cascade:
- Endpoint IP dead → no handshake → tunnel down.
- qBittorrent bound to wg0 → all peer connections fail silently.
- Every torrent stuck in metaDL → queue inflates to its cap.
- Sonarr/Lidarr see "queue full" → stop searching for new releases.
- Library stops getting updates → that's the symptom I notice first.
The fix
Three lines:
# pull fresh config from provider
curl -o /etc/wireguard/wg0.conf "$VPN_CONFIG_URL"
wg-quick down wg0
wg-quick up wg0
Within thirty seconds: handshake fresh, DHT bootstrapped, downloads moving.
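One caveat with the one-liner: without -f, curl writes whatever the server returns, including an HTTP error page, straight over the working config. A slightly more defensive sketch downloads to a temporary file and checks that it at least looks like a WireGuard config before swapping it in. The function names are my own, and the check is deliberately crude.

```shell
#!/bin/bash
# looks_like_wg_conf FILE: succeed only if FILE contains the two sections
# a wg-quick client config needs.
looks_like_wg_conf() {
    grep -q '^\[Interface\]' "$1" && grep -q '^\[Peer\]' "$1"
}

# refresh_config URL DEST: download to a temp file, validate it, then
# install it over DEST with owner-only permissions. Returns non-zero and
# leaves DEST untouched if the download or the validation fails.
refresh_config() {
    local url=$1 dest=$2 tmp
    tmp=$(mktemp) || return 1
    if curl -fsS -o "$tmp" "$url" && looks_like_wg_conf "$tmp"; then
        install -m 600 "$tmp" "$dest"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Live usage (same $VPN_CONFIG_URL as above):
#   refresh_config "$VPN_CONFIG_URL" /etc/wireguard/wg0.conf \
#       && { wg-quick down wg0; wg-quick up wg0; }
```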
The longer-term fix is putting a hostname in the WireGuard config instead of a bare IP, plus a tiny watchdog cron:
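#!/bin/bash
# /usr/local/bin/wg-watchdog
# Bounce wg0 when its most recent handshake is older than MAX_AGE seconds.
MAX_AGE=600
# latest-handshakes prints "<public key> <unix time>" per peer (0 = never);
# take the newest, and fall back to 0 if wg0 does not exist.
LAST=$(wg show wg0 latest-handshakes 2>/dev/null | awk '{print $2}' | sort -n | tail -n 1)
HANDSHAKE_AGE=$(( $(date +%s) - ${LAST:-0} ))
if [ "$HANDSHAKE_AGE" -gt "$MAX_AGE" ]; then
    logger "wg0 handshake stale (${HANDSHAKE_AGE}s), bouncing"
    # Don't let a failed "down" (interface already gone) block the "up".
    wg-quick down wg0 || true
    wg-quick up wg0
fi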
It runs every 5 minutes. If the handshake is older than 10 minutes, it bounces the interface, which makes wg-quick re-resolve the endpoint hostname.
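For the scheduling itself, a system cron file is probably the simplest option. The sketch below just generates the entry; the helper name watchdog_cron_line and the /etc/cron.d/wg-watchdog destination are assumptions, not something taken from the setup above.

```shell
#!/bin/bash
# watchdog_cron_line MINUTES: print an /etc/cron.d entry (schedule, user,
# command) that runs the watchdog as root every MINUTES minutes.
watchdog_cron_line() {
    printf '*/%d * * * * root /usr/local/bin/wg-watchdog\n' "$1"
}

# Live usage (needs root):
#   watchdog_cron_line 5 > /etc/cron.d/wg-watchdog
```

With a 5-minute schedule and a 10-minute threshold, the worst case from dead tunnel to bounce is about 15 minutes. On an idle tunnel without PersistentKeepalive the handshake age can legitimately exceed the threshold, so the watchdog may occasionally bounce a healthy interface; a busy torrent client usually keeps the link active enough to avoid that.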
What I'd do differently
A bound-interface tunnel is a single point of failure for the application above it, and the failure is silent at the application layer. Either the watchdog has to exist, or the application needs to know how to alert when its bound interface goes dead. I now have both: the watchdog for fast recovery, and a Prometheus exporter on the WireGuard handshake age so I can graph it and alert if the watchdog ever fails to recover.
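One way to get a handshake-age metric without a dedicated exporter is node_exporter's textfile collector. This is only a sketch of the idea, not the exporter mentioned above: the metric name wireguard_handshake_age_seconds, the function name, and the output path are all assumptions.

```shell
#!/bin/bash
# handshake_metrics IFACE NOW: read `wg show IFACE latest-handshakes` lines
# on stdin ("<public key> <unix time>") and print one Prometheus gauge per
# peer. A peer that has never shaken hands (timestamp 0) reports NOW as its
# age, which is large enough to trip any sensible alert.
handshake_metrics() {
    local iface=$1 now=$2 key ts
    echo '# HELP wireguard_handshake_age_seconds Seconds since the last handshake.'
    echo '# TYPE wireguard_handshake_age_seconds gauge'
    while read -r key ts; do
        [ -n "$key" ] || continue
        printf 'wireguard_handshake_age_seconds{interface="%s",public_key="%s"} %d\n' \
            "$iface" "$key" $(( now - ts ))
    done
}

# Live usage (hypothetical path; it must match node_exporter's
# --collector.textfile.directory). Write to a temp file and rename so the
# collector never reads a half-written file:
#   out=/var/lib/node_exporter/textfile_collector/wg0.prom
#   wg show wg0 latest-handshakes | handshake_metrics wg0 "$(date +%s)" > "$out.tmp" \
#       && mv "$out.tmp" "$out"
```

An alert on this gauge staying above the watchdog's 600-second threshold for longer than one or two cron intervals would catch the case where the bounce itself is not helping.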