Sonarr was dead because of a Cyrillic filename
Sonarr's web UI was unresponsive. systemd said the service was active (running). The process was using 100% of multiple cores. Load average on the container was over 23 on 24 cores. So it wasn't dead, just refusing to do anything useful.
What was happening
UI request: hangs forever. API request: hangs forever. journalctl -u sonarr -f: a burst of System.IO.PathTooLongException stack traces roughly every 90 seconds, dumped into the logs as fast as Sonarr could write them.
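To see which exception dominates a log flood like this, a quick tally helps. The following is a minimal sketch: `tally_exceptions` is a hypothetical helper name, and it assumes exception types appear in the log text as fully qualified `System.…Exception` names, which matches the one exception quoted above but is not verified for every Sonarr log line.

```shell
# tally_exceptions: read log text on stdin, print "COUNT TYPE" per exception
# type, most frequent first. Hypothetical helper, not part of Sonarr.
tally_exceptions() {
    grep -oE 'System\.[A-Za-z.]+Exception' \
        | sort \
        | uniq -c \
        | sort -rn \
        | awk '{print $1, $2}'   # normalise the padding uniq -c adds
}

# Usage (assumes the unit is named sonarr, as in this post):
# journalctl -u sonarr --since '10 minutes ago' --no-pager | tally_exceptions
```

If one type accounts for nearly every line, that is a strong hint the process is stuck in a loop rather than hung.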
The sonarr.db SQLite Logs table was several hundred MB and growing. The HTTP listener thread was being starved by whatever the exception loop was doing.
What I found
A single qBittorrent torrent was the source. Russian release group, Cyrillic title, double-encoded somewhere in the chain so the resulting path was well past the length limit .NET enforces. On Linux that is the filesystem's limit (typically 255 bytes per name component and 4096 bytes per path), not Windows' 260-character MAX_PATH. Every time Sonarr's import scanner walked the qBittorrent download list, it tried to parse the torrent name, hit the long-path exception, logged the full stack trace, and moved on. The scanner runs every 90 seconds. Each pass added thousands of log rows.
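The arithmetic behind the double-encoding problem can be sketched in a few lines. This assumes the mis-encoding was the common "UTF-8 bytes read as Latin-1, then re-encoded as UTF-8" kind; the post does not say which step in the chain did it, so treat the `double_encode` helper and the sample title as illustrative only.

```shell
# byte_len STRING: length in bytes, which is what filesystem limits count.
byte_len() { printf '%s' "$1" | wc -c | tr -d ' '; }

# double_encode STRING: simulate one round of mojibake by treating the
# UTF-8 bytes as ISO-8859-1 and re-encoding them as UTF-8 (assumed mechanism).
double_encode() { printf '%s' "$1" | iconv -f ISO-8859-1 -t UTF-8; }

name='Название'                         # 8 Cyrillic characters
byte_len "$name"                        # 16: two bytes per character
byte_len "$(double_encode "$name")"     # 32: four bytes per character
```

At four bytes per character, a 64-character Cyrillic name already needs 256 bytes, one more than the usual 255-byte per-component limit.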
Sonarr wasn't crashing — it was choking on its own log volume. The Logs table writes were holding the SQLite write lock long enough that any UI request waiting on the DB just sat.
The fix
Find the bad torrent and kill it from qBittorrent's side:
# get the hash from sonarr logs
journalctl -u sonarr --no-pager | grep PathTooLong | head -n 1
# delete from qbittorrent, including the downloaded files
# ($QBT is the qBittorrent WebUI base URL)
curl -X POST "$QBT/api/v2/torrents/delete" \
  -d "hashes=14de03680df2d6847da2237e2abd522c16a2e103&deleteFiles=true"
# restart sonarr
systemctl restart sonarr
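Because the delete endpoint also accepts `hashes=all`, a pasted value is worth validating before sending it with `deleteFiles=true`. A minimal sketch of a guard follows; `is_infohash` and `qbt_delete` are hypothetical helper names, and the wrapper assumes the WebUI accepts unauthenticated requests from this host (otherwise, log in through `/api/v2/auth/login` first and pass the session cookie to curl).

```shell
# is_infohash STRING: succeeds only for a 40-character hex string
# (a v1 torrent info-hash). Rejects "all", empty values, and typos.
is_infohash() {
    printf '%s\n' "$1" | grep -qE '^[0-9a-fA-F]{40}$'
}

# qbt_delete BASE_URL HASH: hypothetical wrapper around the delete call above.
# Refuses to contact the server unless HASH looks like a single info-hash.
qbt_delete() {
    is_infohash "$2" || { echo "not an info-hash: $2" >&2; return 1; }
    curl -fsS -X POST "$1/api/v2/torrents/delete" \
        --data-urlencode "hashes=$2" \
        --data "deleteFiles=true"
}

# Usage:
# qbt_delete "$QBT" 14de03680df2d6847da2237e2abd522c16a2e103
```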
Response times dropped from "timeout" to 5–7 ms within a minute. Then I added prevention so this doesn't repeat: a Sonarr release profile with Must Not Contain rules:
- /[Ѐ-ӿ]/: any Cyrillic codepoint
- /.{200,}/: any 200+ char run
- A few release group names that were repeat offenders
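Before relying on rules like these, it can help to check a few known titles against them outside Sonarr. The sketch below approximates the first two rules in plain shell. It is not how Sonarr evaluates them, and `has_cyrillic`, `too_long`, and `would_block` are hypothetical names. The Cyrillic check works on bytes: codepoints U+0400 to U+04FF encode in UTF-8 as a lead byte 0xD0 to 0xD3 followed by one continuation byte.

```shell
# Byte pattern for one UTF-8 encoded codepoint in U+0400..U+04FF
# (octal 320-323 = 0xD0-0xD3, octal 200-277 = 0x80-0xBF).
cyrillic_pat=$(printf '[\320-\323][\200-\277]')

# has_cyrillic TITLE: succeeds if TITLE contains a codepoint in that block.
has_cyrillic() { printf '%s' "$1" | LC_ALL=C grep -q "$cyrillic_pat"; }

# too_long TITLE: mirrors /.{200,}/. Note that ${#1} counts characters in a
# UTF-8 locale but bytes in the C locale.
too_long() { [ "${#1}" -ge 200 ]; }

# would_block TITLE: succeeds if either rule would reject the title.
would_block() { has_cyrillic "$1" || too_long "$1"; }

# Usage:
# would_block "$title" && echo "rejected: $title"
```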
Sonarr falls through to the next-best release on its own once those grab-time blocks are in place.
What I'd do differently
Lesson I keep relearning: when a process is "alive but doing nothing," check the load average and look for log floods before assuming it's a memory or thread-pool problem. A tight exception loop in a log-everything app looks identical to a hang from the outside, but the fix is completely different. Cheap detection: run journalctl --since '10 minutes ago' | wc -l and compare the count to a known baseline.
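That comparison is easy to script. A minimal sketch, assuming a baseline measured on a quiet day and an arbitrary default threshold of ten times that baseline; `is_flood` is a hypothetical name and both numbers are placeholders to tune.

```shell
# is_flood CURRENT BASELINE [FACTOR]: succeeds when CURRENT is more than
# BASELINE * FACTOR. FACTOR defaults to 10 (an arbitrary starting point).
is_flood() {
    current=$1
    baseline=$2
    factor=${3:-10}
    [ "$current" -gt $((baseline * factor)) ]
}

# Usage (the baseline of 200 lines per 10 minutes is a made-up example):
# lines=$(journalctl -u sonarr --since '10 minutes ago' --no-pager | wc -l)
# is_flood "$lines" 200 && echo "log flood: $lines lines in 10 minutes"
```

Run from cron or a systemd timer, this catches the "busy but useless" state before the UI stops answering.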