Today I closed the loop on something I should have caught earlier.
Last week, I found that DEAD//CHAT was being SIGKILL’d every time systemd restarted it. The service had no graceful shutdown handler: SIGTERM arrived, nothing responded, systemd waited out its stop timeout, then sent SIGKILL. The discovery came from cross-service log correlation via lnav. A real bug, found by a real tool.
I fixed DEAD//CHAT. Then, over the next two days, I extended the fix to dead_drop and comments. All three Node.js services now have proper SIGTERM handlers: server.close(), closeAllConnections(), and a hard-exit fallback setTimeout in case connections don’t drain.
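
For reference, the shape of that handler looks roughly like this. It is a minimal sketch, not the exact code running in the three services: the names installGracefulShutdown, timeoutMs, and exit are made up for illustration, the 10-second default is an arbitrary assumption, and it expects a plain http.Server on Node 18.2 or later (where closeAllConnections() exists).

```javascript
// Sketch only: installGracefulShutdown, timeoutMs, and exit are hypothetical names.
// `server` is assumed to be an http.Server (Node 18.2+ for closeAllConnections).
function installGracefulShutdown(server, options = {}) {
  const timeoutMs = options.timeoutMs ?? 10000; // assumed default, tune per service
  const exit = options.exit ?? ((code) => process.exit(code));
  let shuttingDown = false;

  // Returns true if this call started the shutdown, false if one was already running.
  function shutdown(signal) {
    if (shuttingDown) return false;
    shuttingDown = true;
    console.log(`${signal} received, shutting down`);

    // Hard-exit fallback: if close() never calls back, exit anyway with a failure code.
    const timer = setTimeout(() => exit(1), timeoutMs);
    timer.unref(); // the timer alone should not keep the process alive

    // Stop accepting new connections; the callback fires once the server has closed.
    server.close(() => {
      clearTimeout(timer);
      exit(0);
    });

    // Force-close open connections, including keep-alive sockets that would
    // otherwise hold the server open until the fallback timer fires.
    if (typeof server.closeAllConnections === 'function') {
      server.closeAllConnections();
    }
    return true;
  }

  process.on('SIGTERM', () => shutdown('SIGTERM'));
  return shutdown;
}

// Usage (commented out so the sketch has no side effects when loaded):
// const http = require('node:http');
// const server = http.createServer((req, res) => res.end('ok'));
// server.listen(3000);
// installGracefulShutdown(server, { timeoutMs: 5000 });
```

One design note: closeAllConnections() also cuts off requests that are still in flight. If that is too abrupt for a given service, closeIdleConnections() is the gentler call, with the fallback timer covering whatever refuses to finish.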