Years ago, I had a problem keeping a long-running rsync process alive for more than ~2 hours between two ReadyNAS boxes. Peers suggested plenty of fixes, and I did a lot of my own research. It didn’t appear to be a problem with the ssh tunnel, since I could keep an ssh session open and idle for a whole day. I even upgraded the NAS boxes to DiskStations, which improved the situation a bit, as the faster CPUs could get through more of the CPU-bound work in the ~2 hours of life the process had. Still, I kept hitting the limits of my setup, and I had to prune old backup directories more often than I liked.
I recently came across a Stack Overflow post where someone casually suggested checking logrotate. I dismissed this, of course, since logrotate was working fine on my rsync logs in /var/log.
But I kept coming back to it. Why did they suggest looking at logrotate? On a whim, I tried moving the rsync logging out of /var/log. That alone fixed the problem. My rsync process now runs until it’s done! Somehow, logrotate on my DiskStation seems to have been configured to kill off rsync periodically so that it could perform its log-rotating duties. It’d be a rabbit hole to explore exactly why or how this happened, so I’m just happy that it’s working now 🙂
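For anyone who wants to see what "moving the logging out of /var/log" can look like, here is a minimal sketch in Python. It is not the exact setup from my boxes: the function name `build_rsync_command`, the `/volume1/...` paths, and the `backup-nas` host are all made up for illustration. The only rsync options it relies on are `--archive` and `--log-file`.

```python
import posixpath
import shlex


def build_rsync_command(source, dest, log_file):
    """Build an rsync argument list that writes its log outside /var/log.

    Hypothetical helper for illustration; adjust the options to your backup job.
    """
    if not posixpath.isabs(log_file):
        raise ValueError("log_file must be an absolute path")

    # Collapse "." and ".." segments so "/var/log/../log/x" is caught too.
    # Note: this does not resolve symlinks.
    log_path = posixpath.normpath(log_file)

    if log_path == "/var/log" or log_path.startswith("/var/log/"):
        raise ValueError("log_file is under /var/log: " + log_path)

    return ["rsync", "--archive", "--log-file=" + log_path, source, dest]


if __name__ == "__main__":
    # Example paths and host are assumptions, not taken from a real setup.
    cmd = build_rsync_command(
        "/volume1/data/",
        "backup-nas:/volume1/backup/",
        "/volume1/rsync-logs/backup.log",
    )
    # Print the command so it can be reviewed before running it.
    print(shlex.join(cmd))
```

The guard against /var/log is only there to keep the log from drifting back into the directory logrotate was watching in my case. Whether your logrotate config touches other directories is something you would need to check yourself.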
If you’re having trouble with long-running rsync processes, definitely try moving rsync’s log file out of /var/log.