I have recently rendered four animation sequences that were all about 240 frames in length (10 seconds) at 720p resolution with 500 samples per frame, always on two eGPUs running together at 100% utilization. My scenes were reliable during the animation creation process and the test rendering phase, with no shader errors. Every frame renders just fine individually. There is no obvious reason for the render to fail at multiples of roughly 124 frames.
I ran three of the animations via the macOS command line using blender -b path/to/file.blend -a, having already saved the Blender file with the right output directory, overwrite setting and eGPU selection.
The fourth animation I rendered via the User Interface Animate button and watched the render on screen.
In all instances the render output gets corrupted at frame 124 (give or take one or two frames). One of the animations was 290 frames long, so it corrupted at F124 and then again at F250 once it was restarted.
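One way to tell whether the failure follows the frame number or the number of frames rendered in a single Blender session would be to split the render into shorter runs, each in a fresh Blender process. The following is a minimal sketch and has not been tested against this bug: the helper name chunked_render_commands and the chunk size of 100 are made up for illustration. It relies only on Blender's -b, -s, -e and -a command line options, with -s and -e placed before -a.

```python
import shlex


def chunked_render_commands(blend_path, first, last, chunk, blender="blender"):
    """Build one Blender command line per chunk of frames (hypothetical helper)."""
    if chunk < 1:
        raise ValueError("chunk must be at least 1")
    commands = []
    start = first
    while start <= last:
        end = min(start + chunk - 1, last)
        # -b: run in background, -s/-e: start/end frame, -a: render animation.
        # -s and -e must appear before -a to take effect.
        commands.append(
            [blender, "-b", blend_path, "-s", str(start), "-e", str(end), "-a"]
        )
        start = end + 1
    return commands


if __name__ == "__main__":
    # Assumed values: a 290-frame animation rendered 100 frames per process.
    for cmd in chunked_render_commands("path/to/file.blend", 1, 290, 100):
        print(shlex.join(cmd))
```

If every chunk renders cleanly, that would suggest the problem builds up within one session rather than being tied to particular frames.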
The corruption appears as a multicoloured image or as a totally transparent PNG output file. The render continues to its conclusion, with a much longer time between each frame at the initialization stage, but no further frames are rendered correctly. There is no error given.
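Since no error is reported, the bad frames have to be found by looking at the output. A rough way to spot the transparent ones automatically is to compare file sizes, on the assumption that a fully transparent PNG compresses to a much smaller file than a normal frame. This is only a heuristic sketch: the function name suspicious_frames and the 0.25 threshold are arbitrary choices, and it will not catch the multicoloured frames, which may well be as large as good ones.

```python
import os
import statistics


def suspicious_frames(paths, ratio=0.25):
    """Return paths whose file size is below ratio * median size (heuristic)."""
    sizes = {p: os.path.getsize(p) for p in paths}
    if not sizes:
        return []
    median = statistics.median(sizes.values())
    # Assumption: an empty (fully transparent) frame is far smaller than usual.
    return sorted(p for p, size in sizes.items() if size < ratio * median)
```

It could be called with the list of rendered PNG files, for example from glob.glob on the output directory, and any frame it reports would still need to be checked by eye.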
The command line as it went wrong. Notice the gap in the GPU history. The tone of the eGPU changes too when frame 124/125 happens.
It is simple to kill the command line version of Blender but the UI version of Blender becomes extremely difficult to Force Quit at frame 124.
There doesn't seem to be any reason why this should happen but it happens on every animation that is longer than 124 frames!!
Hope the screenshots help.
I didn't want to test a "weedy" test project. But my little animations are not really very complex. They all have DOF and standard lighting and, as I said, do not crash Blender when rendering any single frame from the animation. The problem only appears when batch rendering the entire animation.
The resulting frame.
Then it happens again at F250.
A different scene. Rendered via Blender UI, not command line: F123
Blender is unresponsive and it is difficult to force quit.
Some sample "corrupted" frames.
I've added a test file I'm working on, but at really low quality settings (50 samples per frame) it does not cause the bug.
I haven't had a chance to test whether setting it to 500+ samples per frame will trigger the bug, but if you've got a test rig available it'd be worth cranking up the settings or adding a PBR texture to the cube to see if that'll break it.
As I figured in my original post, it needs a "beefy" animation that properly taxes the GPUs before it'll happen.
I will post my Soft Ball animation that definitely caused the bug as well now.