[bitbake-devel] [PATCH] process: Improve exit handling and hangs

Jason Wessel jason.wessel at windriver.com
Tue Aug 27 20:11:43 UTC 2013


On 08/24/2013 07:07 AM, Richard Purdie wrote:
> It turns out there are a number of different ways the process server termination
> can hang. If we call cancel_join_thread() on the event queue, the queue can be left
> containing partial data. Reading the event queue in the terminate() function can
> then hang; the timeout and block parameters to Queue.get() don't make any
> difference.
>
> Equally, if we don't call cancel_join_thread(), the join_thread in terminate()
> will hang giving a different deadlock.
>
> The best solution I could find is to loop over the process's is_alive() after
> requesting that it stop, trying to join the thread and, if that fails, flushing the
> event queue again.
>
> It wasn't clear what difference a force option should make in this case. We're
> gracefully trying to empty queues and shut down regardless of whether it's a SIGTERM,
> so I've simply removed the force option.
>
> Signed-off-by: Richard Purdie <richard.purdie at linuxfoundation.org>
> ---
>
> Jason: Not sure if this or the other patch will help the hang you are
> seeing or not but they seem like good changes regardless and fix real
> world issues.


The behavior is certainly a bit better with your additional patches, but they do not address the root of the problem.  I have tested your patches under the heavy-load conditions in which we have observed all of the hangs.

I'll send a patch separately along with an explanation of the root cause of the hangs in the PR Server.

As for your patches, I have reviewed and tested them:  Acked-by: Jason Wessel <jason.wessel at windriver.com>

Cheers,
Jason.



