[OE-core] Long delays with latest bitbake (was: [PATCH 1/7] insane.bbclass: in file-rdeps do not look into RDEPENDS recursively)

Peter Kjellerstedt peter.kjellerstedt at axis.com
Wed Aug 14 14:57:28 UTC 2019


> -----Original Message-----
> From: richard.purdie at linuxfoundation.org
> <richard.purdie at linuxfoundation.org>
> Sent: den 14 augusti 2019 14:56
> To: Alexander Kanavin <alex.kanavin at gmail.com>
> Cc: Peter Kjellerstedt <peter.kjellerstedt at axis.com>; Khem Raj
> <raj.khem at gmail.com>; OE-core <openembedded-
> core at lists.openembedded.org>
> Subject: Re: [OE-core] Long delays with latest bitbake (was: [PATCH
> 1/7] insane.bbclass: in file-rdeps do not look into RDEPENDS
> recursively)
> 
> On Wed, 2019-08-14 at 14:08 +0200, Alexander Kanavin wrote:
> > On Wed, 14 Aug 2019 at 13:36, <richard.purdie at linuxfoundation.org>
> > wrote:
> > > On Wed, 2019-08-14 at 13:25 +0200, Alexander Kanavin wrote:
> > > > On Tue, 13 Aug 2019 at 21:18, Richard Purdie <
> > > > richard.purdie at linuxfoundation.org> wrote:
> > > > > > I had a glance at the profile output from master-next and the
> > > > > > problem wasn't where I thought it would be, it was in the
> > > > > > scheduler code. That is good as those classes are effectively
> > > > > > independent of the other changes and hence are a separate fix.
> > > > > >
> > > > > > I've put a patch in -next which takes the above test to 36s
> > > > > > which is close to the older bitbake.
> > > > > >
> > > > > > Could be interesting to see how it looks for others and
> > > > > > different workloads.
> > > >
> > > > > I just tried the same test I did yesterday with
> > > > > ab56d466452148e5fce330d279d13e2495eceb1f. Unfortunately it
> > > > > doesn't seem to improve things much: bitbake is stuck at
> > > > > "NOTE: Executing Tasks" for 15 minutes now.
> > >
> > > This might sound slightly crazy but can you try commenting out this
> > > line in runqueue.py:
> > >
> > > logger.debug(2, "Holding off tasks %s" %
> > > pprint.pformat(self.holdoff_tasks))
> > >
> > > ?
> >
> > Even crazier is the outcome: it helped!
> 
> Cool, I think I can explain it.
> 
> The holdoff_tasks list can contain a list of nearly all the tasks at
> some points in execution. Even though the debug messages aren't being
> printed on the console, they are being sent over the internal IPC bus
> between the cooker, UI and other event handlers. Obviously for small
> task lists it's not a problem, for large ones it's multiple 4k chunks
> over pipes which isn't going to be fast.
> 
> We have done a lot of optimisation in the past but it's all too easy to
> tread on something like this and upset things :/.
> 
> > The whole thing completed after 15m49seconds (with much of the time
> > going to the empty task spin), that's some 3 minutes slower, but
> > certainly it's usable again.
> 
> You followed up mentioning this wasn't with master-next. I think there
> is a patch in -next which will help with the empty task spin so both
> together might get us back to more normal numbers.
> 
> Cheers,
> 
> Richard
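
To get a feel for the sizes Richard describes, here is a rough sketch that
formats a large task collection the same way the debug line does and counts
how many 4k chunks the resulting message would need. The task identifiers
are made up for illustration (real bitbake task ids look different), and
the function name and the 4096-byte chunk size are assumptions, not taken
from runqueue.py:

```python
import pprint

CHUNK = 4096  # assumed size of the "4k chunks" mentioned above


def debug_payload_chunks(num_tasks, chunk=CHUNK):
    """Return (bytes, chunks) for a 'Holding off tasks' style message."""
    # Hypothetical task identifiers, only here to give the set a realistic size.
    holdoff_tasks = {"/path/to/recipe_%05d.bb:do_task" % i for i in range(num_tasks)}
    # Same formatting pattern as the debug line: the string is built eagerly,
    # whether or not anything ends up on the console.
    message = "Holding off tasks %s" % pprint.pformat(holdoff_tasks)
    size = len(message.encode("utf-8"))
    chunks = -(-size // chunk)  # ceiling division
    return size, chunks


if __name__ == "__main__":
    for n in (10, 12540):
        size, chunks = debug_payload_chunks(n)
        print("%6d tasks -> %8d bytes, %4d chunk(s)" % (n, size, chunks))
```

With a handful of tasks the message fits in a single chunk; with a task
count in the range of my build it needs many, and that cost is paid every
time the line is hit.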

I can just confirm that removing the debug line removes almost all of 
the delays I was seeing. Here are some time statistics from my builds:

6c7c0cef: 02:50
7df31ff3: 06:13  (first attempt)
a0542ed3: 06:32  (master)
19a88c68: 43:19* (~yesterday's master-next)
b0a0e4a6: 06:55  (master-next)
no debug: 03:04  (master-next + removal of "Holding off tasks" debug)

* I aborted this build after about half of the 12540 tasks were done...
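
For comparison, the slowdown of each completed build relative to 6c7c0cef
can be worked out with a small sketch like the one below. The helper names
are mine, and the aborted 19a88c68 build is left out since it never
finished. The code treats the two fields as minutes and seconds, which is
an assumption, but the ratios come out the same for any base-60 pair:

```python
def to_seconds(stamp):
    """Convert an 'AA:BB' time stamp to a single number (AA * 60 + BB)."""
    major, minor = stamp.split(":")
    return int(major) * 60 + int(minor)


# Timings of the builds that ran to completion, copied from the list above.
timings = {
    "6c7c0cef": "02:50",
    "7df31ff3": "06:13",
    "a0542ed3": "06:32",
    "b0a0e4a6": "06:55",
    "no debug": "03:04",
}


def slowdowns(timings, baseline_key):
    """Return each build's time divided by the baseline build's time."""
    baseline = to_seconds(timings[baseline_key])
    return {name: to_seconds(t) / baseline for name, t in timings.items()}


if __name__ == "__main__":
    for name, ratio in slowdowns(timings, "6c7c0cef").items():
        print("%-9s %.2fx" % (name, ratio))
```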

One thing I noticed while doing the timings is that I usually cannot
shut down the bitbake server with bitbake -m after the builds have
completed, and have to resort to kill -9 most of the time...

//Peter


