[OE-core] [PATCH 1/1] classes/sanity: fix handling of bblayers.conf updating

Martin Jansa martin.jansa at gmail.com
Tue Apr 16 21:41:08 UTC 2013


On Tue, Apr 16, 2013 at 10:06:06PM +0100, Richard Purdie wrote:
> On Tue, 2013-04-16 at 19:09 +0200, Martin Jansa wrote:
> > OK, so it's different than hangs I'm seeing here, because I'm still seeing them with latest oe-core
> 
> Yes, that was almost certainly a different issue.
> 
> > $ . ./oe-init-bitbake-build-env && bitbake my-image
> > Altered environment for machine at distro development (this is from oe-init-bitbake-build-env)
> > # and nothing else is shown after that
> 
> So just to ensure I understand this, it never prints any messages at
> all?

Yes, nothing at all from bitbake, only that echo from the setup script
(oe-init-bitbake-build-env).

> > ps auxf (shortened paths and ascii tree)
> > 
> > bitbake  17625  0.0  0.0   4396   612 pts/18   S+   18:31   0:00  \_ /bin/sh -c . ./oe-init-bitbake-build-env && bitbake  my-image
> > bitbake  17626  0.0  0.0   4396   700 pts/18   S+   18:31   0:00     \_ /bin/sh /OE/build/oe-core/scripts/bitbake my-image
> > bitbake  17656  0.2  1.0 219356 168120 pts/18  S+   18:31   0:02        \_ python /OE/build/bitbake/bin/bitbake my-image
> > bitbake  18504  0.0  1.0 289044 166556 pts/18  Sl+  18:31   0:00           \_ python /OE/build/bitbake/bin/bitbake my-image
> > 
> > Not sure if there is better way to see what's going on in those python processes, here is strace:
> > 
> > # strace -p 17656
> > Process 17656 attached
> > wait4(18504, 
> > 
> > # strace -p 18504
> > Process 18504 attached
> > recvfrom(12, 
> > 
> > Hitting Ctrl+C with strace still running:
> > # strace -p 17656
> > Process 17656 attached
> > wait4(18504, 0x7fffd2abdf80, 0, NULL)   = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
> > --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL, si_value={int=12337, ptr=0x3031}} ---
> > wait4(18504,
> > 
> > Process 18504 attached
> > recvfrom(12, 0xb773684, 8192, 0, 0, 0)  = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
> > --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL, si_value={int=12337, ptr=0x3031}} ---
> > rt_sigreturn()                          = -1 EINTR (Interrupted system call)
> > futex(0xb16e5a0, FUTEX_WAIT_PRIVATE, 0, NULL
> > 
> > More Ctrl+C only repeats it in both processes but bitbake does not quit
> > 
> > One more python process is created and disowned by parent and becomes child of init and dies later
> > bitbake  18524  0.0  1.0 289044 165128 ?       S    18:31   0:00 python /OE/build/bitbake/bin/bitbake my-image
> > but bitbake is still hanging until I send kill 18504 from another terminal
> > 
> > Process 18504:
> > futex(0xb16e5a0, FUTEX_WAIT_PRIVATE, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
> > --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL, si_value={int=12337, ptr=0x3031}} ---
> > rt_sigreturn()                          = -1 EINTR (Interrupted system call)
> > futex(0xb16e5a0, FUTEX_WAIT_PRIVATE, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
> > --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=23837, si_uid=1026} ---
> > +++ killed by SIGTERM +++
> > 
> > Process 17656: (iirc you said that python-logging module is not safe with multiple processes and it looks like logging something)
> > # strace -p 17656
> > Process 17656 attached
> > wait4(18504, 0x7fffd2abdf80, 0, NULL)   = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
> > --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL, si_value={int=12337, ptr=0x3031}} ---
> > wait4(18504, 0x7fffd2abdf80, 0, NULL)   = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
> > --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL, si_value={int=12337, ptr=0x3031}} ---
> > 
> > read(4, "\200\2clogging\nLogRecord\nq\1)\201q\2}q\3(U"..., 554) = 554
> > select(5, [4], NULL, NULL, {0, 0})      = 1 (in [4], left {0, 0})
> > read(4, "\0\0\2\34", 4)                 = 4
> 
> These look like pickled python LogRecord objects being passed between
> the processes. If this happens again, pass the -s 8912 option to strace
> (increases the size of the string dumped) and let's see if we can get the
> complete log record. We should be able to unpickle them and get some
> idea what log messages it was passing around at least...

OK will do
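
For when I have a complete dump, here is a minimal sketch of how the
string printed by strace could be turned back into a LogRecord. It
assumes Python 3 on the analysis side, that the whole pickle was
captured in one string (not cut off with "..."), and that strace used
its default C-style octal escapes. The helper names
strace_string_to_bytes, disassemble and decode_log_record are made up
for illustration; they are not part of bitbake.

```python
import io
import logging
import pickle
import pickletools


def strace_string_to_bytes(text):
    """Convert a C-style escaped string, as printed by strace, to raw bytes."""
    # strace output is plain ASCII; unicode_escape expands \n, \t, \\ and
    # 1-3 digit octal escapes, and latin-1 maps code points 0-255 to bytes.
    return text.encode("ascii").decode("unicode_escape").encode("latin-1")


def disassemble(raw):
    """Return the pickle opcodes as text without executing anything."""
    out = io.StringIO()
    pickletools.dis(raw, out=out)
    return out.getvalue()


def decode_log_record(text):
    """Unpickle a captured record; only use on data from your own build."""
    raw = strace_string_to_bytes(text)
    # latin-1 lets Python 3 read 8-bit strings written by a Python 2 pickler.
    return pickle.loads(raw, encoding="latin-1")


if __name__ == "__main__":
    # Stand-in for a captured dump: pickle a record and escape every byte.
    record = logging.LogRecord("BitBake", logging.INFO, "example.py", 1,
                               "hello %s", ("world",), None)
    escaped = "".join("\\%03o" % b for b in pickle.dumps(record, protocol=2))
    print(disassemble(strace_string_to_bytes(escaped)))
    print(decode_log_record(escaped).getMessage())
```

With a real capture, the quoted string from the read() line would be
passed to decode_log_record() in place of the stand-in, and
getMessage() on the result should give the log text that was being
passed around.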

> > I'll put this in bug report later, but if you see what other information could be useful,
> > I'll add it next time I see bitbake hanging like this (it doesn't happen every time, probably more
> > with machine under higher load).
> 
> Ideally we need to turn this into a 100% reproducer somehow. Failing
> that, perhaps more of the LogRecord can give us some clue what is going
> on. I'm struggling for other ideas...

I would say it doesn't depend on metadata, because the build works fine
if I kill bitbake and start it again. I've also seen it in very different
layer setups (including different DISTROs and IIRC also with distro-less
oe-core only).

Thinking about it, it may happen more often when tmp-eglibc is empty.
I'll try a couple of builds with tmp-eglibc removed to see if I can
reproduce it faster than every 10th build or so.

-- 
Martin 'JaMa' Jansa     jabber: Martin.Jansa at gmail.com