[OE-core] SetScene tasks hang forever?

Richard Purdie richard.purdie at linuxfoundation.org
Wed May 2 23:06:34 UTC 2012


On Wed, 2012-05-02 at 14:48 -0500, Mark Hatle wrote:
> On 5/2/12 2:45 PM, Rich Pixley wrote:
> > On 5/2/12 12:40 , Mark Hatle wrote:
> >> On 5/2/12 2:16 PM, Rich Pixley wrote:
> >>> On 5/2/12 11:40 , Mark Hatle wrote:
> >>>> On 5/2/12 1:21 PM, Rich Pixley wrote:
> >>>>> I'm seeing a lot of builds apparently hanging forever, (the ones that
> >>>>> work seem to work within seconds - the ones that hang seem to hang for
> >>>>> at least 10's of minutes), with:
> >>>>>
> >>>>> rich at dolphin>      nice tail -f Log
> >>>>> MACHINE           = "qemux86"
> >>>>> DISTRO            = ""
> >>>>> DISTRO_VERSION    = "oe-core.0"
> >>>>> TUNE_FEATURES     = "m32 i586"
> >>>>> TARGET_FPU        = ""
> >>>>> meta              = "master:35b5fb2dd2131d4c7dc6635c14c6e08ea6926457"
> >>>>>
> >>>>> NOTE: Resolving any missing task queue dependencies
> >>>>> NOTE: Preparing runqueue
> >>>>> NOTE: Executing SetScene Tasks
> >>>>>
> >>>>> If I run top, I see one processor pinned at 98 - 99% utilization running
> >>>>> python, but no other clues.
> >>>>>
> >>>>> Can anyone point me to doc, explain what's going on here, or point me in
> >>>>> the right direction to debug this?
> >>>> The only time I've seen "hang-like" behavior the system actually opened a
> >>>> devshell and was awaiting input.   But based on your log, it doesn't look like
> >>>> that is the case.
> >>>>
> >>>> Run bitbake with -DDD option, you will get considerably more debug information
> >>>> and it might help point out what it thinks it is doing.
> >>> NOTE: Executing SetScene Tasks
> >>> DEBUG: Stamp for underlying task
> >>> 12(virtual:native:/home/rich/projects/webos/openembedded-core/meta/recipes-devtools/opkg/opkg_svn.bb,
> >>> do_populate_sysroot) is current, so skipping setscene variant
> >>> DEBUG: Stamp for underlying task
> >>> 16(virtual:native:/home/rich/projects/webos/openembedded-core/meta/recipes-devtools/opkg-utils/opkg-utils_git.bb,
> >>> do_populate_sysroot) is current, so skipping setscene variant
> >>> DEBUG: Stamp for underlying task
> >>> 20(virtual:native:/home/rich/projects/webos/openembedded-core/meta/recipes-devtools/makedevs/makedevs_1.0.0.bb,
> >>> do_populate_sysroot) is current, so skipping setscene variant
> >>> DEBUG: Stamp for underlying task
> >>> 24(/home/rich/projects/webos/openembedded-core/meta/recipes-core/eglibc/ldconfig-native_2.12.1.bb,
> >>> do_populate_sysroot) is current, so skipping setscene variant
> >>> DEBUG: Stamp for underlying task
> >>> 32(virtual:native:/home/rich/projects/webos/openembedded-core/meta/recipes-devtools/genext2fs/genext2fs_1.4.1.bb,
> >>> do_populate_sysroot) is current, so skipping setscene variant
> >>> DEBUG: Stamp for underlying task
> >>> 36(virtual:native:/home/rich/projects/webos/openembedded-core/meta/recipes-devtools/e2fsprogs/e2fsprogs_1.42.1.bb,
> >>> do_populate_sysroot) is current, so skipping setscene variant
> >>> DEBUG: Stamp for underlying task
> >>> 40(virtual:native:/home/rich/projects/webos/openembedded-core/meta/recipes-devtools/qemu/qemu_0.15.1.bb,
> >>> do_populate_sysroot) is current, so skipping setscene variant
> >>> DEBUG: Stamp for underlying task
> >>> 44(/home/rich/projects/webos/openembedded-core/meta/recipes-devtools/qemu/qemu-helper-native_1.0.bb,
> >>> do_populate_sysroot) is current, so skipping setscene variant
> >>>
> >>> And then the spinning hang.
> >> Sorry, I don't know how to continue debugging what might be wrong.  The only
> >> other thing I can suggest is check that your filesystem is "real", not a
> >> netapp/nfs/network emulated filesystem....
> >>
> >> And if you were continuing a previous build, start a new build directory and
> >> retry it.
> > Local file system.  I'm building a second time expecting a null build
> > pass.  I was able to get a null build pass in the same directory yesterday.
> >
> > Removing my build directory and starting over has been working, but
> > costs me a few hours each time, and this happens frequently enough that
> > I get no other work done.  :(.
> 
> Yes, that is certainly not acceptable.  If you could file a bug at
> bugzilla.yoctoproject.org, someone might be able to help you diagnose this
> further and hopefully figure out a fix.

What would really help is a way to reproduce this...

Does it reproduce with a certain set of metadata/sstate perhaps?

What is odd about the above logs is that bitbake never appears to
execute any task. It's possible something crashed somewhere, I guess,
and the rest of the system did not realise part of it had died. Or it
could be some kind of circular dependency loop, where X needs Y to build
and Y needs X, so nothing happens. We are supposed to spot that case and
raise an error if it happens.
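
To make the "X needs Y and Y needs X" case concrete, here is a minimal
sketch of how a cycle in a task dependency graph can be found with a
depth-first search. This is not bitbake's actual runqueue code; the
function name find_cycle and the dict-of-lists graph format are
assumptions made purely for illustration.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of task names, or None.

    deps maps each task name to the list of tasks it depends on.
    (Hypothetical data format, not bitbake's internal representation.)
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    state = {}
    stack = []                     # tasks on the current DFS path, in order

    def visit(node):
        state[node] = GREY
        stack.append(node)
        for dep in deps.get(node, ()):
            s = state.get(dep, WHITE)
            if s == GREY:
                # dep is already on the current path, so we found a loop.
                return stack[stack.index(dep):] + [dep]
            if s == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        state[node] = BLACK
        return None

    for node in list(deps):
        if state.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


if __name__ == "__main__":
    print(find_cycle({"X": ["Y"], "Y": ["X"]}))
```

A graph with no loop returns None, so a caller could raise an error
only when a list comes back.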

Does strace give an idea of which bits of bitbake are alive/looping? I'd
probably resort to a few print()/bb.error() calls in the code at this
point to find out what is alive, what is dead and where it's looping...
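
As a rough sketch of that kind of instrumentation, the helper below
reports when a loop keeps iterating without its state changing. The
class LoopProbe and all of its parameters are hypothetical and not part
of bitbake; inside bitbake one could presumably pass bb.error as the
report callback instead of print.

```python
import time


class LoopProbe:
    """Hypothetical helper: report when a loop spins without progress."""

    def __init__(self, name, report=print, interval=5.0, clock=time.monotonic):
        self.name = name
        self.report = report        # callable taking one message string
        self.interval = interval    # minimum seconds between reports
        self.clock = clock
        self.last_state = None
        self.spins = 0
        self.last_report = clock()

    def tick(self, state):
        """Call once per loop iteration with a cheap summary of loop state."""
        now = self.clock()
        if state != self.last_state:
            # Something changed, so the loop is making progress: reset.
            self.last_state = state
            self.spins = 0
            self.last_report = now
            return
        self.spins += 1
        if now - self.last_report >= self.interval:
            self.report("%s: %d iterations with unchanged state %r"
                        % (self.name, self.spins, state))
            self.last_report = now
```

Calling probe.tick(...) at the top of a suspect loop, with something
like the counts of pending and running tasks as the state, should show
fairly quickly which loop is the one that is spinning.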

Cheers,

Richard




