[OE-core] Re-execution of tasks - test report and results

Richard Purdie richard.purdie at linuxfoundation.org
Sat Mar 31 14:07:01 UTC 2012


As some people have noticed, there are some rebuild issues happening due
to sstate and the use of hashes in the stamp files. By this I mean the
case where, due to some checksum change, a task gets rerun even though
it was not written to be run a second time.

In other words, not all tasks are idempotent (thanks Koen!) but they
should be.

For the purposes of finding these tasks, we have the open bug:

https://bugzilla.yoctoproject.org/show_bug.cgi?id=2123

and there are two proposed scripts there. One is a simple forced
re-execution of each task in turn. This catches some issues but not
others. I therefore wrote a second script which, after forcing a task,
re-executes the current target to completion. It is much slower than
the first but finds different errors and better pinpoints some others.
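
As a rough sketch of the difference between the two strategies (this is
not the code of the scripts attached to the bug; the function names and
the task list passed in are hypothetical), the bitbake command sequences
could be generated like this:

```python
def forced_only(target, tasks):
    """Strategy 1: force each task in turn and do nothing else."""
    return [["bitbake", target, "-c", task, "-f"] for task in tasks]


def forced_then_rebuild(target, tasks):
    """Strategy 2: after forcing a task, rebuild the target to completion."""
    commands = []
    for task in tasks:
        commands.append(["bitbake", target, "-c", task, "-f"])
        commands.append(["bitbake", target])
    return commands


if __name__ == "__main__":
    # Hypothetical task names, purely for illustration.
    for argv in forced_then_rebuild("perl", ["patch", "compile"]):
        print(" ".join(argv))
```

Each inner list is an argv that could be handed to subprocess.run(); the
second strategy issues twice as many bitbake invocations, and each
rebuild may rerun further tasks, which is why it is so much slower.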

I've had my build machine iterating with the second script for a while
and it's tested about 5,000 task re-executions and identified a number of
failures. It's not complete yet but it's mostly there and I'm going to put
the failures in two groups:

a) Don't build at all:

alsa-tools.compile - need to check into/fix (local issue, works on AB?)
insserv-native.compile - need to check into/fix or delete
libx11-diet.compile - need to check into/fix
external-python-tarball - need to check into/fix
external-poky-toolchain - time to delete this recipe?
package-index - rpm package feed generation dependencies missing, has open bug
gobject-introspection - Known issue, not cross compile capable

Most of these are things excluded from world which have "fallen through
the cracks" or are known issues.

b) Failures in specific task re-execution:

boost.boostconfig
boost.patch
docbook-utils-native.unpack
dropbear.debug_patch
eglibc-initial-nativesdk.patch
eglibc-initial.patch
eglibc-nativesdk.patch
eglibc.patch
gcc.configure
gcc-cross-canadian-i586.configure
gcc-cross-canadian-i586.headerfix
gcc-cross-canadian-i586.patch
gcc-cross-canadian-i586.unpack
gcc.headerfix
gcc.patch
gcc.unpack
man-pages.unpack
nasm-native.patch
nasm-native.patch_fixaclocal
nasm.patch
nasm.patch_fixaclocal
net-tools.patch
net-tools.unpack
perl.patch
python-native.patch
python-nativesdk.patch
python.patch
qt-x11-free.configure
qt-x11-free.generate_qt_config_file
qt-x11-free.patch
sgml-common-native.compile
unfs-server-native.configure
unfs-server-nativesdk.configure
wget.patch

To reproduce, just run "bitbake xxx -c cleansstate; bitbake xxx; bitbake
xxx -c yyy -f; bitbake xxx", where xxx is the recipe and yyy is the task.
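
For convenience, here is a minimal sketch that turns one "recipe.task"
entry from the list above into that reproduction sequence. The helper
names are made up for illustration, and it assumes the task name is
whatever follows the last dot in the entry:

```python
import shlex


def split_entry(entry):
    """Split 'recipe.task' on the last dot into (recipe, task)."""
    recipe, _, task = entry.rpartition(".")
    if not recipe or not task:
        raise ValueError("expected 'recipe.task', got %r" % entry)
    return recipe, task


def reproduce_commands(entry):
    """Return the four bitbake invocations as argv lists."""
    recipe, task = split_entry(entry)
    return [
        ["bitbake", recipe, "-c", "cleansstate"],  # start from a clean state
        ["bitbake", recipe],                       # normal build
        ["bitbake", recipe, "-c", task, "-f"],     # force the suspect task
        ["bitbake", recipe],                       # rebuild to completion
    ]


def as_shell(entry):
    """Render the sequence as a single shell command line."""
    return "; ".join(shlex.join(argv) for argv in reproduce_commands(entry))


if __name__ == "__main__":
    print(as_shell("gcc.patch"))
```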

I've weeded out the false positives which were things like errors about
multiple providers changing bitbake's exit code. I also found building
libiconv totally destroyed the sysroot and caused iconv.h failures so I
blacklisted it.

Is there anything these tests won't find?

Sadly, yes :(

If you do something like "bitbake -c compile perl -f; bitbake git -c
compile -f", it breaks since there is a dependency there with the
timestamps that causes problems. Neither script above would conclusively
detect this, though you might get lucky with the first one.

Secondly, in these tests we didn't check "does the output change?" since
we have no good tool to do this yet.
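
To illustrate what such a check might look like (purely a sketch, not an
existing tool; the function names are hypothetical, and it assumes that
comparing per-file SHA-256 checksums of an output directory before and
after the re-execution is good enough), something along these lines
could be a starting point:

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path relative to root onto its SHA-256 digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changed_files(before, after):
    """Names that were added, removed, or whose content differs."""
    names = before.keys() | after.keys()
    return sorted(n for n in names if before.get(n) != after.get(n))
```

The idea would be to call tree_digests() on the output directory before
forcing a task and again after the rebuild, then inspect
changed_files(). Embedded timestamps and similar noise would show up as
differences, so this is only a first approximation.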

I'd propose we at least get the issues identified above fixed. People
can then report any other issues they run into and we fix them as we
find them...

We also need to go through Jiajun's list in the bugzilla carefully
since I think there are some different issues being exposed there. Some
are duplicates of the above, some are not.

Cheers,

Richard





