[OE-core] Build failure with parallel build and opkg

Stefan Agner stefan at agner.ch
Tue Sep 11 22:49:53 UTC 2018


Hi,

Every now and then we see build errors like the following:

...
ERROR: full-container-image-0.1-r0 do_populate_sdk: Unable to install
packages. Command
'/workdir/oe/tmp/work/colibri_imx7-lmp-linux-gnueabi/full-container-image/0.1-r0/recipe-sysroot-native/usr/bin/opkg
--volatile-cache -f
/workdir/oe/tmp/work/colibri_imx7-lmp-linux-gnueabi/full-container-image/0.1-r0/opkg.conf
-t
/workdir/oe/tmp/work/colibri_imx7-lmp-linux-gnueabi/full-container-image/0.1-r0/temp/ipktemp/
-o
/workdir/oe/tmp/work/colibri_imx7-lmp-linux-gnueabi/full-container-image/0.1-r0/sdk/image/usr/local/tordy-x86_64/sysroots/armv7at2hf-neon-lmp-linux-gnueabi
 --force_postinstall --prefer-arch-to-version   install 96boards-tools
aktualizr aktualizr-host-tools aktualizr-runtime-prov base-passwd
coreutils cpufrequtils docker gptfdisk haveged hostapd htop iptables
kernel-modules ldd less lmp-device-register networkmanager
networkmanager-nmtui openssh-sftp-server os-release ostree
packagegroup-base-extended packagegroup-core-boot
packagegroup-core-full-cmdline-extended
packagegroup-core-full-cmdline-multiuser
packagegroup-core-full-cmdline-utils packagegroup-core-ssh-openssh
packagegroup-core-standalone-sdk-target pciutils python3-compression
python3-distutils python3-docker python3-docker-compose python3-json
python3-netclient python3-pkgutil python3-shell python3-unixadmin rsync
run-postinsts shadow sshfs-fuse strace sudo target-sdk-provides-dummy
tcpdump vim-tiny' returned 255:
...
Downloading
file:/workdir/oe/tmp/deploy/ipk/armv7at2hf-neon/nss_3.38-r0_armv7at2hf-neon.ipk.
Removing corrupt package file
/workdir/oe/tmp/work/colibri_imx7-lmp-linux-gnueabi/full-container-image/0.1-r0/sdk/image/usr/local/tordy-x86_64/sysroots/armv7at2hf-neon-lmp-linux-gnueabi//var/cache/opkg/volatile/8e392ecd3611e24a6a49a8b22ad6e1ff_nss_3.38-r0_armv7at2hf-neon.ipk.
...
Installing pam-plugin-faildelay (1.3.0) on root
Downloading
file:/workdir/oe/tmp/deploy/ipk/armv7at2hf-neon/pam-plugin-faildelay_1.3.0-r5_armv7at2hf-neon.ipk.
Removing corrupt package file
/workdir/oe/tmp/work/colibri_imx7-lmp-linux-gnueabi/full-container-image/0.1-r0/sdk/image/usr/local/tordy-x86_64/sysroots/armv7at2hf-neon-lmp-linux-gnueabi//var/cache/opkg/volatile/0df6a8bc594a581f6ca3bcfa55e860e2_pam-plugin-faildelay_1.3.0-r5_armv7at2hf-neon.ipk.
...
Collected errors:
 * opkg_install_pkg: Failed to download nss. Perhaps you need to run
'opkg update'?
 * opkg_install_pkg: Failed to download pam-plugin-faildelay. Perhaps
you need to run 'opkg update'?
.
...

We build our own OpenEmbedded-Core based distribution, currently
tracking a recent master. However, we have seen this failure on and off
ever since rocko.

We build the image using Jenkins with multiple builders running in
parallel and sharing sstate. I think the fact that we build similar images
in parallel is the culprit: looking closer at the failed build directory
reveals that tmp-glibc/deploy/ipk/armv7at2hf-neon/Packages lists a
different MD5Sum than that of the actual package file. We start two
builders simultaneously, each building an image, and it seems that they
build the same package at around the same time. I assume that the two
builders somehow race between the moment the package gets assembled and
the moment the package index gets built...
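
For anyone who wants to check for the same symptom in their own deploy
directory, here is a minimal Python sketch of the comparison described
above. It is not part of OE-core or opkg; the function names
(parse_index, md5_of, find_mismatches) are made up for illustration. It
assumes the Packages index consists of blank-line separated stanzas and
that each stanza carries "Filename:" and "MD5Sum:" fields, with Filename
relative to the directory holding the index. Stanzas without an MD5Sum
field are skipped.

```python
import hashlib
import os


def parse_index(text):
    """Split a Packages index into a list of {field: value} stanzas."""
    stanzas = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current stanza.
            if current:
                stanzas.append(current)
                current = {}
            continue
        if line[0] in " \t":
            # Continuation of a multi-line field (e.g. Description); ignored.
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas


def md5_of(path):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(deploy_dir, index_name="Packages"):
    """Return (filename, reason) for each index entry that disagrees with disk."""
    with open(os.path.join(deploy_dir, index_name), encoding="utf-8") as f:
        stanzas = parse_index(f.read())
    bad = []
    for stanza in stanzas:
        filename = stanza.get("Filename")
        expected = stanza.get("MD5Sum")
        if not filename or not expected:
            continue  # assumption: entries without an MD5Sum are not checked
        path = os.path.join(deploy_dir, filename)
        if not os.path.isfile(path):
            bad.append((filename, "missing"))
        elif md5_of(path) != expected.lower():
            bad.append((filename, "md5 mismatch"))
    return bad
```

Calling find_mismatches("tmp-glibc/deploy/ipk/armv7at2hf-neon") from the
build directory should list the packages whose index entry no longer
matches the .ipk on disk. An empty list would mean the index and the
files agree at the time of the check, which by itself does not rule out
a transient race.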

We start with a clean sstate, and this typically only happens for the
very first builds, when the sstate is cold.

I guess there is a race between asynchronous operations somewhere around
building the index, getting a package from sstate, or pushing a package
to sstate.

It seems to be an issue others have seen in the past too:
https://www.yoctoproject.org/irc/%23yocto.2018-07-05.log.html#t2018-07-05T10:07:25

Any idea?

--
Stefan




