[OE-core] [PATCH 0/1] Create symbolic links atomically in the fetcher
Peter Kjellerstedt
peter.kjellerstedt at axis.com
Tue Mar 28 12:30:42 UTC 2017
We have occasional failures in our autobuilders where a setscene task
fails, causing the original task to be run instead, but bitbake still
fails with an error code in the end, causing unnecessary grief. One
such case has been identified through the following error log:
The stack trace of python calls that resulted in this exception/failure was:
File: 'exec_python_func() autogenerated', lineno: 2, function: <module>
0001:
*** 0002:do_package_write_rpm_setscene(d)
0003:
File: '${COREBASE}/meta/classes/package_rpm.bbclass', lineno: 757, function: do_package_write_rpm_setscene
0753:# but we need to stop the rootfs/solver from running while we do...
0754:do_package_write_rpm[sstate-lockfile-shared] += "${DEPLOY_DIR_RPM}/rpm.lock"
0755:
0756:python do_package_write_rpm_setscene () {
*** 0757: sstate_setscene(d)
0758:}
0759:addtask do_package_write_rpm_setscene
0760:
0761:python do_package_write_rpm () {
File: '${COREBASE}/meta/classes/sstate.bbclass', lineno: 648, function: sstate_setscene
0644: break
0645:
0646:def sstate_setscene(d):
0647: shared_state = sstate_state_fromvars(d)
*** 0648: accelerate = sstate_installpkg(shared_state, d)
0649: if not accelerate:
0650: raise bb.build.FuncFailed("No suitable staging package found")
0651:
0652:python sstate_task_prefunc () {
File: '${COREBASE}/meta/classes/sstate.bbclass', lineno: 297, function: sstate_installpkg
0293: sstatefetch = d.getVar('SSTATE_PKGNAME', True) + '_' + ss['task'] + ".tgz"
0294: sstatepkg = d.getVar('SSTATE_PKG', True) + '_' + ss['task'] + ".tgz"
0295:
0296: if not os.path.exists(sstatepkg):
*** 0297: pstaging_fetch(sstatefetch, sstatepkg, d)
0298:
0299: if not os.path.isfile(sstatepkg):
0300: bb.note("Staging package %s does not exist" % sstatepkg)
0301: return False
File: '${COREBASE}/meta/classes/sstate.bbclass', lineno: 635, function: pstaging_fetch
0631: for srcuri in uris:
0632: localdata.setVar('SRC_URI', srcuri)
0633: try:
0634: fetcher = bb.fetch2.Fetch([srcuri], localdata, cache=False)
*** 0635: fetcher.download()
0636:
0637: # Need to optimise this, if using file:// urls, the fetcher just changes the local path
0638: # For now work around by symlinking
0639: localpath = bb.data.expand(fetcher.localpath(srcuri), localdata)
File: '${COREBASE}/poky/bitbake/lib/bb/fetch2/__init__.py', lineno: 1572, function: download
1568: localpath = ud.localpath
1569: elif m.try_premirror(ud, self.d):
1570: logger.debug(1, "Trying PREMIRRORS")
1571: mirrors = mirror_from_string(self.d.getVar('PREMIRRORS', True))
*** 1572: localpath = try_mirrors(self, self.d, ud, mirrors, False)
1573:
1574: if premirroronly:
1575: self.d.setVar("BB_NO_NETWORK", "1")
1576:
File: '${COREBASE}/poky/bitbake/lib/bb/fetch2/__init__.py', lineno: 1020, function: try_mirrors
1016:
1017: uris, uds = build_mirroruris(origud, mirrors, ld)
1018:
1019: for index, uri in enumerate(uris):
*** 1020: ret = try_mirror_url(fetch, origud, uds[index], ld, check)
1021: if ret != False:
1022: return ret
1023: return None
1024:
File: '${COREBASE}/poky/bitbake/lib/bb/fetch2/__init__.py', lineno: 978, function: try_mirror_url
0974: if os.path.islink(origud.localpath):
0975: # Broken symbolic link
0976: os.unlink(origud.localpath)
0977:
*** 0978: os.symlink(ud.localpath, origud.localpath)
0979: update_stamp(origud, ld)
0980: return ud.localpath
0981:
0982: except bb.fetch2.NetworkAccess:
Exception: OSError: [Errno 17] File exists
What happens here is that two tasks simultaneously decide to download
something, and both come to the conclusion that they need to create a
symbolic link. And even if there is a check for whether the link
already exists, there is a small window of time where both tasks see
the missing link and tries to create it with the result that the
second task fails as per above.
The change provided here causes the link creation to be made in an
atomic way so that even if two tasks actually do decide that they need
to create the same link, neither of them will fail.
I do not know if this solves the same problem that is solved by commit
b8b14d975a254444461ba857fc6fb8c725de8874 on the master-next branch in
the bitbake repository. Since I have no way to recreate the failure in
a controlled way, I cannot test if the change on the master-next
branch also solves it or not. Its description does not exactly match
our situation (we do not map file:// URLs to http:// URLs in our
SSTATE_MIRRORS), but maybe someone with better knowledge of the code
can tell if either or both changes are needed.
//Peter
The following changes since commit 415b72ffcbd26e5f3664370d8b2a9b8105fb6342:
dnf: remove systemd units in nativesdk builds (2017-03-28 10:34:37 +0100)
are available in the git repository at:
git://git.yoctoproject.org/poky-contrib pkj/atomic_symlinks
http://git.yoctoproject.org/cgit.cgi/poky-contrib/log/?h=pkj/atomic_symlinks
Peter Kjellerstedt (1):
fetch2: Create/replace symbolic links atomically
bitbake/lib/bb/fetch2/__init__.py | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)
--
2.12.0
More information about the Openembedded-core
mailing list