[OE-core] [PATCH 0/1] Create symbolic links atomically in the fetcher

Peter Kjellerstedt peter.kjellerstedt at axis.com
Tue Mar 28 12:30:42 UTC 2017


We have occasional failures in our autobuilders where a setscene task
fails, causing the original task to be run instead, but bitbake still
fails with an error code in the end, causing unnecessary grief. One
such case has been identified through the following error log:

The stack trace of python calls that resulted in this exception/failure was:
File: 'exec_python_func() autogenerated', lineno: 2, function: <module>
     0001:
 *** 0002:do_package_write_rpm_setscene(d)
     0003:
File: '${COREBASE}/meta/classes/package_rpm.bbclass', lineno: 757, function: do_package_write_rpm_setscene
     0753:# but we need to stop the rootfs/solver from running while we do...
     0754:do_package_write_rpm[sstate-lockfile-shared] += "${DEPLOY_DIR_RPM}/rpm.lock"
     0755:
     0756:python do_package_write_rpm_setscene () {
 *** 0757:    sstate_setscene(d)
     0758:}
     0759:addtask do_package_write_rpm_setscene
     0760:
     0761:python do_package_write_rpm () {
File: '${COREBASE}/meta/classes/sstate.bbclass', lineno: 648, function: sstate_setscene
     0644:            break
     0645:
     0646:def sstate_setscene(d):
     0647:    shared_state = sstate_state_fromvars(d)
 *** 0648:    accelerate = sstate_installpkg(shared_state, d)
     0649:    if not accelerate:
     0650:        raise bb.build.FuncFailed("No suitable staging package found")
     0651:
     0652:python sstate_task_prefunc () {
File: '${COREBASE}/meta/classes/sstate.bbclass', lineno: 297, function: sstate_installpkg
     0293:    sstatefetch = d.getVar('SSTATE_PKGNAME', True) + '_' + ss['task'] + ".tgz"
     0294:    sstatepkg = d.getVar('SSTATE_PKG', True) + '_' + ss['task'] + ".tgz"
     0295:
     0296:    if not os.path.exists(sstatepkg):
 *** 0297:        pstaging_fetch(sstatefetch, sstatepkg, d)
     0298:
     0299:    if not os.path.isfile(sstatepkg):
     0300:        bb.note("Staging package %s does not exist" % sstatepkg)
     0301:        return False
File: '${COREBASE}/meta/classes/sstate.bbclass', lineno: 635, function: pstaging_fetch
     0631:    for srcuri in uris:
     0632:        localdata.setVar('SRC_URI', srcuri)
     0633:        try:
     0634:            fetcher = bb.fetch2.Fetch([srcuri], localdata, cache=False)
 *** 0635:            fetcher.download()
     0636:
     0637:            # Need to optimise this, if using file:// urls, the fetcher just changes the local path
     0638:            # For now work around by symlinking
     0639:            localpath = bb.data.expand(fetcher.localpath(srcuri), localdata)
File: '${COREBASE}/poky/bitbake/lib/bb/fetch2/__init__.py', lineno: 1572, function: download
     1568:                    localpath = ud.localpath
     1569:                elif m.try_premirror(ud, self.d):
     1570:                    logger.debug(1, "Trying PREMIRRORS")
     1571:                    mirrors = mirror_from_string(self.d.getVar('PREMIRRORS', True))
 *** 1572:                    localpath = try_mirrors(self, self.d, ud, mirrors, False)
     1573:
     1574:                if premirroronly:
     1575:                    self.d.setVar("BB_NO_NETWORK", "1")
     1576:
File: '${COREBASE}/poky/bitbake/lib/bb/fetch2/__init__.py', lineno: 1020, function: try_mirrors
     1016:
     1017:    uris, uds = build_mirroruris(origud, mirrors, ld)
     1018:
     1019:    for index, uri in enumerate(uris):
 *** 1020:        ret = try_mirror_url(fetch, origud, uds[index], ld, check)
     1021:        if ret != False:
     1022:            return ret
     1023:    return None
     1024:
File: '${COREBASE}/poky/bitbake/lib/bb/fetch2/__init__.py', lineno: 978, function: try_mirror_url
     0974:            if os.path.islink(origud.localpath):
     0975:                # Broken symbolic link
     0976:                os.unlink(origud.localpath)
     0977:
 *** 0978:            os.symlink(ud.localpath, origud.localpath)
     0979:        update_stamp(origud, ld)
     0980:        return ud.localpath
     0981:
     0982:    except bb.fetch2.NetworkAccess:
Exception: OSError: [Errno 17] File exists

What happens here is that two tasks simultaneously decide to download
something, and both come to the conclusion that they need to create a
symbolic link. Even though there is a check for whether the link
already exists, there is a small window of time in which both tasks
see that the link is missing and try to create it, with the result
that the second task fails as shown above.
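
To illustrate the window described above, here is a minimal sketch that
replays the interleaving deterministically in a single process. It is not
the fetcher code: the file names are hypothetical, and the two "tasks" are
just two passes through a loop, with both existence checks done before
either link is created.

```python
import errno
import os
import tempfile

workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "sstate-pkg.tgz")  # hypothetical downloaded file
link = os.path.join(workdir, "link.tgz")          # hypothetical link name
open(target, "w").close()

# Both tasks run their "does the link exist?" check before either one
# has created it, so both see it as missing.
task_a_sees_link = os.path.lexists(link)
task_b_sees_link = os.path.lexists(link)

errors = []
for sees_link in (task_a_sees_link, task_b_sees_link):
    if not sees_link:
        try:
            os.symlink(target, link)
        except OSError as exc:
            # The second attempt lands here with EEXIST (errno 17).
            errors.append(exc.errno)

print(errors == [errno.EEXIST])
```

The first os.symlink() call succeeds and the second one raises the same
"[Errno 17] File exists" error as in the log, even though the link it
wanted to create is already in place and correct.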

The change provided here causes the link creation to be made in an
atomic way so that even if two tasks actually do decide that they need
to create the same link, neither of them will fail.
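
As a rough sketch of what "atomic" can mean here (this is one common
technique on POSIX systems, not necessarily what the patch itself does):
create the link under a unique temporary name and then rename it over the
final name. The helper name atomic_symlink() and the file names below are
made up for the example.

```python
import os
import tempfile


def atomic_symlink(target, link):
    """Create or replace `link` so that it points at `target`.

    Assumes a POSIX system, where os.rename() atomically replaces an
    existing destination that is not a directory.
    """
    link_dir = os.path.dirname(os.path.abspath(link))
    # A private temporary directory next to the link gives us a name that
    # no other task can collide with, on the same filesystem as the link.
    tmp_dir = tempfile.mkdtemp(dir=link_dir)
    tmp_link = os.path.join(tmp_dir, "link")
    try:
        os.symlink(target, tmp_link)
        os.rename(tmp_link, link)  # replaces any existing link in one step
    finally:
        if os.path.lexists(tmp_link):
            os.unlink(tmp_link)
        os.rmdir(tmp_dir)


# Usage: two "tasks" creating the same link no longer conflict.
demo_dir = tempfile.mkdtemp()
demo_target = os.path.join(demo_dir, "sstate-pkg.tgz")
demo_link = os.path.join(demo_dir, "link.tgz")
open(demo_target, "w").close()
atomic_symlink(demo_target, demo_link)
atomic_symlink(demo_target, demo_link)  # second call replaces, does not fail
```

With this approach there is no separate check for whether the link exists,
so there is no window between the check and the creation. Whichever task
renames last wins, and since both point the link at the same file the
result is the same either way.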

I do not know if this solves the same problem that is solved by commit
b8b14d975a254444461ba857fc6fb8c725de8874 on the master-next branch in
the bitbake repository. Since I have no way to recreate the failure in
a controlled way, I cannot test if the change on the master-next
branch also solves it or not. Its description does not exactly match
our situation (we do not map file:// URLs to http:// URLs in our
SSTATE_MIRRORS), but maybe someone with better knowledge of the code
can tell if either or both changes are needed.

//Peter

The following changes since commit 415b72ffcbd26e5f3664370d8b2a9b8105fb6342:

  dnf: remove systemd units in nativesdk builds (2017-03-28 10:34:37 +0100)

are available in the git repository at:

  git://git.yoctoproject.org/poky-contrib pkj/atomic_symlinks
  http://git.yoctoproject.org/cgit.cgi/poky-contrib/log/?h=pkj/atomic_symlinks

Peter Kjellerstedt (1):
  fetch2: Create/replace symbolic links atomically

 bitbake/lib/bb/fetch2/__init__.py | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

-- 
2.12.0



