[OE-core] [PATCH RFC] module.bbclass: Fix potential do_compile/do_make_scripts race condition

Bruce Ashfield bruce.ashfield at gmail.com
Tue Dec 15 14:42:12 UTC 2015


On Tue, Dec 15, 2015 at 9:04 AM, Paul Barker <paul.barker at commagility.com>
wrote:

> On Sun, 6 Dec 2015 11:26:33 +0000
> Paul Barker <paul.barker at commagility.com> wrote:
>
> > I ran into a race condition building multiple external modules against a
> > 3.10.y series kernel using the dylan branch of OpenEmbedded. This is
> > difficult to reproduce as it requires very specific timing: the
> > do_make_scripts task for one module was linking the modpost script whilst
> > the do_compile task for another module was attempting to use the modpost
> > script. This resulted in a permission error:
> >
> > ERROR: Function failed: do_compile (see /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/temp/log.do_compile.25434 for further information)
> > ERROR: Logfile of failure stored in: /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/temp/log.do_compile.25434
> > Log data follows:
> > | DEBUG: Executing shell function do_compile
> > | make -C /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel M=$PWD clean
> > | make[1]: Entering directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
> > | make[1]: Leaving directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
> > | make -C /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel M=$PWD modules
> > | make[1]: Entering directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
> > |   CC [M]  /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/git/ti/runtime/hplib/module/hplibmod.o
> > |   Building modules, stage 2.
> > |   MODPOST 1 modules
> > | /bin/sh: scripts/mod/modpost: Permission denied
> > | make[2]: *** [__modpost] Error 126
> > | make[1]: *** [modules] Error 2
> > | make[1]: Leaving directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
> > | make: *** [default] Error 2
> > | ERROR: Function failed: do_compile (see /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/temp/log.do_compile.25434 for further information)
> > ERROR: Task 1284 (/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/meta-mcsdk/meta-arago-extras/recipes-bsp/ti-hplib/ti-hplib-mod_git.bb, do_compile) failed with exit code '1'
> >
> > Later kernel versions do not rebuild the modpost script every time that
> > 'make scripts' is invoked so they should be safe from this particular
> > failure. However I'm not convinced that running 'make scripts' whilst also
> > building an out-of-tree module is always safe on later kernels and there is
> > always the potential for vendor kernels to have different behaviour here.
> >
> > Although this was seen on the dylan branch the behaviour of master and
> > jethro looks to be the same here - do_make_scripts is locked so that only
> > one instance of it may run at one time but there is nothing to prevent one
> > instance of do_make_scripts running at the same time as an instance of
> > do_compile.
> >
> > The patch I'm sending attempts to solve this issue by locking the
> > do_compile task with the same lockfile as the do_make_scripts task in
> > module.bbclass so that an instance of do_compile can't run at the same time
> > as an instance of do_make_scripts. I don't know enough about the task
> > locking to guarantee that this is the right solution or to be able to test
> > that it works as expected so I'm marking the patch as an RFC.
> >
> > Please let me know if this is the right approach and if there is any easy
> > way to test this.
> >
> > Paul Barker (1):
> >   module.bbclass: Fix potential do_compile/do_make_scripts race
> >     condition
> >
> >  meta/classes/module.bbclass | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
>
> ping on this.
>

Sorry. I was traveling when this landed .. made a mental note .. and then
never looped around.


>
> I've just got bitten by this again so it's not a one-off. Is anyone able to
> give me some feedback on the patch, whether this is the right approach to
> fix the problem and whether this is applicable to jethro/master?
>

The approach makes sense to me, and it was what I was considering for
generating symbols after do_compile_modules. As long as it isn't serializing
a huge part of the build, the impacts shouldn't even be measurable.

So this change looks sane to me.
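
For anyone following the thread without the patch to hand, the idea can be
sketched roughly as below. This is an illustration only, not the patch
itself: it assumes do_make_scripts takes its lock through the BitBake
[lockfiles] varflag, and the lock path shown is an assumption that would
have to match whatever do_make_scripts actually uses in module.bbclass.

```
# Sketch only (assumed lock path, not copied from the patch).
# Sharing one lockfile between the two tasks means 'make scripts' for one
# module cannot run while another module's do_compile is using the scripts.
do_make_scripts[lockfiles] = "${TMPDIR}/kernel-scripts.lock"
do_compile[lockfiles] = "${TMPDIR}/kernel-scripts.lock"
```

Tasks that name the same lockfile run one at a time, so this would also
serialize the module do_compile tasks against each other, which is the
serialization cost mentioned above.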

Bruce




>
> Thanks,
>
> --
> Paul Barker
> CommAgility Ltd



-- 
"Thou shalt not follow the NULL pointer, for chaos and madness await thee
at its end"