[OE-core] Mis-generation of shell script (run.do_install)?

Jason Andryuk jandryuk at gmail.com
Wed Jan 16 20:20:49 UTC 2019


On Wed, Jan 16, 2019 at 9:02 AM Richard Purdie
<richard.purdie at linuxfoundation.org> wrote:
>
> On Wed, 2019-01-16 at 08:55 -0500, Jason Andryuk wrote:
> > On Tue, Jan 8, 2019 at 1:26 PM <richard.purdie at linuxfoundation.org>
> > wrote:
> > > On Tue, 2018-12-18 at 12:45 -0500, Jason Andryuk wrote:
> > > > I can definitively state I have a hash in bb_codeparser.dat with an
> > > > incorrect shellCacheLine entry and I don't know how it got there.
> > > >
> > > > The bad hash is 3df9018676de219bb3e46e88eea09c98.  I've attached a
> > > > file with the binutils do_install() contents which hash to that
> > > > value.
> > > >
> > > > The bad 3df9018676de219bb3e46e88eea09c98 entry in the
> > > > bb_codeparser.dat returned
> > > > DEBUG: execs [
> > > > DEBUG: execs rm
> > > > DEBUG: execs install
> > > > DEBUG: execs test
> > > > DEBUG: execs sed
> > > > DEBUG: execs rmdir
> > > > DEBUG: execs bbfatal_log
> > > > DEBUG: execs mv
> > > > DEBUG: execs /home/build/openxt-compartments/build/tmp-glibc/work/core2-32-oe-linux/python-async/0.6.2-r0/recipe-sysroot-native/usr/bin/python-native/python
> > > > DEBUG: execs find
> > >
> > > This is useful data (along with the attachment), thanks.
> > >
> > > I agree that this looks likely to have come from a core2-32 tuned
> > > machine (e.g. genericx86) from python-async do_install.
> > >
> > > How old was this build directory? Can you remember any details of the
> > > update history for it?
> >
> > I think the build directory was from the beginning of October 30th,
> > and I guess I hit the collision December 10th or so.
> >
> > > I'd be very interested to try and reproduce that hash. I locally
> > > blacklisted your collision from my cache and tried to reproduce this.
> > > I can generate a matching hash for the binutils do_install but I can't
> > > produce one matching the above.
> >
> > I tried around December 18th to generate the collision again.  I set
> > up a new container with an identical openxt path.  There,
> > python-async was built, but it did not have the colliding hash.  When
> > core2-64 binutils was built, it had the expected hash.
> >
> > > Can you remember the history of this build directory and which updates
> > > it may have had? The python-async recipe is confined to OE-Core so it's
> > > probably the revision history for the oe-core repo which is most
> > > interesting. Anything in the .git/logs directory for that which would
> > > help us replay the different versions you might have built?
> >
> > oe-core is checked out at 819aa151bd634122a46ffdd822064313c67f5ba5
> > It's a git submodule locked at a fixed revision, and it had not
> > changed in the build directory.
> >
> > OpenXT builds 8 or 9 different MACHINEs and images in sequence in the
> > same build directory.  Maybe 6 are core2-32 and two are core2-64. The
> > 32bit ones run first.
>
> The hash we don't have is from a core2-32 MACHINE. I'm wondering which
> configurations you might have parsed for a core2-32 MACHINE between
> October and December?

Which "configurations" are you asking about?

The standard OpenXT build loops through building all 8 images and
packaging them up into an installer iso.  Often I run that build
script, but sometimes I just build individual machines manually.

I was mainly working on the core2-64 machines immediately prior to
this event.  I was very surprised when it occurred since 1) I didn't
expect binutils to be re-built and 2) I wasn't working on the
openxt-installer machine which failed.

> Was TMPDIR ever cleaned? If not, do you have the python-async WORKDIR
> for core2-32? The TMPDIR/logs directory may also have useful hints
> about the configurations built...

Unfortunately, yes, I cleaned TMPDIR when I hit the build error.  Same
with the sstate-cache.

In general, I don't see python-async in TMPDIR after running through
the OpenXT build.  Would that be because an early machine builds
python-async, but it then gets cleared out of TMPDIR when a later
machine/image is built?

> > I think the problem first manifested after I added an additional local
> > layer to BBLAYERS.  At that time, I started building an additional
> > MACHINE.  Along with the mis-generated run.do_install script, bitbake
> > was complaining about the binutils base hash mismatch which triggered
> > the re-build.  The first 64bit MACHINE included TUNE_CCARGS +=
> > "-mstackrealign" while the second did not.  Could that be a reason
> > why bitbake complained about the base hash mismatch?
>
> By the time the binutils error happens, the error is kind of lost in
> history and must have been added some time prior to that.
>
> We know it's a build of python-async for a core2-32 MACHINE. Did you
> also try building those with the "-mstackrealign" option? Were there
> any other changes you can think of that would have applied to the
> core2-32 MACHINE builds?

All the base OpenXT machines have "-mstackrealign" in their conf.  My
new 64bit machines do not have it.  I don't recall working with
core2-32 MACHINES at the time.  The new layer I pulled in only had a
layer.conf and a 64bit machine.conf.

In my second container, I `rm -rf cache/ tmp-glibc/ sstate-cache/`.
Running the build of the first OpenXT machine, bb_codeparser.dat gets
populated with python-async:
'3c6fe664c51d2f793f8fd0eb103d68cb': frozenset({'find', 'sed',
'install', 'mv', 'bbfatal_log', 'rmdir', '[', 'rm',
'/home/build/openxt-compartments/build/tmp-glibc/work/core2-32-oe-linux/python-async/0.6.2-r0/recipe-sysroot-native/usr/bin/python-native/python',
'test'})
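
A quick way to check this against the bad entry reported earlier is to
compare the two exec sets directly.  The sketch below copies both sets
from this thread (the variable names are only illustrative) and checks
whether they hold the same commands, which would be consistent with the
bad 3df9018676de219bb3e46e88eea09c98 entry having originated from
python-async's do_install:

```python
# Path of the python-native interpreter, as it appears in both entries.
PYTHON_NATIVE = (
    "/home/build/openxt-compartments/build/tmp-glibc/work/"
    "core2-32-oe-linux/python-async/0.6.2-r0/recipe-sysroot-native/"
    "usr/bin/python-native/python"
)

# execs returned for the bad 3df9018676de219bb3e46e88eea09c98 entry.
bad_entry_execs = frozenset({
    "[", "rm", "install", "test", "sed", "rmdir",
    "bbfatal_log", "mv", PYTHON_NATIVE, "find",
})

# execs stored under 3c6fe664c51d2f793f8fd0eb103d68cb in the fresh build.
fresh_python_async_execs = frozenset({
    "find", "sed", "install", "mv", "bbfatal_log", "rmdir",
    "[", "rm", PYTHON_NATIVE, "test",
})

# The symmetric difference is empty when both sets hold the same commands.
difference = bad_entry_execs ^ fresh_python_async_execs
print("identical" if not difference else sorted(difference))
```

Matching sets only show that the two entries list the same commands; they
do not explain how the set ended up stored under the binutils hash.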

python-async is not in tmp-glibc/work and `grep -r tmp-glibc/log`
doesn't turn up anything.  If I run `bitbake -g`, python-async doesn't
appear in any of the output files.  Is bb_codeparser.dat getting
populated without building the recipe to be expected?
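
To make the question concrete, here is a minimal sketch of a
content-hash-keyed cache of the kind discussed in this thread.  It is an
illustration only, not bitbake's code: it assumes the key is an MD5 hex
digest of the shell function body (the 32-hex-digit hashes above are
consistent with that), and `shell_cache`, `body_hash`, `extract_execs`
and `cached_execs` are hypothetical names.  In this sketch an entry is
stored as soon as a function body is parsed, and a lookup trusts whatever
is stored under the key:

```python
import hashlib

# Hypothetical in-memory stand-in for the shell cache: hash -> execs.
shell_cache = {}


def body_hash(body: str) -> str:
    """Assumed key: MD5 hex digest of the function body text."""
    return hashlib.md5(body.encode("utf-8")).hexdigest()


def extract_execs(body: str) -> frozenset:
    """Toy parser: treat the first word of each non-empty line as a command."""
    return frozenset(
        line.split()[0] for line in body.splitlines() if line.strip()
    )


def cached_execs(body: str) -> frozenset:
    """Return cached execs for this body, parsing only on a cache miss."""
    key = body_hash(body)
    if key not in shell_cache:
        shell_cache[key] = extract_execs(body)
    return shell_cache[key]


body = "install -d ${D}\nrm -f ${D}/tmp\n"
first_lookup = cached_execs(body)   # cache miss: parsed and stored
print(sorted(first_lookup))

# If a wrong entry ends up under the same key, later lookups return it
# unchanged, because a body whose hash is present is never re-parsed.
shell_cache[body_hash(body)] = frozenset({"find"})
second_lookup = cached_execs(body)
print(sorted(second_lookup))
```

Under those assumptions, a wrong set stored under the binutils
do_install hash would be returned for binutils on every later lookup,
regardless of which recipe originally produced it.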

Regards,
Jason

