[OE-core] [RFC PATCH] Add gnu testsuite execution for OEQA

Nathan Rossi nathan at nathanrossi.com
Tue Jul 9 10:43:39 UTC 2019


On Sat, 6 Jul 2019 at 22:52, Richard Purdie
<richard.purdie at linuxfoundation.org> wrote:
>
> On Sat, 2019-07-06 at 11:39 +0000, Nathan Rossi wrote:
> > This patch is an RFC for adding support to execute the GNU test suites for
> > binutils, gcc and glibc, with the intention of enabling automated runs of
> > these test suites within the OEQA framework so that they can be executed by
> > the Yocto Autobuilder.
> >
> > Please note that this patch is not a complete implementation and needs
> > additional work as well as changes based on comments and feedback from this RFC.
>
> This is rather cool, thanks!
>
> Looking at this was on my todo list once we got the existing OEQA,
> ptest and ltp setups working well. I'm very happy to have been beaten
> to it though.
>
> > The test suites covered need significant resources or build artifacts, such
> > that running them on the target is undesirable, which rules out the use of
> > ptest. Because of this, the test suites are run on the build host and, where
> > necessary, call out to the target.
> >
> > The following implementation creates a number of recipes that are used to
> > build/execute the test suites for the different components. The reason for
> > creating separate recipes is primarily dependencies and the need for
> > components in the sysroot. For example, binutils has tests that use the C
> > compiler; however, binutils is a dependency of the C compiler and thus would
> > cause a dependency loop. The sysroot issue occurs when a recipe depends on
> > `*-initial` recipes while the test suites need the non-initial version.
>
> I think this means you're working with something pre-warrior as we got
> rid of most of the *-initial recipes apart from libgcc-initial.

I have been working against master (maybe a few days old). However, I
hit the sysroot collision in gcc-cross with what I thought was a
-initial recipe, so I split it out and kept moving ahead.

It turns out it was just one file, specifically an empty limits.h that
is created by the gcc-cross recipe itself.
(http://git.openembedded.org/openembedded-core/tree/meta/recipes-devtools/gcc/gcc-cross.inc#n49)

>
> > Some issues with splitting the recipes:
> >  - Rebuilds the recipe
> >    - Like gcc-cross-testsuite in this patch, could use a stashed builddir
> >  - Source is duplicated
> >    - gcc gets around this with shared source
> >  - Requires having the recipe files and maintaining them
> >    - Multiple versions of recipes
> >    - Multiple variants of recipes (-cross, -crosssdk, -native if desired)
>
> It might be possible to have multiple tasks in these recipes and have
> the later tasks depend on other pieces of the system like the C
> compiler, thereby avoiding the need for splitting if only the later
> tasks have the dependencies. Not sure if it would work or not but may
> be worth exploring.

Initially I started out with binutils having a do_check task that
depended on the do_populate_sysroot task of the associated dependencies,
and this was working well, for binutils at least. The only concern I had
with this was whether tainting the recipe sysroots would be problematic.
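
For illustration, the shape of that approach was roughly the following
(a minimal sketch, not the exact patch; the dejagnu-native dependency
assumes a dejagnu recipe is available):

    # pull the C compiler and the test harness into this recipe's
    # sysroot only for the do_check task
    do_check[depends] += "gcc-cross-${TARGET_ARCH}:do_populate_sysroot"
    do_check[depends] += "dejagnu-native:do_populate_sysroot"

    do_check() {
        cd ${B}
        # -i: keep going past individual test failures
        oe_runmake -i check
    }
    addtask check after do_compile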

Given that the sysroot/-initial issue with gcc was not a dependency
problem, I have tried having the check tasks in gcc-cross and
gcc-runtime, and there do not appear to be any issues. So splitting the
recipes for binutils, gcc-cross and gcc-runtime is not necessary.

For glibc the sysroot has libgcc-initial, so the sysroot collision is
still a problem for it.

>
> > Target execution is another issue with the test suites. Note, however, that
> > binutils does not require any target execution. In this patch both
> > qemu-linux-user and ssh target execution solutions are provided. For the
> > purposes of OE, qemu-linux-user may suffice, as it executes the gcc and
> > gcc-runtime tests with great success and the glibc tests with acceptable
> > success.
>
> I feel fairly strongly that we probably want to execute these kinds of
> tests under qemu system mode, not the user mode. The reason is that we
> want to be as close to the target environment as we can be, and
> qemu-user testing is at least as much a test of qemu's emulation as it
> is of the behaviour of the compiler or libc (libc in particular).
> I was thinking this and then later read that you confirmed my
> suspicions below...
>
> > The glibc test suite can be problematic to execute for a few reasons:
> >  - Requires access to the exact same filesystem as the build host
> >    - On physical targets and QEMU this requires NFS mounts
>
> We do have unfs support already under qemu which might make this
> possible.

unfs works great and I was using it for testing out the ssh support.
However, I did notice that it relies on the host having rpcbind
installed. This prevents running the tests without root (even when
using slirp for qemu).
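
For reference, the qemu + NFS setup I was testing was along these lines
(exact options may vary; runqemu's nfs option uses the user-mode unfs3
server, which is where the rpcbind requirement comes from):

    runqemu qemuarm nographic slirp nfs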

>
> >  - Relies on exact syscall behaviour
> >    - Causes some issues where there are differences between qemu-linux-user
> >      and the target architecture's kernel
>
> Right, this one worries me and pushes me to want to use qemu system
> mode.
>
> >  - Can consume significant resources (e.g. OOM, or worse, triggering
> >    bugs/panics in kernel drivers)
>
> Any rough guide to what significant is here? ptest needs 1GB memory for
> example. qemu-system mode should limit that to the VMs at least?

This is a tricky one, and my comment is mostly based on prior
experience with running glibc tests on less common targets (like
microblaze), where the kernel, when handling large amounts of NFS
traffic alongside cpu/memory-heavy tasks, would panic and sometimes
hang. I am not sure how generally this applies though; I will report on
it once I have some test results with more of the qemu machines running
via system emulation/ssh.

>
> >  - Slow to execute
> >    - With QEMU system emulation it can take many hours
>
> We do have KVM acceleration for x86 and arm FWIW which is probably
> where we'd start testing this on the autobuilder.
>
> >    - With some physical target architectures it can take days (e.g. microblaze)
> >
> > The significantly increased execution speed of qemu-linux-user vs qemu system
> > with glibc, and the ability for qemu-linux-user to be executed in parallel
> > with the gcc test suite, make it a strong solution for continuous integration
> > testing.
>
> Was that with or without KVM?

Without KVM. I will add KVM testing to the system emulation tests I
run, to get a better idea of the exact performance differences.
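
(i.e. comparing runs booted with runqemu's kvm option, along the lines
of the following, against plain system emulation)

    runqemu qemux86-64 nographic kvm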

>
> > The following table shows results for the major test suite components running
> > with qemu-linux-user execution. The numbers represent 'failed tests'/'total
> > tests'. The machines used to run the tests are the `qemu*` machines for the
> > associated architectures; not all qemu machines available in oe-core were
> > tested. It is important to note that these results are only indicative of
> > qemu-linux-user behaviour and that a number of the test failures are due to
> > issues not specific to qemu-linux-user.
> >
> >         | gcc          | g++          | libstdc++   | binutils    | gas         | ld          | glibc
> > x86-64  |   589/135169 |   457/131913 |     1/13008 |     0/  236 |     0/ 1256 |   166/ 1975 |  1423/ 5991
> > arm     |   469/123905 |   365/128416 |    19/12788 |     0/  191 |     0/  872 |   155/ 1479 |    64/ 5130
> > aarch64 |   460/130904 |   364/128977 |     1/12789 |     0/  190 |     0/  442 |   157/ 1474 |    76/ 5882
> > powerpc | 18336/116624 |  6747/128636 |    33/12996 |     0/  187 |     1/  265 |   157/ 1352 |  1218/ 5110
> > mips64  |  1174/134744 |   401/130195 |    22/12780 |     0/  213 |    43/ 7245 |   803/ 1634 |  2032/ 5847
> > riscv64 |   456/106399 |   376/128427 |     1/12748 |     0/  185 |     0/  257 |   152/ 1062 |    88/ 5847
>
> I'd be interested to know how these numbers compare to the ssh
> execution...

I will set up some OEQA automation to get results for this. It might
take me a little while since they can be slow to run.
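
Most likely that will end up driven through oe-selftest, e.g. something
like the following (the test case names here are placeholders for
whatever the final OEQA cases end up being called, not actual module
names):

    oe-selftest -r binutils gcc glibc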

>
> The binutils results look good! :)
>
> > This patch also introduces some OEQA test cases which cover running the test
> > suites. However, this specific patch does not include any implementation for
> > the automated setup of qemu system emulation testing with runqemu and NFS
> > mounting for glibc tests. Also not included in these test cases is any known
> > test failure filtering.
>
> The known test failure filtering is something we can use the OEQA
> backend for; I'd envisage this being integrated in a similar way to
> the way we added ptest/ltp/ltp-posix there.

I am not very familiar with the OEQA backend, and was not able to find
anything about test filtering in lib/oeqa (just searching). Maybe I am
looking in the wrong place; any pointers to where this part of the
backend is located?

Thanks,
Nathan



>
> > I would also be interested in opinions on whether these test suites should
> > be executed as part of the existing Yocto Autobuilder instance.
>
> Short answer is yes. We won't run them all the time, but when it makes
> sense, and I'd happily see the autobuilder adapted to be able to trigger
> these appropriately. We can probably run the KVM accelerated arches
> more often than the others.
>
> Plenty of implementation details to further discuss but this is great
> to see!
>
> Cheers,
>
> Richard
>

