[OE-core] [PATCH 0/7] kernel-yocto: consolidated pull request

Richard Purdie richard.purdie at linuxfoundation.org
Tue Sep 5 14:59:59 UTC 2017


On Tue, 2017-09-05 at 10:24 -0400, Bruce Ashfield wrote:
> On 09/05/2017 10:13 AM, Richard Purdie wrote:
> > 
> > Hi Bruce,
> > 
> > We had a locked up qemuppc lsb image and I was able to find backtraces
> > from the serial console log
> > (/home/pokybuild/yocto-autobuilder/yocto-worker/nightly-ppc-lsb/build/build/tmp/work/qemuppc-poky-linux/core-image-lsb/1.0-r0/target_logs/dmesg_output.log
> > in case anyone ever needs to find that). The log is below, this one is
> > for the 4.9 kernel.
> > 
> > Failure as seen on the AB:
> > https://autobuilder.yoctoproject.org/main/builders/nightly-ppc-lsb/builds/1189/steps/Running%20Sanity%20Tests/logs/stdio
> > 
> > Not sure what it means, perhaps you can make more sense of it? :)
> Very interesting.
> 
> I'm (un)fortunately familiar with RCU issues, and obviously, this is
> only happening under load. There's clearly a driver issue as it
> interacts with whatever is running in userspace.
> 
> From the log, it looks like this is running over NFS and pinning the
> CPU and the qemu ethernet isn't handling it gracefully.

From the logs I've seen, I don't think this is over NFS. The root
filesystem should be on a virtio block device:

"Kernel command line: root=/dev/vda"
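To make that check repeatable, here is a minimal sketch (not part of the autobuilder or testimage tooling) that pulls root= out of a kernel command line and guesses the root type. The function names are hypothetical, and the classification relies only on two conventions: root=/dev/nfs selects an NFS root, and /dev/vd* names are virtio block devices.

```python
def root_device(cmdline):
    """Return the value of the root= parameter, or None if it is absent."""
    for token in cmdline.split():
        if token.startswith("root="):
            return token[len("root="):]
    return None


def classify_root(dev):
    """Rough guess at the root filesystem transport from the device name."""
    if dev is None:
        return "unknown"
    if dev == "/dev/nfs":
        # Conventional value used to request an NFS root
        return "nfs"
    if dev.startswith("/dev/vd"):
        # virtio-blk disks show up as /dev/vda, /dev/vdb, ...
        return "virtio-blk"
    return "other"


if __name__ == "__main__":
    print(classify_root(root_device("root=/dev/vda")))
```

Only the root=/dev/vda fragment comes from the log quoted above; any other parameters on the real command line are not shown here.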

> But exactly what it is, I can't say from that trace. I'll try and do
> a cpu-pinned test on qemuppc (over NFS) and see if I can trigger the
> same trace.

I'm also not sure what this might be. I did a bit more staring at the
log and I think the system did come back:

NOTE: core-image-lsb-1.0-r0 do_testimage:   test_dnf_install_from_disk (dnf.DnfRepoTest)
NOTE: core-image-lsb-1.0-r0 do_testimage:  ... OK (249.929s)
NOTE: core-image-lsb-1.0-r0 do_testimage:   test_dnf_install_from_http (dnf.DnfRepoTest)
NOTE: core-image-lsb-1.0-r0 do_testimage:  ... OK (212.547s)
NOTE: core-image-lsb-1.0-r0 do_testimage:   test_dnf_reinstall (dnf.DnfRepoTest)
NOTE: core-image-lsb-1.0-r0 do_testimage:  ... FAIL (1501.682s)
NOTE: core-image-lsb-1.0-r0 do_testimage:   test_dnf_repoinfo (dnf.DnfRepoTest)
NOTE: core-image-lsb-1.0-r0 do_testimage:  ... FAIL (15.952s)
NOTE: core-image-lsb-1.0-r0 do_testimage:   test_syslog_running (oe_syslog.SyslogTest)
NOTE: core-image-lsb-1.0-r0 do_testimage:  ... FAIL (3.039s)
NOTE: core-image-lsb-1.0-r0 do_testimage:   test_syslog_logger (oe_syslog.SyslogTestConfig)
NOTE: core-image-lsb-1.0-r0 do_testimage:  ... SKIP (0.001s)
NOTE: core-image-lsb-1.0-r0 do_testimage:   test_syslog_restart (oe_syslog.SyslogTestConfig)
NOTE: core-image-lsb-1.0-r0 do_testimage:  ... SKIP (0.001s)
NOTE: core-image-lsb-1.0-r0 do_testimage:   test_syslog_startup_config (oe_syslog.SyslogTestConfig)
NOTE: core-image-lsb-1.0-r0 do_testimage:  ... SKIP (0.001s)
NOTE: core-image-lsb-1.0-r0 do_testimage:   test_pam (pam.PamBasicTest)
NOTE: core-image-lsb-1.0-r0 do_testimage:  ... FAIL (3.003s)
NOTE: core-image-lsb-1.0-r0 do_testimage:   test_parselogs (parselogs.ParseLogsTest)
NOTE: core-image-lsb-1.0-r0 do_testimage:  ... OK (39.675s)
NOTE: core-image-lsb-1.0-r0 do_testimage:   test_rpm_help (rpm.RpmBasicTest)
NOTE: core-image-lsb-1.0-r0 do_testimage:  ... OK (2.590s)
NOTE: core-image-lsb-1.0-r0 do_testimage:   test_rpm_query (rpm.RpmBasicTest)
NOTE: core-image-lsb-1.0-r0 do_testimage:  ... OK (2.295s)
NOTE: core-image-lsb-1.0-r0 do_testimage:   test_rpm_instal

So for a while there the system "locked up":

AssertionError: 255 != 0 : dnf --repofrompath=oe-testimage-repo-noarch,http://192.168.7.1:38838/noarch --repofrompath=oe-testimage-repo-qemuppc,http://192.168.7.1:38838/qemuppc --repofrompath=oe-testimage-repo-ppc7400,http://192.168.7.1:38838/ppc7400 --nogpgcheck reinstall -y run-postinsts-dev

Process killed - no output for 1500 seconds. Total running time: 1501 seconds.

AssertionError: 255 != 0 : dnf --repofrompath=oe-testimage-repo-noarch,http://192.168.7.1:38838/noarch --repofrompath=oe-testimage-repo-qemuppc,http://192.168.7.1:38838/qemuppc --repofrompath=oe-testimage-repo-ppc7400,http://192.168.7.1:38838/ppc7400 --nogpgcheck repoinfo
ssh: connect to host 192.168.7.2 port 22: No route to host

self.assertEqual(status, 1, msg = msg)
AssertionError: 255 != 1 : login command does not work as expected. Status and output:255 and ssh: connect to host 192.168.7.2 port 22: No route to host

After that, the system seems to have come back. All very odd...
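For picking the stall window out of a long do_testimage log, a rough sketch along these lines may help. It assumes the two-line "name (class)" / "... STATUS (N.NNNs)" layout seen in the excerpt above; the regular expressions, function names, and the 60 second threshold are illustrative assumptions, not anything the testimage code provides.

```python
import re

# Line naming the test, e.g. "... do_testimage:   test_pam (pam.PamBasicTest)"
NAME_RE = re.compile(r"do_testimage:\s+(\w+) \(([\w.]+)\)\s*$")
# Line carrying the result, e.g. "... do_testimage:  ... FAIL (3.003s)"
RESULT_RE = re.compile(r"do_testimage:\s+\.\.\. (\w+) \(([\d.]+)s\)\s*$")


def parse_results(lines):
    """Pair each test-name line with the result line that follows it."""
    results = []
    pending = None
    for line in lines:
        m = NAME_RE.search(line)
        if m:
            pending = (m.group(1), m.group(2))
            continue
        m = RESULT_RE.search(line)
        if m and pending:
            results.append((pending[0], pending[1], m.group(1), float(m.group(2))))
            pending = None
    return results


def suspicious(results, threshold=60.0):
    """Tests that failed or ran longer than threshold seconds (assumed cutoff)."""
    return [r for r in results if r[2] == "FAIL" or r[3] > threshold]


if __name__ == "__main__":
    sample = [
        "NOTE: core-image-lsb-1.0-r0 do_testimage:   test_dnf_reinstall (dnf.DnfRepoTest)",
        "NOTE: core-image-lsb-1.0-r0 do_testimage:  ... FAIL (1501.682s)",
        "NOTE: core-image-lsb-1.0-r0 do_testimage:   test_rpm_help (rpm.RpmBasicTest)",
        "NOTE: core-image-lsb-1.0-r0 do_testimage:  ... OK (2.590s)",
    ]
    for name, cls, status, secs in suspicious(parse_results(sample)):
        print(name, cls, status, secs)
```

Truncated or unpaired lines are simply skipped, so a log that was cut off mid-test still parses.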

Cheers,

Richard



