[OE-core] [PATCH] oe_syslog.py: Handle syslogd/klogd restart race

Jon Mason jdmason at kudzu.us
Fri Jun 21 19:26:49 UTC 2019


On Fri, Jun 21, 2019 at 1:44 PM Jon Mason <jdmason at kudzu.us> wrote:
>
> On Fri, Jun 21, 2019 at 12:58 PM <richard.purdie at linuxfoundation.org> wrote:
> >
> > On Fri, 2019-06-21 at 12:39 -0400, Jon Mason wrote:
> > > On Fri, Jun 21, 2019 at 12:14 PM Richard Purdie
> > > <richard.purdie at linuxfoundation.org> wrote:
> > > > On Fri, 2019-06-21 at 11:42 -0400, Jon Mason wrote:
> > > > >
> > > > > Thanks, I think this is reasonable; however, I think we may need to
> > > > > make the above a function and then call it from other places in the
> > > > > tests in that file.
> > > >
> > > > test_syslog_restart should check it did restart using the above
> > > >
> > > > > test_syslog_startup_config does a second restart which we should
> > > > > also check?
> > >
> > > Seems reasonable.  I'll crank out v2 shortly.
> > >
> > > > Out of interest were you able to see error codes being returned in
> > > > status in your tests?
> > >
> > > > I used code to force every error path during development, but I have
> > > > not seen the testcase itself fail.  That said, your question did cause
> > > > me to notice a
> > > bug in the code when verifying that the old ones are no longer
> > > running.  That should return 0 if still running, which wouldn't cause
> > > the assert outside of the loop.  So, I'll need to tweak this there.
> > > v2 will have this fix as well.
> >
> > > The reason I ask is that it's far from clear that busybox's
> > > start-stop-daemon would notice if the daemon didn't restart, so I don't
> > > think we can reliably trust status to be set correctly.
> >
> > Is there any reason we can't run these checks regardless of status?
>
> > The current code logic would work regardless of whether it failed or
> > not.  We can run it every time, and it would not hurt anything.
>
> > I realise there is slightly more overhead but it might give us more
> > chance of fixing all the races?
>
> 4 extra function calls would almost be statistical noise.  I'll code
> it up to do the check every time regardless and add it for each
> syslogd/klogd call.
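
For reference, the "check every time regardless of status" idea can be
sketched roughly as below. This is only an illustration, not the actual
patch: the helper names (pids_of, restart_and_verify), the `run` callable
returning a (status, output) tuple, the use of `pidof`, and the timeout
values are all assumptions made for the example.

```python
import time


def pids_of(run, name):
    """Return the set of PIDs reported by `pidof name` on the target.

    `run` is a hypothetical callable: run(cmd) -> (status, output).
    """
    status, output = run("pidof " + name)
    # pidof exits non-zero when nothing matches; treat that as "no PIDs".
    return set(output.split()) if status == 0 else set()


def restart_and_verify(run, restart_cmd, names=("syslogd", "klogd"),
                       timeout=30, interval=1, sleep=time.sleep):
    """Restart the daemons and confirm by PID that each really restarted.

    The exit status of `restart_cmd` is deliberately ignored, since it
    may not reflect whether the daemons actually came back.
    """
    old = {name: pids_of(run, name) for name in names}
    run(restart_cmd)  # status ignored on purpose

    waited = 0
    while True:
        new = {name: pids_of(run, name) for name in names}
        # A daemon counts as restarted once it has at least one PID and
        # none of its old PIDs are still present.
        if all(new[n] and not (new[n] & old[n]) for n in names):
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
```

A test could then assert on the return value after every restart, for
example `self.assertTrue(restart_and_verify(self.target.run, cmd))`, where
`cmd` is whatever restart command the test already issues.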

A timed run of testimage on a Cortex A57x4 system (with KVM enabled)
went from ~1m21s (1m23s, 1m22s, 1m20s) to ~1m23s (1m22s, 1m25s, 1m24s).
Without KVM enabled, it went from 20m6s to 20m56s. It's possible with
a larger sample size that they would converge even more, but I think
this is sufficient to show it's not a deal breaker to run it every
time.
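
To put those numbers in relative terms, here is a quick back-of-the-envelope
calculation using only the timings quoted above (the helper names are just
for this example). It works out to roughly 2 seconds (about 2.4%) with KVM
and 50 seconds (about 4.1%) without.

```python
def to_seconds(minutes, seconds):
    """Convert a 'XmYs' timing to seconds."""
    return 60 * minutes + seconds


def mean(values):
    return sum(values) / len(values)


def overhead_percent(before, after):
    """Relative slowdown of `after` versus `before`, in percent."""
    return 100.0 * (after - before) / before


# Three KVM runs before and after the change (from the timings above).
kvm_before = mean([to_seconds(1, 23), to_seconds(1, 22), to_seconds(1, 20)])
kvm_after = mean([to_seconds(1, 22), to_seconds(1, 25), to_seconds(1, 24)])

# Single run each without KVM.
nokvm_before = to_seconds(20, 6)
nokvm_after = to_seconds(20, 56)

kvm_delta = kvm_after - kvm_before
nokvm_delta = nokvm_after - nokvm_before
kvm_pct = overhead_percent(kvm_before, kvm_after)
nokvm_pct = overhead_percent(nokvm_before, nokvm_after)

print("KVM:    +%.1fs (%.1f%%)" % (kvm_delta, kvm_pct))
print("no KVM: +%.1fs (%.1f%%)" % (nokvm_delta, nokvm_pct))
```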

Thanks,
Jon


>
> >
> > Cheers,
> >
> > Richard
> >

