[OE-core] [PATCH] oe_syslog.py: Handle syslogd/klogd restart race

Fri Jun 21 16:39:14 UTC 2019

On Fri, Jun 21, 2019 at 12:14 PM Richard Purdie
<richard.purdie at linuxfoundation.org> wrote:
>
> On Fri, 2019-06-21 at 11:42 -0400, Jon Mason wrote:
> > syslogd and klogd can occasionally take too long to restart, which
> > causes tests to fail by starting before the log daemons are ready.  To
> > work around this problem, poll for up to 30 seconds on the processes to
> > verify the old ones are killed and the new ones are up and running.
> >
> > [YOCTO #13379]
> >
> > Signed-off-by: Jon Mason <jdmason at kudzu.us>
> > ---
> >  meta/lib/oeqa/runtime/cases/oe_syslog.py | 37 ++++++++++++++++++++++++
> >  1 file changed, 37 insertions(+)
> >
> > diff --git a/meta/lib/oeqa/runtime/cases/oe_syslog.py b/meta/lib/oeqa/runtime/cases/oe_syslog.py
> > index 0f5f9f43ca..3270a0fc88 100644
> > --- a/meta/lib/oeqa/runtime/cases/oe_syslog.py
> > +++ b/meta/lib/oeqa/runtime/cases/oe_syslog.py
> > @@ -50,9 +50,46 @@ class SyslogTestConfig(OERuntimeTestCase):
> >      @skipIfDataVar('VIRTUAL-RUNTIME_init_manager', 'systemd',
> >                     'Not appropiate for systemd image')
> >      def test_syslog_startup_config(self):
> > +        status, syslogd_pid = self.target.run('pidof syslogd')
> > +        status, klogd_pid = self.target.run('pidof klogd')
> > +
> >          cmd = 'echo "LOGFILE=/var/log/test" >> /etc/syslog-startup.conf'
> >          self.target.run(cmd)
> >          status, output = self.target.run('/etc/init.d/syslog restart')
> > +
> > +        # Error, most likely a race between shutting down and starting up
> > +        if status:
> > +            import time
> > +            timeout = time.time() + 30
> > +
> > +            while time.time() < timeout:
> > +                time.sleep(1)
> > +                # Verify the old ones are no longer running
> > +                status, output = self.target.run('kill -0 %s' %syslogd_pid)
> > +                if not status:
> > +                    self.logger.debug("old syslogd is running")
> > +                    continue
> > +
> > +                status, output = self.target.run('kill -0 %s' %klogd_pid)
> > +                if not status:
> > +                    self.logger.debug("old klogd is running")
> > +                    continue
> > +
> > +                # Verify the new ones are running
> > +                status, new_syslogd_pid = self.target.run('pidof syslogd')
> > +                if status:
> > +                    self.logger.debug("new syslogd is not running")
> > +                    continue
> > +
> > +                status, new_klogd_pid = self.target.run('pidof klogd')
> > +                if status:
> > +                    self.logger.debug("new klogd is not running")
> > +                    continue
> > +
> > +                # Everything is fine now, so keep running
> > +                status = 0
> > +                break
> > +
> >          msg = ('Could not restart syslog service. Status and output:'
> >                 ' %s and %s' % (status,output))
> >          self.assertEqual(status, 0, msg)
>
> Thanks, I think this is reasonable; however, I think we may need to make
> the above a function and then call it from other places in the tests in
> that file.
>
> test_syslog_restart should check it did restart using the above.
>
> test_syslog_startup_config does a second restart which we should also
> check?

Seems reasonable.  I'll crank out v2 shortly.

> Out of interest were you able to see error codes being returned in
> status in your tests?

I used code to force every error path during development, but I did not
check that the testcase would actually fail.  Your question did cause me
to notice a bug in the code that verifies the old ones are no longer
running.  'kill -0' returns 0 if the old process is still running, so if
the loop times out at that point, status is still 0 and the assert
outside of the loop does not fire.  I'll need to tweak this.  v2 will
have this fix as well.
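
For illustration only, here is a minimal sketch of how the polling could be factored into a helper that reports failure on timeout. This is not the v2 patch. The name wait_for_restart, its parameters, and the run callable (assumed to return a (status, output) tuple like self.target.run) are hypothetical.

```python
import time

def wait_for_restart(run, old_pids, names, timeout=30, interval=1,
                     sleep=time.sleep, clock=time.time):
    """Poll until the old daemons are gone and the new ones are up.

    run      -- hypothetical callable returning (status, output) for a
                shell command, assumed to behave like self.target.run
    old_pids -- pids recorded before the restart
    names    -- daemon names to look up with pidof
    Returns 0 on success and 1 if the timeout expires first.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        sleep(interval)
        # 'kill -0' exits 0 while the process still exists
        if any(run('kill -0 %s' % pid)[0] == 0 for pid in old_pids):
            continue
        # 'pidof' exits non-zero when no matching process is found
        if any(run('pidof %s' % name)[0] != 0 for name in names):
            continue
        return 0
    # Timed out: return non-zero explicitly so the caller's assert fails
    return 1
```

Returning an explicit result, rather than reusing the status of the last command, keeps a timeout from being reported as success. A test could then call it as status = wait_for_restart(self.target.run, [syslogd_pid, klogd_pid], ['syslogd', 'klogd']) after a failed restart.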

Thanks,
Jon

>
> Cheers,
>
> Richard
>