[OE-core] [PATCH 1/2] systemd: test target boot time against unit timeout

Brian Avery avery.brian at gmail.com
Fri Oct 23 16:23:59 UTC 2015


On Fri, Oct 23, 2015 at 8:57 AM, Benjamin Esquivel
<benjamin.esquivel at linux.intel.com> wrote:
> This reminds me I haven't been able to parse your logs to check your
> data. If what you say is true, that the failure is presented when the
> system (kernel + userspace) booted in less than 90 secs, then I can
> make 2 observations:
>
> 1. your data and Randy's diverge, Randy's data has every systemd
> failure tied to a timeout message and a boot time greater than the
> unit's timeout. I would happily investigate further, if you send me
> your logs.
I'll stick them in dropbox and send you a link sometime today. they are large.

iirc, Randy saved the failures; I saved the failures and successes.
My main point with respect to the time data was that a number of the successes
also had log boot times.

-b

> 2. my test would not be really as effective as I think to catch the
> condition.
>
> Things to take into account:
>
> A. I'm using Randy's data to compare against yours because you both
> used the same method. I have my data that I obtained with a different
> method that supports my point too.
> B. (Randy/Brian)'s method and Benjamin's method threw data that they 3
> assume is the condition that made the autobuilder's fail, while I don't
> say they are wrong, this is not data coming out of the actual failures
> of the autobuilder. Having that said, I would not disprove your
> nice/ionice change nor adding this test to check if it can be a single
> trap for the systemd errors which right now have too many faces.
>
> Benjamin
>
> On Thu, 2015-10-22 at 23:10 +0000, Avery, Brian wrote:
>> The problem with this is one of the runs I did had a lot of systemd
>> timeout failures but the stats were like:
>>
>> Fails:
>> min startup time: 54 sec
>> avg startup time: 114 sec
>> max startup time: 260 sec
>>
>> Successes:
>> min startup time: 46 sec
>> avg startup time: 110 sec
>> max startup time: 266 sec
>>
>> so I¹m not sure that the time is a good indicator.
>>
>> These numbers are pulled from the jounalctl line in the
>> ssh_target_log
>> file that looks like:
>>
>> Oct 21 11:19:40 qemux86 systemd[1]: Startup finished in 42.255s
>> (kernel) +
>> 3min 10.075s (userspace) = 3min 52.331s.
>>
>>
>> -b
>>
>>
>>
>> On 10/22/15, 3:45 PM, "Benjamin Esquivel"
>> <benjamin.esquivel at linux.intel.com> wrote:
>>
>> > the new test:
>> > oeqa.runtime.systemd.SystemdJournalTests.test_systemd_boot_time
>> > will query the target boot time from journactl and will
>> > compare it against the default systemd's unit start timeout:
>> > (TimeoutStartSec)
>> >
>> > if the boot time is greater than the timeout, it will fail
>> >
>> > the reason is because once the boot time exceeds the unit start
>> > timeout then it is uncertain that the systemd units are able to
>> > properly come up and function
>> >
>> > when this test fails, the rest of the systemd units (services)
>> > should be
>> > skipped
>> >
>> > this test also prints the startup time in the test log like:
>> > ...
>> > test_systemd_boot_time (oeqa.runtime.systemd.SystemdJournalTests)
>> > ...
>> > Startup finished in 6.922s (kernel) + 52.089s (userspace) =
>> > 59.011s.
>> > ...
>> >
>> > [YOCTO#8141]
>> >
>> > Signed-off-by: Benjamin Esquivel <benjamin.esquivel at linux.intel.com
>> > >
>> > ---
>> > meta/lib/oeqa/runtime/systemd.py | 73
>> > ++++++++++++++++++++++++++++++++++++++++
>> > 1 file changed, 73 insertions(+)
>> >
>> > diff --git a/meta/lib/oeqa/runtime/systemd.py
>> > b/meta/lib/oeqa/runtime/systemd.py
>> > index c74394c..251d06e 100644
>> > --- a/meta/lib/oeqa/runtime/systemd.py
>> > +++ b/meta/lib/oeqa/runtime/systemd.py
>> > @@ -21,6 +21,33 @@ class SystemdTest(oeRuntimeTest):
>> >         self.assertEqual(status, expected, message)
>> >         return output
>> >
>> > +    def journalctl(self, args='',l_match_units=[]):
>> > +        """
>> > +        Request for the journalctl output to the current target
>> > system
>> > +
>> > +        Arguments:
>> > +        -args, an optional argument pass through argument
>> > +        -l_match_units, an optional list of units to filter the
>> > output
>> > +        Returns:
>> > +        -string output of the journalctl command
>> > +        Raises:
>> > +        -AssertionError, on remote commands that fail
>> > +        -ValueError, on a journalctl call with filtering by
>> > l_match_units that
>> > +        returned no entries
>> > +        """
>> > +        query_units=""
>> > +        if len(l_match_units):
>> > +            query_units = ['_SYSTEMD_UNIT='+unit for unit in
>> > l_match_units]
>> > +            query_units = " ".join(query_units)
>> > +        command = 'journalctl %s %s' %(args, query_units)
>> > +        status, output = self.target.run(command)
>> > +        if status:
>> > +            raise AssertionError("Command '%s' returned non-zero
>> > exit \
>> > +                    code %d:\n%s" % (command, status, output))
>> > +        if len(output) == 1 and "-- No entries --" in output:
>> > +            raise ValueError("List of units to match: %s, returned
>> > no
>> > entries"
>> > +                    % l_match_units)
>> > +        return output
>> >
>> > class SystemdBasicTests(SystemdTest):
>> >
>> > @@ -99,3 +126,49 @@ class SystemdJournalTests(SystemdTest):
>> >     def test_systemd_journal(self):
>> >         (status, output) = self.target.run('journalctl')
>> >         self.assertEqual(status, 0, output)
>> > +
>> > +    @skipUnlessPassed('test_systemd_basic')
>> > +    def test_systemd_boot_time(self, systemd_TimeoutStartSec=90):
>> > +        """
>> > +        Compare the target boot time from journalctl against
>> > TimeoutStartSec
>> > +
>> > +        Arguments:
>> > +        -systemd_TimeoutStartSec, an optional argument containing
>> > systemd's
>> > +        unit start timeout
>> > +        """
>> > +
>> > +        expr_items=["Startup finished","kernel",
>> > "userspace","\.$"]
>> > +        try:
>> > +            output = self.journalctl(args="-o cat --reverse")
>> > +        except AssertionError:
>> > +            self.fail("Error occurred while calling journalctl")
>> > +        if not len(output):
>> > +            self.fail("Error: unable to obtain the startup time
>> > from\
>> > +                    systemd journal")
>> > +
>> > +        # check for the regular expression items that match the
>> > startup
>> > time
>> > +        for line in output.split('\n'):
>> > +            check_match =
>> > "".join(re.findall(".*".join(expr_items),
>> > line))
>> > +            if check_match: break
>> > +        # put the startup time in the test log
>> > +        if check_match:
>> > +            print "\n%s" % check_match
>> > +        else:
>> > +            self.fail("Error while obtaining the boot time from
>> > journalctl")
>> > +        boot_time_sec = 0
>> > +
>> > +        # get the numeric values from the string and convert them
>> > to
>> > seconds
>> > +        # same data will be placed in list and string for
>> > manipulation
>> > +        l_boot_time = check_match.split(" ")[-2:]
>> > +        s_boot_time = " ".join(l_boot_time)
>> > +        # Obtain the minutes it took to boot
>> > +        if l_boot_time[0].endswith('min') and
>> > l_boot_time[0][0].isdigit():
>> > +            boot_time_min = s_boot_time.split("min")[0]
>> > +            # convert to seconds and accumulate it
>> > +            boot_time_sec += int(boot_time_min) * 60
>> > +        # Obtain the seconds it took to boot and accumulate
>> > +        boot_time_sec += float(l_boot_time[1].split("s")[0])
>> > +        #Assert the target boot time against systemd's unit start
>> > timeout
>> > +        msg = "Target boot time %s exceeds systemd's
>> > TimeoutStartSec %s"\
>> > +                %(boot_time_sec, systemd_TimeoutStartSec)
>> > +        self.assertTrue(boot_time_sec < systemd_TimeoutStartSec,
>> > msg =
>> > msg)
> --
> _______________________________________________
> Openembedded-core mailing list
> Openembedded-core at lists.openembedded.org
> http://lists.openembedded.org/mailman/listinfo/openembedded-core



More information about the Openembedded-core mailing list