[OE-core] [PATCH v2] pseudo: Upgrade to latest to fix openat() with a directory symlink [NAK]

Randy MacLeod randy.macleod at windriver.com
Wed Aug 14 16:02:42 UTC 2019


On 8/6/19 2:51 AM, Martin Jansa wrote:
> This is the same reproducer I am using in:
> https://bugzilla.yoctoproject.org/show_bug.cgi?id=12434
> but with this SRCREV I haven't reproduced it yet in first 500 
> iterations, so it's definitely improving for me (used to reproduce it at 
> least once in first 500 iterations)
> 
> Now I'm testing the reproducer with "qmake -install qinstall".

Any update Martin?



Using a variation of Juro's script and adding a little stress-ng load,
it _seems_ that I can make the problem happen more quickly than without
system stress, but it's a shared system, so _seems_ is underlined.

Using stress-ng was supposed to be a quick check to see if I could
get the reproducer down to minutes rather than around an hour.

Results are promising so I'll continue to use this approach as
I add debugging to pseudo and add an inline, immediate check in
the context of:
 
http://cgit.openembedded.org/openembedded-core/tree/meta/recipes-core/glibc/glibc-locale.inc?h=master#n72
to see if the UID/GID are equal to my UID/GID.
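
A minimal sketch of what such an ownership check could look like, run
from outside the build. The helper name find_host_owned and the
directory argument are hypothetical; nothing like this exists in
glibc-locale.inc today. It lists every path whose numeric owner or
group matches the given IDs, defaulting to those of the invoking user:

```shell
#!/bin/bash
# Hypothetical helper (not part of OE-core): print every path under a
# directory whose numeric owner or group matches the given UID/GID.
# Both IDs default to those of the user running the script.
find_host_owned() {
    local dir=$1
    local uid=${2:-$(id -u)}
    local gid=${3:-$(id -g)}
    find "$dir" \( -uid "$uid" -o -gid "$gid" \) -print
}

# Example (the path is an assumption, adjust to the tree being checked):
#   find_host_owned /path/to/packages-split 1000 1000
```

Inside the pseudo context, id would most likely report the faked IDs
rather than the real ones, so the host UID/GID would probably have to be
captured beforehand and passed in explicitly.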

Test run summaries are below.

../Randy



cat src/distro/yocto/b/uid-diff/glibc-locale-stress
#!/bin/bash
# Rebuild glibc-locale repeatedly under stress-ng load and stop at the
# first iteration whose log reports host-user-contaminated.

fname='glibc-locale_master_august13'
max=100
for (( i=1; i <= max; i++ ))
do
     log="${fname}_$i.log"
     echo "$i/$max  $log"
     # Redirect stdout first, then stderr, so both streams are captured.
     bitbake glibc-locale -c cleanall > /dev/null 2>&1
     # add some stress
     stress-ng -t 1000 --switch 8 --switch-freq 50000 &
     bitbake glibc-locale > "$log" 2>&1
     # Destress
     killall -9 stress-ng
     if grep -q "host-user-contaminated" "$log"; then
         echo "error !"
         exit 2
     #else
         #rm "$log"
     fi
done
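
On a shared machine, killall -9 stress-ng also hits any stress-ng
instance that belongs to someone else. A possible alternative, sketched
below, is to remember the PID of the load process the script started
and signal only that one. The function name run_under_load is made up
for illustration, and the sketch assumes that the stress-ng parent
stops its own workers when it receives SIGTERM:

```shell
#!/bin/bash
# Hypothetical wrapper: start a background load command, run a workload,
# then stop only the load process started here.
run_under_load() {
    local load=$1          # load command as one string
    shift
    $load &                # unquoted on purpose: split into command + args
    local load_pid=$!
    "$@"                   # the workload, e.g. a bitbake invocation
    local status=$?
    kill "$load_pid" 2>/dev/null    # SIGTERM to our own load process only
    wait "$load_pid" 2>/dev/null    # reap it before returning
    return "$status"
}

# Example (inside the loop above):
#   run_under_load "stress-ng -t 1000 --switch 8 --switch-freq 50000" \
#       bitbake glibc-locale
```

The wrapper returns the exit status of the workload, so the caller can
still test the build result directly.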


On a (shared) system where lscpu shows 128 cores,
with no stress:

Trial   Iteration of first error
1       44
2       19


stress-ng -t 1000 --switch 8 --switch-freq 50000

50000 was just the frequency that generated a load that was high enough
but not too high. On this system, each process used ~30% of a cpu.

Trial   Iteration of first error
1       3
2       18


stress-ng -t 1000 --switch 16 --switch-freq 50000

Trial   Iteration of first error
1       3
2       1
3       11

stress-ng -t 1000 --switch 32 --switch-freq 50000

Trial   Iteration of first error
1       2
2       9
3       8


stress-ng -t 1000 --switch 64 --switch-freq 50000

Trial   Iteration of first error
1       4
2       13
3       >6


stress-ng -t 1000 --mq 64
  128 processes using 98% cpu each

Trial   Iteration of first error
1       14
2       NaN

Trial 2 was precluded by other users of the shared system complaining!
The idea was to cause more rapid context switches. Later, I might try
this again with, say, 16 workers. If anyone has a better idea, please
reply.
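
To compare the settings at a glance, the trial tables above can be
reduced to a mean iteration of first error per load setting. This is
only a rough sketch: the setting labels are made up for the example,
and the ">6" and "NaN" runs are left out because they never reached an
error.

```shell
#!/bin/bash
# Read "<setting> <iteration>" pairs on stdin and print the mean
# iteration of first error for each setting.
mean_first_failure() {
    awk '{ sum[$1] += $2; n[$1]++ }
         END { for (k in sum) printf "%s %.1f\n", k, sum[k] / n[k] }' | sort
}

# Data copied from the trial tables; labels are illustrative only.
mean_first_failure <<'EOF'
none 44
none 19
switch8 3
switch8 18
switch16 3
switch16 1
switch16 11
switch32 2
switch32 9
switch32 8
switch64 4
switch64 13
mq64 14
EOF
```

With two or three trials per setting, the means are only a hint and not
a measurement.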

EOM

> 
> Regards,
> 
> On Tue, Aug 6, 2019 at 12:43 AM Bystricky, Juro 
> <juro.bystricky at intel.com <mailto:juro.bystricky at intel.com>> wrote:
> 
>     I can reproduce the problem fairly easily  (and, sadly even with the
>     latest commits as 060058bb29f70b244e685b3c704eb0641b736f73 ).
>     In my case, it seems easy to reproduce if I have 40+ threads running.
>     The reproducer script (below) fails typically within the first 10
>     iterations.
> 
> 
>     #!/bin/bash
> 
>     fname='glibc-locale_master_august8'
>     max=1000
>     for (( i=1; i <= $max; i++ ))
>     do
>          echo "$i/$max  ${fname}_$i.log"
>          bitbake glibc-locale -c cleanall 2>&1 > /dev/null
>          bitbake glibc-locale 2>&1 > ${fname}_$i.log
>           if grep -q "host-user-contaminated" ${fname}_$i.log; then
>              echo "error !"
>            exit 2
>          #else
>            #rm ${fname}_$i.log
>          fi
> 
>     done
> 
>     ________________________________________
>     From: openembedded-core-bounces at lists.openembedded.org
>     <mailto:openembedded-core-bounces at lists.openembedded.org>
>     [openembedded-core-bounces at lists.openembedded.org
>     <mailto:openembedded-core-bounces at lists.openembedded.org>] on behalf
>     of Seebs [seebs at seebs.net <mailto:seebs at seebs.net>]
>     Sent: Saturday, August 03, 2019 7:23 AM
>     To: Khem Raj
>     Cc: openembedded-core at lists.openembedded.org
>     <mailto:openembedded-core at lists.openembedded.org>
>     Subject: Re: [OE-core] [PATCH v2] pseudo: Upgrade to latest to fix
>     openat() with a directory symlink [NAK]
> 
>     On Sat, 3 Aug 2019 05:33:46 -0700
>     Khem Raj <raj.khem at gmail.com <mailto:raj.khem at gmail.com>> wrote:
> 
>      > Will this fix the file ownership issue that we see with Glibc-locale
>      > packages from time to time?
> 
>     I have no idea. Since I haven't got a reliable reproducer for it, I
>     can't test it in a sane way.
> 
>     -s
>     --
>     _______________________________________________
>     Openembedded-core mailing list
>     Openembedded-core at lists.openembedded.org
>     <mailto:Openembedded-core at lists.openembedded.org>
>     http://lists.openembedded.org/mailman/listinfo/openembedded-core
> 
> 


-- 
# Randy MacLeod
# Wind River Linux
