[OE-core] Quality of meta-oe metadata

Tue Apr 1 17:40:47 UTC 2014

On Tue, Apr 01, 2014 at 12:12:58PM -0500, Mark Hatle wrote:
> On 3/29/14, 8:31 PM, Martin Jansa wrote:
> > Hi, sorry for the longer e-mail. This is one of the topics I would like
> > to discuss at OEDAM (http://openembedded.org/wiki/OEDAM), but having some
> > feedback and thoughts in advance will be very useful.
> >
> > As people can notice from my "State of bitbake world" e-mails or
> > http://www.openembedded.org/wiki/Bitbake_World_Status
> > we have never had "green" builds. There are always 20+ failed tasks in
> > those big builds, and just reading the numbers isn't a good indicator of
> > quality: the sooner you break something in the dependency tree, the fewer
> > recipes will actually be tested, so fewer failed tasks often means that
> > something important is broken.
> 
> ...
> 
> > 3) OE releases work great and don't invalidate sstate signatures so often, so my
> >     feeling is that most developers and projects are just using releases and
> >     less and less people do CI. People will start complaining that something
> >     is broken in meta-oe only when they are upgrading their project from 1.5 to
> >     1.6 when 1.6 is released and that could be too late for fixing meta-oe
> >     issues.
> 
> I agree, the success of what we're doing is certainly causing us 'different' 
> problems.  :)
> 
> > What I'm trying to do with it:
> >
> > a) sending those e-mails and updating wiki, so that people can easily find
> >     if some build failure is common or something which happens only for them
> >     (something like oestats-client.bbclass page was providing in oe-classic)
> >     It also includes log of QA issues which are usually easy to fix and great
> >     way for new people to learn something about OE.
> > b) trying to refuse all patches which cause a new world issue (or a new QA
> >     warn/err) - these are sometimes missed in logs, because a new issue is
> >     often "hidden" by some other issue and it's hard to compare 40 issues
> >     from the previous build with 38 from the current one.
> >     Also the issues are often triggered later by changes somewhere else...
> > c) fixing build/qa issues in recipes I've never used or don't even have
> >     hardware to test - just based on assumption that something which builds
> >     is better than broken build, even when it can have some issues in runtime.
> > d) contacting people who added the recipe which is now failing, often
> >     without reply for months even when I try it multiple times :/
> 
> I agree with all of the above.  In fact I suspect you are going above and beyond 
> what you really need to.  Kudos for that BTW.
> 
> > e) moving to "nonworking" directory to mark it as "known-to-be-broken",
> >     last resort for recipes where the fix is complicated and it's not known
> >     if someone is actually using it (because it was broken for months and
> >     nobody replied).
> >     + easy to find them, because they are still in repository (instead of
> >       git rm + revert when someone fixes it)
> >     - layer index probably doesn't find them, because "nonworking" directory
> >       level isn't in BBFILES, so maybe meta-broken or meta-nonworking would be
> >       better
> >     ? some recipes are "broken" just because their dependency is broken, what
> >       to do with such recipe, I usually just say that in commit message when
> >       I'm moving them to "nonworking" with their broken dep.
> 
> Have you considered using the blacklist system for this?
> 
> You could do something like:
> 
> conf/layer.conf:
> include ${LAYERDIR}/conf/broken.inc
> 
> conf/broken.inc:
> 
> <can we ensure the blacklist system is in the system>
> 
> BROKENMSG_layername = "The recipe is disabled due to a build failure.  If you 
> need this recipe, or have gotten it to work, please submit patches to <path>. 
> Otherwise this recipe will be removed in the future."
> 
> # Recipe FOO is broken as of 2014-03-14, see ...
> PNBLACKLIST[FOO] = "${BROKENMSG_layername}"
> 
> # Recipe BAR is broken as of 2013-06-13, see ...
> PNBLACKLIST[BAR] = "${BROKENMSG_layername}"
> 
> 
> Then after a given amount of time, say one year? on the broken list -- we can 
> then remove the items.
> 
> If the format of the comments is such that it can be easily parsed, then we can 
> even automate tracking of these things.
> 
> (In cases where dependencies are causing the breakage, the message can be 
> augmented with that information as well...)
> 
> The advantage of the blacklist system is that if a user tries to use the recipe 
> they will hopefully see the blacklist message, it prevents having to git mv 
> recipes, and should be easier for people to find/fix the bad code via a simple 
> patch.  (And hopefully easier to remove old cruft!)

Yes, that's another way of doing it, and I was using it for world builds
as well (but without including it in the layer and layer.conf to make it
"public").

e.g.
http://logs.nslu2-linux.org/buildlogs/oe/oe-shr-core-branches/log.world.20140329_001343.log/world_mask.inc

It definitely has the advantage that you can "document" the breakage in
the message and put a few more details in the file itself.

The disadvantage from my POV was that I never included and enabled it in
the repo, so new people didn't know about it and would still see the
issues when they tried to build something broken.

Another disadvantage was that I always felt, "OK, I'll mark this as broken
with PNBLACKLIST and let's forget that it ever existed" (sometimes I've
uncommented include lines for this just to confirm that everything still
fails - but not as often as "regular" builds).

And the last one: if I recall correctly, when I was using this it was hard
to unblacklist something in your config, so if you wanted to test a newer
version or something you had to modify world_mask.inc first, which won't
be very good for people if we include it by default.
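
On the point about keeping the comments in broken.inc easy to parse: below
is a minimal Python sketch of what automated tracking could look like. It
assumes every entry keeps exactly the "# Recipe FOO is broken as of
YYYY-MM-DD" form from the example above. The names parse_broken and
removal_candidates and the 365-day default are hypothetical, picked only
for illustration; this is not an existing tool.

```python
import re
from datetime import date

# Assumed comment format, copied from the broken.inc example in this thread:
#   # Recipe FOO is broken as of 2014-03-14, see ...
COMMENT_RE = re.compile(
    r"^#\s*Recipe\s+(?P<pn>\S+)\s+is broken as of\s+(?P<since>\d{4}-\d{2}-\d{2})"
)


def parse_broken(text):
    """Return {recipe_name: date_marked_broken} for every matching comment."""
    entries = {}
    for line in text.splitlines():
        match = COMMENT_RE.match(line.strip())
        if match:
            year, month, day = map(int, match.group("since").split("-"))
            entries[match.group("pn")] = date(year, month, day)
    return entries


def removal_candidates(entries, today, max_age_days=365):
    """Recipes that have been on the broken list longer than max_age_days."""
    return sorted(
        pn for pn, since in entries.items()
        if (today - since).days > max_age_days
    )


# Sample input: the two entries from the example, with "see ..." left elided.
SAMPLE = """\
# Recipe FOO is broken as of 2014-03-14, see ...
PNBLACKLIST[FOO] = "${BROKENMSG_layername}"

# Recipe BAR is broken as of 2013-06-13, see ...
PNBLACKLIST[BAR] = "${BROKENMSG_layername}"
"""

if __name__ == "__main__":
    broken = parse_broken(SAMPLE)
    print(removal_candidates(broken, date(2014, 7, 1)))
```

With the two sample entries and 2014-07-01 as "today", only BAR has been
listed for more than a year, so it is the only removal candidate. The
script reads only the comments, not the PNBLACKLIST lines, so it relies on
the two being kept in sync by hand.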

Regards,

> --Mark
> 
> > What can we do better? How to motivate more people to do CI and send fixes?
> > When we get to "green" state it will be easier to quickly spot new issues and
> > easier to fix them, because it will be clear what's causing them.
> >
> >
> >

-- 
Martin 'JaMa' Jansa     jabber: Martin.Jansa at gmail.com