[oe] Bitbake parser work
Holger Freyther
zecke at selfish.org
Tue May 19 12:26:58 UTC 2009
Hey,
some brief notes about the horror of the current bitbake parser. I don't know
what state-of-the-art parsers do, and I don't have a book on compiler design,
but I would be surprised if they worked like bitbake's.
What is so horrible about our parser?
- It is line based, using python file.readline
- Each line gets matched against several regexps
- Some state is kept so that only a subset of the regexps is tried
- It parses the same files over and over again(*)
- Every regexp that matches is directly converted into
calls to bb.data.setVar*.
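To make the pattern above concrete, here is a minimal sketch of a line-based, regexp-driven parser that writes into the datastore the moment a line matches. It is not the bitbake code: the two regexps, the names feed and parse, the "__inherit" key and the plain dict standing in for bb.data are all assumptions made up for illustration.

```python
import io
import re

# Hypothetical, heavily simplified patterns; not the ones bitbake uses.
ASSIGN_RE = re.compile(r'^(?P<var>[A-Za-z0-9_]+)\s*=\s*"(?P<value>.*)"$')
INHERIT_RE = re.compile(r'^inherit\s+(?P<classes>.+)$')


def feed(line, data):
    """Try each regexp in turn and write straight into the datastore."""
    m = ASSIGN_RE.match(line)
    if m:
        # Stand-in for a direct bb.data.setVar* call.
        data[m.group("var")] = m.group("value")
        return
    m = INHERIT_RE.match(line)
    if m:
        # "__inherit" is an invented key, used only to record the classes.
        data.setdefault("__inherit", []).extend(m.group("classes").split())


def parse(fileobj, data):
    """Read the input one line at a time, as with file.readline."""
    while True:
        line = fileobj.readline()
        if not line:
            break
        line = line.strip()
        if line and not line.startswith("#"):
            feed(line, data)
    return data
```

Because matching and evaluation are fused in feed, nothing survives a parse except the side effects on data, so the same file has to be matched line by line again for every recipe that pulls it in.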
What have I done this time:
1.) In the function that matches a line of input, move all bb.data.set*
calls into methods
2.) move these methods to ast.py
3.) convert the methods to create nodes for our abstract syntax list
4.) immediately evaluate these nodes...
5.) kill the data parameter from feeder (line based matching) and
evaluate all nodes after they have been parsed
6.) on top of this I'm able to implement a cache so that files I
have already seen are not reparsed...
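The steps above can be sketched roughly as follows. This is only an illustration of the idea (the feeder builds nodes, evaluation happens afterwards, and the node list is cached), not the code from the patches: AssignNode, parse_text, evaluate, the stats counter and the dict used as the datastore are hypothetical names, and the cache is keyed by file name only, with no invalidation when a file changes.

```python
import re

# Hypothetical, simplified pattern; only plain assignments are handled here.
ASSIGN_RE = re.compile(r'^(?P<var>[A-Za-z0-9_]+)\s*=\s*"(?P<value>.*)"$')


class AssignNode:
    """One parsed statement; it touches the datastore only in eval()."""

    def __init__(self, var, value):
        self.var = var
        self.value = value

    def eval(self, data):
        data[self.var] = self.value


def feeder(line, statements):
    """Line-based matching without a data parameter: it only appends nodes."""
    m = ASSIGN_RE.match(line.strip())
    if m:
        statements.append(AssignNode(m.group("var"), m.group("value")))


_cache = {}            # file name -> list of nodes (no invalidation)
stats = {"parsed": 0}  # counts how often the regexps really had to run


def parse_text(name, text):
    """Return the statement list for a file, parsing it at most once."""
    if name in _cache:
        return _cache[name]
    stats["parsed"] += 1
    statements = []
    for line in text.splitlines():
        feeder(line, statements)
    _cache[name] = statements
    return statements


def evaluate(statements, data):
    """Apply all nodes to a datastore after parsing has finished."""
    for node in statements:
        node.eval(data)
    return data
```

With this split, evaluating the same class for two recipes means calling evaluate(parse_text(name, text), data) twice with different data dicts, while the regexp matching runs only the first time.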
What are known problems:
classes/package.bbclass messes with internals of the parser
and needs to be changed.
I am still looking for a good cache class in Python; currently all .bbclass and
.inc files are cached.
Where can it be found:
http://page.mi.fu-berlin.de/~freyther/bitbake/parser
What other tricks can we play:
- We should use ply (Python Lex-Yacc) to get the lexing time down...
- Or cache the parsed statement list..
obviously please review the patches, fix OE, get it merged :)
z.
(*) Every file that does "inherit autotools pkgconfig" causes these two
classes to be parsed again