[oe] Bitbake parser work

Tue May 19 12:26:58 UTC 2009

Hey,

Some brief notes about the horror of the current bitbake parser. I don't know 
what state-of-the-art parsers do, and I don't have a book on compiler design, 
but I would be surprised if they worked like bitbake's.

What is so horrible about our parser?

	- It is line based, using python file.readline
	- Each line gets matched against several regexps
	- Some state is kept so that only a subset of the regexps is tried
	- It parses the same files over and over again(*)
	- Every regexp match is directly converted into calls to
	  bb.data.setVar*.
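
To make the list above concrete, here is a rough sketch of that style of
parser. It is not the real bitbake code: the two regexps, the set_var helper
(a stand-in for bb.data.setVar working on a plain dict) and the INHERITED key
are all made up for illustration.

```python
import re

# Toy patterns for illustration; these are NOT the real bitbake regexps.
ASSIGN_RE = re.compile(r'^(?P<var>[A-Za-z_][A-Za-z0-9_]*)\s*=\s*"(?P<value>.*)"\s*$')
INHERIT_RE = re.compile(r'^inherit\s+(?P<classes>.+)$')


def set_var(data, key, value):
    # Hypothetical stand-in for bb.data.setVar; data is a plain dict here.
    data[key] = value


def feeder(line, data):
    """Match one line against each regexp and mutate the datastore at once."""
    line = line.rstrip("\n")
    m = ASSIGN_RE.match(line)
    if m:
        set_var(data, m.group("var"), m.group("value"))
        return
    m = INHERIT_RE.match(line)
    if m:
        # INHERITED is a made-up key that just records the class names.
        data.setdefault("INHERITED", []).extend(m.group("classes").split())


def parse_lines(lines, data):
    # Parsing and evaluation are fused: nothing reusable is left afterwards.
    for line in lines:
        feeder(line, data)
    return data
```

Because the result of matching goes straight into the datastore, there is
nothing to keep around, so every recipe inheriting the same class has to run
all the regexps over that class again.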

What have I done this time:
	1.) In the function that matches a line of input, move all bb.data.set*
	    calls into methods
	2.) move these methods to ast.py
	3.) convert the methods to create nodes for our abstract syntax list
	4.) immediately evaluate these nodes...
	5.) kill the data parameter from feeder (line based matching) and
	    evaluate all nodes after they have been parsed
	6.) on top of this I'm able to implement a cache so that files I have
	    already seen are not reparsed...
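
The steps above could look roughly like the sketch below. Again this is not
the code from the patches: AssignNode, InheritNode, the toy regexps and the
INHERITED key are invented names, and the datastore is a plain dict. The point
is only that feeder no longer takes a data parameter and that evaluation is a
separate pass over the statement list.

```python
import re

# Toy patterns for illustration; these are NOT the real bitbake regexps.
ASSIGN_RE = re.compile(r'^(?P<var>[A-Za-z_][A-Za-z0-9_]*)\s*=\s*"(?P<value>.*)"\s*$')
INHERIT_RE = re.compile(r'^inherit\s+(?P<classes>.+)$')


class AssignNode:
    """Hypothetical node for VAR = "value"."""

    def __init__(self, var, value):
        self.var = var
        self.value = value

    def eval(self, data):
        # Stand-in for bb.data.setVar on a plain dict.
        data[self.var] = self.value


class InheritNode:
    """Hypothetical node for an inherit line."""

    def __init__(self, classes):
        self.classes = classes

    def eval(self, data):
        data.setdefault("INHERITED", []).extend(self.classes)


def feeder(line, statements):
    """Match one line and append a node; the datastore is not touched."""
    line = line.rstrip("\n")
    m = ASSIGN_RE.match(line)
    if m:
        statements.append(AssignNode(m.group("var"), m.group("value")))
        return
    m = INHERIT_RE.match(line)
    if m:
        statements.append(InheritNode(m.group("classes").split()))


def parse(lines):
    statements = []
    for line in lines:
        feeder(line, statements)
    return statements


def evaluate(statements, data):
    # Separate pass: the same statement list can be replayed on any datastore.
    for node in statements:
        node.eval(data)
    return data
```

Since the statement list does not depend on any datastore, it can be parsed
once and then evaluated for every recipe that needs it, which is what makes
the cache in 6.) possible.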

What are known problems:
	classes/package.bbclass messes with internals of the parser
	and needs to be changed.

	I am still looking for a good cache class in Python; currently all
	.bbclass and .inc files are cached.
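
For what it's worth, one possible shape for such a cache is sketched below.
This is an assumption on my side and not what the patches do: ParseCache and
parse_fn are made-up names, and entries are invalidated by comparing the
file's mtime and size from os.stat.

```python
import os


class ParseCache:
    """Hypothetical cache: path -> parsed result, invalidated on file change."""

    def __init__(self, parse_fn):
        # parse_fn is any callable taking a path and returning a parsed result.
        self._parse_fn = parse_fn
        self._entries = {}

    def get(self, path):
        st = os.stat(path)
        stamp = (st.st_mtime_ns, st.st_size)
        entry = self._entries.get(path)
        if entry is not None and entry[0] == stamp:
            # File looks unchanged since the last parse, reuse the result.
            return entry[1]
        result = self._parse_fn(path)
        self._entries[path] = (stamp, result)
        return result
```

The cache never evicts anything, which is probably fine for the number of
.bbclass and .inc files we have, but it is the first thing to revisit if
memory becomes a problem.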

Where can it be found:
	http://page.mi.fu-berlin.de/~freyther/bitbake/parser

What other tricks can we play:
	- We should use ply to get the lexing time down...
	- Or cache the parsed statement list...

Obviously, please review the patches, fix OE, get it merged :)

z.

(*) Every file doing inherit autotools, pkgconfig will cause these two 
classes to be parsed again