Vc%csHdZddlZddlmZmZGddejZdS)aConvert graminit.[ch] spit out by pgen to Python code. Pgen is the Python parser generator. It is useful to quickly create a parser from a grammar file in Python's grammar notation. But I don't want my parsers to be written in C (yet), so I'm translating the parsing tables to Python data structures and writing a Python parse engine. Note that the token numbers are constants determined by the standard Python tokenizer. The standard token module defines these numbers and their names (the names are not used much). The token numbers are hardcoded into the Python tokenizer and into pgen. A Python implementation of the Python tokenizer is also available, in the standard tokenize module. On the other hand, symbol numbers (representing the grammar's non-terminals) are assigned by pgen based on the actual grammar input. Note: this module is pretty much obsolete; the pgen module generates equivalent grammar tables directly from the Grammar.txt input file without having to invoke the Python pgen C program. N)grammartokencs*eZdZdZdZdZdZdZdS) Convertera2Grammar subclass that reads classic pgen output files. The run() method reads the tables as produced by the pgen parser generator, typically contained in two C files, graminit.h and graminit.c. The other methods are for internal use only. See the base class for more documentation. cs|||||dS)z|r*t|d|d |\|\}}t|}||j|<||j|<d S) zParse the .h file written by pgen. (Internal) This file is a sequence of #define statements defining the nonterminals of the grammar as numbers. We build two tables mapping the numbers to names and back. Can't open : NFrz^#define\s+(\w+)\s+(\d+)$z(z): can't parse T) openOSErrorprintZ symbol2numberZ number2symbolrematchZstripgroupsint) rfilenameferrlinenolinemosymbolnumbers rrzConverter.parse_graminit_h5s/ XAA    E337 8 8 855555   4 4D aKF6==B 4$**,, 4(((FFF26**,,,@AAAA"$V.4"6*-3"6**ts <7<c s t|}n-#t$r }td|d|Yd}~dSd}~wwxYwd}|dzt|}}|dzt|}}|dzt|}}i}g}|drf|drt jd|}ttt| \} } } g} t| D]y} |dzt|}}t jd |}ttt| \}}| ||fz|dzt|}}| || | f<|dzt|}}|dt jd |}ttt| \}}g}t|D]} |dzt|}}t jd |}ttt| \} } } || | f} | | | ||dzt|}}|dzt|}}|df||_ i}t jd |}t|d}t|D]#}|dzt|}}t jd |}|d}ttt|dddd\}}}}||}|dzt|}}t jd|}i}t|d}t!|D]9\}}t#|}tdD]}|d|zzr d||dz|z<:||f||<%|dzt|}}||_g}|dzt|}}t jd|}t|d}t|D]}|dzt|}}t jd|}| \}}t|}|dkrd}nt|}| ||f|dzt|}}||_|dzt|}}|dzt|}}t jd|}t|d}|dzt|}}|dzt|}}t jd|}t|d}|dzt|}}t jd|}t|d} | |_|dzt|}} |dzt|}}dS#t*$rYdSwxYw)aParse the .c file written by pgen. (Internal) The file looks as follows. The first two lines are always this: #include "pgenheaders.h" #include "grammar.h" After that come four blocks: 1) one or more state definitions 2) a table defining dfas 3) a table defining labels 4) a struct defining the grammar A state definition has the following form: - one or more arc arrays, each of the form: static arc arcs__[] = { {, }, ... }; - followed by a state array, of the form: static state states_[] = { {, arcs__}, ... }; r r NFrr z static arc z)static arc arcs_(\d+)_(\d+)\[(\d+)\] = {$z\s+{(\d+), (\d+)},$z'static state states_(\d+)\[(\d+)\] = {$z\s+{(\d+), arcs_(\d+)_(\d+)},$zstatic dfa dfas\[(\d+)\] = {$z0\s+{(\d+), "(\w+)", (\d+), (\d+), states_(\d+),$iiiiz\s+("(?:\\\d\d\d)*")},$iz!static label labels\[(\d+)\] = {$z\s+{(\d+), (0|"\w+")},$Z0z \s+(\d+),$z\s+{(\d+), labels},$z \s+(\d+)$)rrrZnextZ startswithrrZlistZmaprrZrangeZappendstatesZgroupZeval enumerateZorddfaslabelsstartZ StopIteration)!rrrrrrZallarcsrrZnZmZkZarcsZ_ZiZjZsZtZstaterZndfasrrZxZyZzZfirstZ rawbitsetZcZbyter Znlabelsr!s! rrzConverter.parse_graminit_cTs8 XAA    E337 8 8 855555 axaaxaaxaoom,,! -//-00 1XJ"$$s3 44551aq((A#)!8T!WWDF"8$??BC 5 566DAqKKA''''%axa"&A%axa//-00 1 DdKKBC--..DAqE1XX # #%axaX?FFs3 44551aq!t} T"""" MM% !!8T!WWDF!!8T!WWDFCoom,,! -D  X6 = =BHHQKK  u * *A!!8T!WWDFM  BXXa[[F"3sBHHQ1a,@,@#A#ABBOFAq!1IE!!8T!WWDF4d;;BERXXa[[))I!),, + +11vvq++Aq!t}+)*acAg+"5>DLLaxa axa X:D A Abhhqkk""w " "A!!8T!WWDF4d;;B99;;DAqAACx GG MM1a& ! ! ! !axa axaaxa XmT * *BHHQKK  axaaxa X-t 4 4bhhqkk""axa XlD ) )BHHQKK   axa %!!8T!WWDFFF    DD s" <7<)Z?? [  [ csi|_i|_t|jD]1\}\}}|tjkr | ||j|<%| ||j|<2dS)z1Create additional useful structures. (Internal).N)ZkeywordsZtokensrr rZNAME)rZilabelZtypeZvalues rrzConverter.finish_offsv  %.t{%;%; + + !FMT5uz! +e +'- e$$ +$* D!  + +r N)Z__name__Z __module__Z __qualname____doc__r rrrr rrr$s^ >c%c%c%J+++++r r)r"rZpgen2rrZGrammarrr#r rr$st4 ! ]+]+]+]+]+]+]+]+]+]+r