lburg.1 (4780B)
1 .TH LBURG 1 "local \- 11/30/94" 2 .\" $Id: lburg.1 145 2001-10-17 21:53:10Z timo $ 3 .SH NAME 4 lburg \- lcc's code-generator generator 5 .SH SYNOPSIS 6 .B lburg 7 [ 8 .I option 9 ]... 10 [ [ 11 .I input 12 ] 13 .I output 14 ] 15 .br 16 .SH DESCRIPTION 17 .PP 18 .I lburg 19 reads an lcc-style BURG specification from 20 .I input 21 and writes a pattern-matching code generator to 22 .IR output . 23 If 24 .I input 25 is `\-' or is omitted, 26 .I lburg 27 reads the standard input; 28 If 29 .I output 30 is `\-' or is omitted, 31 .I lburg 32 writes to the standard output. 33 .PP 34 .I lburg 35 accepts specifications that conform to the following EBNF grammar. 36 Terminals are enclosed in single quotes or are 37 given in uppercase, all other symbols are nonterminals or English phrases, 38 {X} denotes zero or more instances of X, and [X] denotes an optional X. 39 .PP 40 .nf 41 .RS 42 .ft CW 43 spec: `%{' configuration `%}' { dcl } `%%' { rule } 44 [ `%%' C code ] 45 46 dcl: `%start' nonterm 47 `%term' { ID `=' INT } 48 49 rule: nonterm `:' tree template [ C expression ] 50 51 tree: term `(' tree `,' tree `)' 52 term `(' tree `)' 53 term 54 nonterm 55 56 nonterm: ID 57 58 template: `"' { any character except double quote } `"' 59 .RE 60 .fi 61 .PP 62 Specifications are structurally similar to 63 .IR yacc 's. 64 Text between 65 `\f(CW%{\fP' 66 and 67 `\f(CW%}\fP' 68 is called the configuration section; there may be several such segments. 69 All are concatenated and copied verbatim into the head of the output. 70 Text after the second 71 `\f(CW%%\fP', 72 if any, is also copied verbatim into the output, at the end. 73 .PP 74 Specifications consist of declarations, a 75 `\f(CW%%\fP' 76 separator, and rules. 77 Input is line-oriented; each declaration and rule must appear on a separate line, 78 and declarations must begin in column 1. 79 Declarations declare terminals \(em the operators in subject 80 trees \(em and associate a unique, positive external symbol 81 number with each one. 82 Nonterminals are declared by their presence 83 on the left side of rules. The 84 \f(CW%start\fP 85 declaration optionally declares a nonterminal as the start symbol. 86 In the grammar above, 87 \f(CWterm\fP 88 and 89 \f(CWnonterm\fP 90 denote identifiers that are terminals and nonterminals. 91 .PP 92 Rules define tree patterns in a fully parenthesized prefix 93 form. Every nonterminal denotes a tree. 94 Each operator has a fixed 95 arity, which is inferred from the rules in which it is used. 96 A chain rule is a rule whose pattern is another nonterminal. 97 If no start symbol is declared, the nonterminal defined by the first rule is used. 98 .PP 99 Each rule ends with an expression that computes the cost of matching 100 that rule; omitted costs 101 default to zero. Costs of chain rules must be constants. 102 .PP 103 The configuration section configures the output 104 for the trees being parsed and the client's environment. 105 As shown, this section must define 106 \f(CWNODEPTR_TYPE\fP 107 to be a visible typedef symbol for a pointer to a 108 node in the subject tree. 109 The labeller invokes 110 \f(CWOP_LABEL(p)\fP, 111 \f(CWLEFT\_CHILD(p)\fP, and 112 \f(CWRIGHT\_CHILD(p)\fP 113 to read the operator and children from the node pointed to by \f(CWp\fP. 114 If the configuration section defines these operations as macros, they are implemented in-line; 115 otherwise, they must be implemented as functions. 116 .PP 117 The matcher 118 computes and stores a single integral state in each node of the subject tree. 119 The configuration section must define a macro 120 \f(CWSTATE_LABEL(p)\fP 121 to access the state field of the node pointed to 122 by \f(CWp\fP. It must be large enough to hold a pointer, and 123 a macro is required because it is used as an lvalue. 124 .PP 125 .SH OPTIONS 126 .TP 127 .BI \-p \ prefix 128 .br 129 .ns 130 .TP 131 .BI \-p prefix 132 Use 133 .I prefix 134 as the disambiquating prefix for visible names and fields. 135 The default is `\f(CW_\fP'. 136 .TP 137 .B \-T 138 Arrange for 139 .sp 140 .nf 141 .ft CW 142 void _trace(NODEPTR_TYPE p, int eruleno, 143 int cost, int bestcost); 144 .sp 145 .fi 146 .ft R 147 to be called at each successful match. 148 \f(CWp\fP 149 identifies the node and 150 \f(CWeruleno\fP 151 identifies the matching rule; the rules are numbered 152 beginning at 1 in the order they appear in the input. 153 \f(CWcost\fP 154 is the cost of the match and 155 \f(CWbestcost\fP 156 is the cost of the best previous match. The current match 157 wins only if 158 \f(CWcost\fP 159 is less than \f(CWbestcost\fP. 160 32767 represents the infinite cost of no previous match. 161 \f(CW_trace\fP must be declared in the configuration section. 162 .SH "SEE ALSO" 163 .IR lcc (1) 164 .PP 165 C. W. Fraser and D. R. Hanson, 166 .IR A Retargetable C Compiler: Design and Implementation , 167 Benjamin/Cummings, Redwood City, CA, 1995, 168 ISBN 0-8053-1670-1. Chapter 14. 169 .PP 170 C. W. Fraser, D. R. Hanson and T. A. Proebsting, 171 `Engineering a simple, efficient code generator generator,' 172 .I 173 ACM Letters on Programming Languages and Systems 174 .BR 1 , 175 3 (Sep. 1992), 213-226. 176 .br 177 .SH BUGS 178 Mail bug reports along with the shortest input 179 that exposes them to drh@cs.princeton.edu.