Programming Utilities Guide

yacc Input Syntax

This section describes the yacc input syntax as a yacc specification. Context dependencies and so forth are not considered. Although yacc accepts an LALR(1) grammar, the yacc input specification language is most naturally specified as an LR(2) grammar; the difficulty arises when an identifier is seen in a rule immediately following an action.

If this identifier is followed by a colon, it is the start of the next rule; otherwise, it is a continuation of the current rule, which just happens to have an action embedded in it. As implemented, the lexical analyzer looks ahead after seeing an identifier and determines whether the next token (skipping blanks, newlines, comments, and so on) is a colon. If so, it returns the token C_IDENTIFIER; otherwise, it returns IDENTIFIER. Literals (quoted strings) are also returned as IDENTIFIERs but never as part of C_IDENTIFIERs.
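
The look-ahead described above can be sketched in C. This is a minimal illustration, not the lexical analyzer of any actual yacc implementation: the function names, the token code values, and the set of characters accepted in a name (letters, digits, underscore, and dot) are assumptions made for the example, and quoted literals are not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Token codes; the values are illustrative, not those of a real yacc. */
enum { IDENTIFIER = 257, C_IDENTIFIER = 258 };

/* Return a pointer to the first character at or after s that is not
   a blank, a newline, or part of a C comment. */
static const char *skip_blanks_and_comments(const char *s)
{
    for (;;) {
        while (isspace((unsigned char)*s))
            s++;
        if (s[0] == '/' && s[1] == '*') {
            s += 2;
            while (*s != '\0' && !(s[0] == '*' && s[1] == '/'))
                s++;
            if (*s != '\0')
                s += 2;         /* step over the closing delimiter */
        } else {
            return s;
        }
    }
}

/* Classify the name that starts at *cursor. On return, *cursor points
   just past the name; the look-ahead itself consumes nothing. */
static int classify_identifier(const char **cursor)
{
    const char *s;

    assert(cursor != NULL && *cursor != NULL);
    s = *cursor;
    while (isalnum((unsigned char)*s) || *s == '_' || *s == '.')
        s++;
    *cursor = s;

    /* A colon as the next significant character marks a new rule. */
    return *skip_blanks_and_comments(s) == ':' ? C_IDENTIFIER : IDENTIFIER;
}
```

Given a cursor positioned at the text expr : term, classify_identifier returns C_IDENTIFIER; positioned at term '+' factor, it returns IDENTIFIER. Because the decision is made in the lexical analyzer, the parser itself needs only one token of look-ahead.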

                        /* grammar for the input to yacc */

                        /* basic entries */
%token IDENTIFIER       /* includes identifiers and literals */

%token C_IDENTIFIER     /* identifier (but not literal) */
                        /* followed by a : */

%token NUMBER           /* [0-9]+ */

                        /* reserved words: %type => TYPE, %left => LEFT, etc. */

%token LEFT RIGHT NONASSOC TOKEN PREC TYPE START UNION 

%token MARK             /* the %% mark */

%token LCURL            /* the %{ mark */

%token RCURL            /* the %} mark */

    				/* ASCII character literals stand for themselves */ 

%start spec

%% 

spec    : defs MARK rules tail
        ;
tail    : MARK
        {
            In this action, read in the rest of the file
        }
        |       /* empty: the second MARK is optional */
        ;
defs    :       /* empty */
        | defs def
        ;
def     : START IDENTIFIER
        | UNION
        {
            Copy union definition to output
        }
        | LCURL
        {
            Copy C code to output file
        }
        RCURL
        | rword tag nlist
        ;
rword   : TOKEN
        | LEFT
        | RIGHT
        | NONASSOC
        | TYPE
        ;
tag     :       /* empty: union tag is optional */
        | '<' IDENTIFIER '>'
        ;
nlist   : nmno
        | nlist nmno
        | nlist ',' nmno
        ;
nmno    : IDENTIFIER            /* Note: literal illegal with %type */
        | IDENTIFIER NUMBER     /* Note: illegal with %type */
        ;

                        /* rule section */
rules   : C_IDENTIFIER rbody prec
        | rules rule
        ;
rule    : C_IDENTIFIER rbody prec
        | '|' rbody prec
        ;
rbody   :       /* empty */
        | rbody IDENTIFIER
        | rbody act
        ;
act     : '{'
        {
            Copy action, translate $$, etc.
        }
        '}'
        ;
prec    :       /* empty */
        | PREC IDENTIFIER
        | PREC IDENTIFIER act
        | prec ';'
        ;