Programming Utilities Guide

Actions

After the scanner recognizes a string matching the regular expression at the start of a rule, it looks to the right of the rule for the action to be performed. You supply the actions.

Kinds of actions include recording the token type found and its value, if any; replacing one token with another; and counting the number of instances of a token or token type. You write these actions as program fragments in C.

An action can consist of as many statements as are needed. You might want to change the text in some way or print a message noting that the text has been found. So, to recognize the expression Amelia Earhart and to note such recognition, apply the rule:

"Amelia Earhart" printf("found Amelia"); 

To replace lengthy medical terms in a text with their equivalent acronyms, a rule such as the following would work:

Electroencephalogram printf("EEG"); 
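
As a rough illustration, the replacement rule above can be placed in a complete specification. This is a minimal sketch: the yywrap() and main() routines are added here only so the example builds on its own, and the file name acronym.l mentioned afterward is hypothetical.

```lex
%{
/* Definitions section: C code copied into the generated scanner. */
#include <stdio.h>
%}
%%
Electroencephalogram    printf("EEG");
%%
/* Returning 1 tells the scanner there is no more input after end of file. */
int yywrap(void) { return 1; }

int main(void)
{
    yylex();    /* text that matches no rule is copied to the output */
    return 0;
}
```

Assuming the sketch is saved as acronym.l, running lex on it and compiling the resulting lex.yy.c should give a filter that turns "An Electroencephalogram was ordered." into "An EEG was ordered."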

To count the lines in a text, you recognize the ends of lines and increment a line counter.

lex uses the standard C escape sequences, including \n for newline. So, to count lines, you might use the following rule, where lineno, like other C variables, is declared in the "Definitions" section.

\n	 lineno++; 
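
The following sketch shows one way the line-counting rule might sit in a full specification, with lineno declared in the definitions section. The second rule, which discards every other character, and the main() routine that prints the total are assumptions added for illustration.

```lex
%{
#include <stdio.h>
int lineno = 0;     /* line counter, declared in the definitions section */
%}
%%
\n      lineno++;
.       ;
%%
int yywrap(void) { return 1; }

int main(void)
{
    yylex();                    /* scan standard input to end of file */
    printf("%d\n", lineno);     /* report the number of newlines seen */
    return 0;
}
```

Because the . pattern matches any character except newline and its action is empty, nothing from the input is echoed; only the final count appears.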

Input is ignored when the C language null statement, a semicolon (;), is specified as the action. So the following rule causes blanks, tabs, and newlines to be ignored:

[ \t\n] ; 

The alternation operator | can also be used to indicate that the action for a rule is the action for the next rule. The previous example could have been written as follows, with the same result:

" "	|
\t	|
\n	;
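
A minimal sketch of a whole specification that uses | to share one action among several rules is shown here. The yywrap() and main() routines are assumptions added so the example is self-contained.

```lex
%{
#include <stdio.h>
%}
%%
" "     |
\t      |
\n      ;
%%
int yywrap(void) { return 1; }

int main(void)
{
    yylex();    /* blanks, tabs, and newlines are dropped; the rest is copied */
    return 0;
}
```

Each of the first two rules takes the action of the rule that follows it, so all three share the null statement. Given the input "a b", a tab, "c", and a newline, the scanner should write "abc".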

The scanner stores text that matches an expression in a character array called yytext[]. You can print or manipulate the contents of this array as you like. In fact, lex provides a macro called ECHO that is equivalent to printf ("%s", yytext).
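
As an illustration of working with yytext, the sketch below capitalizes the first letter of each lowercase word and then writes the word with ECHO. The rule itself, the use of toupper() from <ctype.h>, and the supporting routines are assumptions made for this example.

```lex
%{
#include <ctype.h>
#include <stdio.h>
%}
%%
[a-z]+  { yytext[0] = toupper((unsigned char) yytext[0]); ECHO; }
%%
int yywrap(void) { return 1; }

int main(void)
{
    yylex();    /* characters outside [a-z] match no rule and are copied as is */
    return 0;
}
```

With the input "amelia earhart", the action changes the matched text in place before printing it, so the output should be "Amelia Earhart".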

When your action consists of a long C statement, or two or more C statements, you might write it on several lines. To inform lex that the action is for one rule only, enclose the C code in braces.

For example, to count the total number of all digit strings in an input text, print the running total of the number of digit strings, and print out each one as soon as it is found, your lex code might be:

\+?[0-9]+	{ digstrngcount++;
		  printf("%d", digstrngcount);
		  printf("%s", yytext); }

This specification matches digit strings whether or not they are preceded by a plus sign, because the ? indicates that the preceding plus sign is optional. It also catches the digits of a negative digit string, because the portion following the minus sign matches the specification; the minus sign itself is not part of the matched text.
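
The behavior described above can be tried with a sketch such as the following. It uses the class [0-9] so that strings containing zeros are matched whole, and it prints the count and the text inside brackets to make the matches easier to see; that output format and the supporting routines are assumptions for illustration.

```lex
%{
#include <stdio.h>
int digstrngcount = 0;      /* running total of digit strings found */
%}
%%
\+?[0-9]+   {
                digstrngcount++;
                printf("[%d: %s]", digstrngcount, yytext);
            }
%%
int yywrap(void) { return 1; }

int main(void)
{
    yylex();
    return 0;
}
```

For the input "+12 and -7", the output should be "[1: +12] and -[2: 7]": the plus sign is part of the first match, while the minus sign matches no rule and is simply copied through before the digit 7 is matched.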