Programming Utilities Guide

Generating a Lexical Analyzer Program

lex generates a C-language scanner from a source specification that you write. This specification contains a list of rules indicating sequences of characters (expressions) to be searched for in an input text, and the actions to take when an expression is found. To learn how to write a lex specification, see the section "Writing lex Source".

The C source code for the lexical analyzer is generated when you enter

$ lex lex.l

where lex.l is the file containing your lex specification. (The name lex.l is the conventional choice, but you can use any name you want. Keep in mind, though, that the .l suffix is a convention recognized by other system tools, make in particular.) The source code is written to an output file called lex.yy.c by default. That file contains the definition of a function called yylex(). yylex() returns to its caller whenever the action for a matched expression executes a return statement, typically with a nonzero value, and it returns 0 when end of file is encountered. Each call to yylex() scans one token (assuming the action returns); when yylex() is called again, it picks up where it left off.
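The calling convention described above can be illustrated without lex at all. The following is a minimal hand-written sketch, not lex output: the names word_scanner and next_word() are hypothetical, and the position that a generated yylex() keeps internally is held here in an explicit structure. Each call delivers one token (a run of letters) and resumes where the previous call stopped; a return of 0 signals the end of the input.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for the scanner's saved position; a
 * lex-generated yylex() keeps equivalent state internally. */
struct word_scanner {
    const char *pos;    /* next unread character of the input text */
};

/* Returns 1 and copies the next run of letters into buf when one is
 * found; returns 0 at the end of the text, as yylex() does at EOF. */
int next_word(struct word_scanner *s, char *buf, size_t bufsize)
{
    size_t n = 0;

    assert(s != NULL && buf != NULL);

    /* Skip characters that no "rule" matches (anything but letters). */
    while (*s->pos != '\0' && !isalpha((unsigned char)*s->pos))
        s->pos++;

    if (*s->pos == '\0')
        return 0;               /* end of input */

    /* Collect the word, truncating it if buf is too small. */
    while (isalpha((unsigned char)*s->pos)) {
        if (n + 1 < bufsize)
            buf[n++] = *s->pos;
        s->pos++;
    }
    if (bufsize > 0)
        buf[n] = '\0';
    return 1;                   /* one token delivered */
}
```

A caller would initialize a word_scanner with a pointer to the text and call next_word() in a loop until it returns 0, in the same way a driver calls yylex().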

Note that running lex on a specification that is spread across several files, as in the following example, produces one lex.yy.c:

$ lex lex1.l lex2.l lex3.l

Invoking lex with the -t option causes it to write its output to stdout rather than lex.yy.c, so that it can be redirected:

$ lex -t lex.l > lex.c

Options to lex must appear between the command name and the filename argument.

The lexical analyzer code stored in lex.yy.c (or the .c file to which it was redirected) must be compiled to generate the executable object program, or scanner, that performs the lexical analysis of an input text.

The lex library supplies a default main() that calls the function yylex(), so you need not supply your own main(). To link with the library, pass the -ll option to cc:

$ cc lex.yy.c -ll

Alternatively, you might want to write your own driver. The following is similar to the library version:

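extern int yylex(void);

/* Return 1 to tell yylex() that no further input follows. */
int yywrap(void)
{
    return 1;
}

int main(void)
{
    /* Call yylex() until it returns 0 at end of input. */
    while (yylex())
        ;
    return 0;
}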

For more information about the function yywrap(), see the "Writing lex Source" section. When your driver file is compiled with lex.yy.c, as in the following example, its main() calls yylex() at run time exactly as if the lex library had been loaded:

$ cc lex.yy.c driver.c
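One reason to write your own driver is to do something with the values yylex() returns. The sketch below is illustrative only: count_tokens() is a hypothetical helper that runs the same loop as the driver and counts the nonzero returns, and fake_yylex() is a stand-in that reports three tokens so the loop can be tried without a generated scanner. Neither name comes from lex or its library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for yylex(): reports three tokens, then 0.
 * In a real build this function would come from lex.yy.c. */
static int tokens_left = 3;

int fake_yylex(void)
{
    if (tokens_left > 0) {
        tokens_left--;
        return 1;       /* a token was recognized */
    }
    return 0;           /* end of input */
}

/* Hypothetical helper: call a yylex()-style function until it
 * returns 0, and report how many nonzero returns were seen. */
long count_tokens(int (*scan)(void))
{
    long count = 0;

    assert(scan != NULL);
    while (scan() != 0)
        count++;
    return count;
}
```

In a real driver, main() would pass yylex itself, as in count_tokens(yylex), and print or otherwise use the result.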

The resulting executable file reads stdin and writes its output to stdout. Figure 2-1 shows how lex works.

Figure 2-1 Creation and Use of a Lexical Analyzer with lex
