man pages section 1: User Commands

Updated: July 2014

STATES                                                  STATES(1)

     states - awk-alike text processing tool

     states  [-hvV]  [-D  var=val]  [-f file] [-o outputfile] [-p
     path] [-s startstate] [-W level] [filename ...]

     States is an awk-alike text processing tool with some state
     machine extensions.  It is designed for program source code
     highlighting and for similar tasks where state information
     helps input processing.

     At any point in time, states is in exactly one state.  Each
     state is quite similar to awk's work environment: it has
     regular expressions which are matched against the input and
     actions which are executed when a match is found.  From an
     action block, states can perform a state transition: it can
     move to another state from which the processing is
     continued.  State transitions are recorded so states can
     return to the calling state once the current state has
     finished.

     The biggest difference between states and awk, besides state
     machine extensions, is that states is not line-oriented.  It
     matches regular expression tokens from the input and once  a
     match is processed, it continues processing from the current
     position, not from the beginning of the next input line.
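
     The scanning model described above can be illustrated with a
     short Python sketch.  It is not part of states and does not use
     states syntax.  The rule table RULES, the labels and the
     function scan() are hypothetical, and the sketch assumes that
     the rule whose match starts earliest wins, which this page does
     not specify.

```python
import re

# Hypothetical rule table: (compiled pattern, label). Not states syntax.
RULES = [
    (re.compile(r"/\*.*?\*/", re.DOTALL), "comment"),
    (re.compile(r"[A-Za-z_]\w*"), "word"),
]

def scan(text):
    """Return (label, token) pairs; unmatched input is passed through as 'text'."""
    out = []
    pos = 0
    while pos < len(text):
        # Pick the rule whose match starts earliest at or after pos
        # (an assumption made for this sketch).
        best = None
        for pattern, label in RULES:
            m = pattern.search(text, pos)
            if m and m.end() > m.start():
                if best is None or m.start() < best[0].start():
                    best = (m, label)
        if best is None:
            out.append(("text", text[pos:]))
            break
        m, label = best
        if m.start() > pos:
            out.append(("text", text[pos:m.start()]))
        out.append((label, m.group(0)))
        # Continue from the end of the match, not from the next line.
        pos = m.end()
    return out
```

     Because the position only advances to the end of each match, a
     token such as a comment may span several input lines.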

     -D var=val, --define=var=val
             Define variable var to have string value val.
             Command line definitions override variable
             definitions found in the config file.

     -f file, --file=file
             Read  state  definitions  from  file  file.   As   a
             default, states tries to read state definitions from
             file in the current working directory.

     -h, --help
             Print short help message and exit.

     -o file, --output=file
             Save output to file file instead of printing it to
             standard output.

     -p path, --path=path
             Set the load path to path.  The load path defaults
             to the directory from which the state definitions
             file is loaded.

     -s state, --state=state
             Start execution from state state.  This definition
             overrides the start state resolved from the start
             block.

     -v, --verbose
             Increase the program verbosity.

     -V, --version
             Print states version and exit.

     -W level, --warning=level
             Set the warning level to level.  Possible values for
             level are:

             light   light warnings (default)

             all     all warnings

     States program files can contain one start block, startrules
     and namerules blocks to specify the initial state, and state
     definitions and expressions.

     The start block is the main() of the states program.  It is
     executed on script startup for each input file and it can
     perform any initialization the script needs.  It normally
     also calls the check_startrules() and check_namerules()
     primitives which resolve the initial state from the input
     file name or from the data found at the beginning of the
     input file.  Here is a sample start block which initializes
     two variables and does the standard start state resolving:

            start {
              a = 1;
              msg = "Hello, world!";
              check_startrules ();
              check_namerules ();
            }

     Once  the  start block is processed, the input processing is
     continued from the initial state.

     The initial state is resolved using the information found in
     the startrules and namerules blocks.  Both blocks contain
     regular expression - symbol pairs.  When the regular
     expression matches the name or the beginning of the input
     file, the initial state is named by the corresponding
     symbol.  For example, the following start and name rules can
     distinguish C and Fortran files:

            namerules {
              /\.(c|h)$/    c;
              /\.[fF]$/     fortran;
            }

            startrules {
              /-\*- [cC] -\*-/      c;
              /-\*- fortran -\*-/   fortran;
            }

     If these rules are used with the previously shown start
     block, states first checks the beginning of the input file.
     If it has the string -*- c -*-, the file is assumed to
     contain C code and the processing is started from the state
     called c.  If the beginning of the input file has the string
     -*- fortran -*-, the initial state is fortran.  If none of
     the start rules matched, the name of the input file is
     matched against the namerules.  If the name ends with the
     suffix .c or .h, we go to state c.  If the suffix is .f or
     .F, the initial state is fortran.

     If  both  start  and  name rules failed to resolve the start
     state, states just copies its input to output unmodified.

     The start state can also be specified from the command  line
     with option -s, --state.
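
     The resolution order described above can be summarized with a
     small Python model.  It is only a sketch: the tables
     STARTRULES and NAMERULES, the function resolve_start_state()
     and its override parameter (standing in for option -s) are
     hypothetical names, and the patterns follow the prose
     description of the example rules.

```python
import re

# Hypothetical tables modelling the example start and name rules.
STARTRULES = [
    (re.compile(r"-\*- [cC] -\*-"), "c"),
    (re.compile(r"-\*- fortran -\*-"), "fortran"),
]
NAMERULES = [
    (re.compile(r"\.(c|h)$"), "c"),
    (re.compile(r"\.[fF]$"), "fortran"),
]

def resolve_start_state(filename, head, override=None):
    """Return the start state name, or None if no rule matches."""
    # A state given on the command line (option -s) takes precedence.
    if override is not None:
        return override
    # Start rules are matched against the beginning of the input.
    for pattern, state in STARTRULES:
        if pattern.search(head):
            return state
    # Name rules are matched against the input file name.
    for pattern, state in NAMERULES:
        if pattern.search(filename):
            return state
    # No state resolved: the input would be copied unmodified.
    return None
```

     In this model a file named foo.f whose first line contains
     -*- c -*- resolves to state c, because the start rules are
     consulted before the name rules.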

     State definitions have the following syntax:

     state { expr {statements} ... }

     where expr is a regular expression, a special expression or
     a symbol, and statements is a list of statements.  When the
     expression expr is matched from the input, the statement
     block is executed.  The statement block can call states'
     primitives, user-defined subroutines, other states, etc.
     Once the block is executed, the input processing is
     continued from the current input position (which might have
     been changed if the statement block called other states).

     Special expressions BEGIN and END can be used in the place
     of expr.  Expression BEGIN matches the beginning of the
     state; its block is executed when the state is entered.
     Expression END matches the end of the state; its block is
     executed when states leaves the state.
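
     The bookkeeping behind state calls and the BEGIN and END
     blocks can be pictured with a toy Python model.  The class
     StateStack and its methods are hypothetical and say nothing
     about how states is implemented.  The sketch also assumes that
     BEGIN is run for the start state as well.

```python
class StateStack:
    """Toy model of recorded state transitions (not the states implementation)."""

    def __init__(self, start):
        self.stack = [start]                  # the current state is the last element
        self.events = [("BEGIN", start)]      # assumed: BEGIN also runs for the start state

    def call(self, state):
        # Entering a state runs its BEGIN block.
        self.stack.append(state)
        self.events.append(("BEGIN", state))

    def ret(self):
        # Leaving a state runs its END block and resumes the caller.
        state = self.stack.pop()
        self.events.append(("END", state))
        return self.stack[-1] if self.stack else None
```

     Calling c_comment from state c and then returning leaves the
     model back in state c, with the BEGIN and END events of
     c_comment recorded in order.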

     If expr is a symbol, its value is looked up from the global
     environment.  If the value is a regular expression, it is
     matched to the input; otherwise the rule is ignored.

     The states program file can also have top-level expressions.
     They are evaluated after the program file is parsed but
     before any input files are processed or the start block is
     evaluated.
     call (symbol)
             Move to state symbol and continue input file
             processing from that state.  The function returns
             whatever the symbol state's terminating return
             statement returned.

     calln (name)
             Like call but the argument name is evaluated and its
             value must be a string.  For example, this function
             can be used to call a state whose name is stored in
             a variable.

     check_namerules ()
             Try to resolve the start state from the namerules
             rules.  The function returns 1 if the start state
             was resolved or 0 otherwise.

     check_startrules ()
             Try to resolve the start state from the startrules
             rules.  The function returns 1 if the start state
             was resolved or 0 otherwise.

     concat (str, ...)
             Concatenate the argument strings and return the
             result as a new string.

     float (any)
             Convert argument to a floating point number.

     getenv (str)
             Get the value of the environment variable str.
             Returns an empty string if the variable str is
             undefined.

     int (any)
             Convert argument to an integer number.

     length (item, ...)
             Count the length of argument strings or lists.

     list (any, ...)
             Create a new list which contains items any, ...

     panic (any, ...)
             Report  a non-recoverable error and exit with status
             1.  Function never returns.

     print (any, ...)
             Convert arguments to strings and print them to the
             output.

     range (source, start, end)
             Return  a sub-range of source starting from position
             start (inclusively) to end (exclusively).   Argument
             source can be string or list.

     regexp (string)
             Convert string string to a new regular expression.

     regexp_syntax (char, syntax)
             Modify  regular  expression  character  syntaxes  by
             assigning new  syntax  syntax  for  character  char.
             Possible values for syntax are:

             'w'     character is a word constituent

             ' '     character isn't a word constituent

     regmatch (string, regexp)
             Check if string string matches regular expression
             regexp.  The function returns a boolean success
             status and sets the sub-expression registers $n.

     regsub (string, regexp, subst)
             Search for regular expression regexp in string
             string and replace the matching substring with
             string subst.  Returns the resulting string.  The
             substitution string subst can contain $n references
             to the n:th parenthesized sub-expression.

     regsuball (string, regexp, subst)
             Like regsub but replace all matches of regular
             expression regexp in string string with string
             subst.

     require_state (symbol)
             Check  that  the  state  symbol  is defined.  If the
             required state is undefined, the function  tries  to
             autoload it.  If the loading fails, the program will
             terminate with an error message.

     split (regexp, string)
             Split string string into a list, treating matches of
             regular expression regexp as item separators.

     sprintf (fmt, ...)
             Format  arguments according to fmt and return result
             as a string.

     strcmp (str1, str2)
             Perform a case-sensitive comparison of strings str1
             and str2.  The function returns a value that is:

             -1      string str1 is less than str2

             0       strings are equal

             1       string str1 is greater than str2

     string (any)
             Convert argument to string.

     strncmp (str1, str2, num)
             Perform a case-sensitive comparison of strings str1
             and str2, comparing at most num characters.

     substring (str, start, end)
             Return a substring of string str starting from posi-
             tion start (inclusively) to end (exclusively).
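
     The semantics of a few of the primitives above can be
     approximated in Python.  The functions below are hypothetical
     stand-ins written for illustration, not states code.  They
     assume zero-based positions (this page does not say where
     positions start) and use Python's regular expression syntax,
     which may differ from the syntax states accepts.

```python
import re

def st_range(source, start, end):
    # start is inclusive, end is exclusive; works for strings and lists.
    return source[start:end]

def st_strcmp(str1, str2):
    # Returns -1, 0 or 1, as in the table for strcmp.
    return (str1 > str2) - (str1 < str2)

def st_split(regexp, string):
    # Note the argument order: the separator pattern comes first.
    return re.split(regexp, string)

def _expand(subst, match):
    # Replace each $n in subst with the n:th sub-expression of match.
    return re.sub(r"\$(\d+)",
                  lambda ref: match.group(int(ref.group(1))) or "",
                  subst)

def st_regsub(string, regexp, subst):
    # Replace only the first match.
    return re.sub(regexp, lambda m: _expand(subst, m), string, count=1)

def st_regsuball(string, regexp, subst):
    # Replace every match.
    return re.sub(regexp, lambda m: _expand(subst, m), string)
```

     With these stand-ins, st_regsub changes only the first file
     name in "foo.c bar.c" when the pattern (\w+)\.c is replaced
     with $1.o, while st_regsuball changes both.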

     $.      current input line number

     $n      the   n:th  parenthesized  regular  expression  sub-
             expression from the latest state regular  expression
             or from the regmatch primitive

     $`      everything before the matched regular expression.
             This is usable when used with the regmatch
             primitive; the contents of this variable are
             undefined when used in action blocks to refer to
             the data before the block's regular expression.

     $B      an alias for $`

     argv    list of input file names

             name of the current input file

     program name of the program (usually states)

     version program version string

     /usr/share/enscript/hl/*.st             enscript's states definitions

     See attributes(5) for descriptions of the following
     attributes:

     |Availability   | print/filter/enscript |
     |Stability      | Uncommitted           |
     awk(1), enscript(1)

     Markku Rossi <> <>

     GNU Enscript  WWW  home  page:  <

     Further  information about this software can be found on the
     open source community  website  at

STATES              Last change: Oct 23, 1998                   7