CHAPTER 3 - Elements of FCode Programming

FCode is a computer programming language defined by IEEE Standard 1275-1994 Standard for Boot Firmware. FCode is semantically similar to ANS Forth, but is encoded as a sequence of binary byte codes representing a defined set of Forth definitions.

FCode has these characteristics:

The source format is machine and system independent.

The binary format (FCode) is machine, system, and position independent.

The binary format is compact.

The binary format can be interpreted easily and efficiently.

Programs are easy to develop and debug.

The source format can easily be translated to binary format.

The binary format can be translated back to source format.

Forth commands are called words, and are roughly analogous to procedures in other languages. Languages such as C distinguish among operators, syntactic characters, and procedures; in Forth, every word is a procedure.

A Forth word is named by a sequence of between one and 31 printable characters. A Forth program is written as a sequence of Forth word names separated by one or more white space characters (such as spaces, tabs, or line terminators). Forth uses a left-to-right reverse Polish notation, like some scientific calculators. The basic structure of Forth is: do this, now do that, now do something else, and so on.
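
As a small illustration of the left-to-right, reverse Polish evaluation described above, the following sketch uses only standard arithmetic words (it is plain Forth, not device-specific FCode). Operands are written first and the operator last; the word . prints and removes the top stack item.

```forth
\ Infix (3 + 4) * 2 becomes: push 3, push 4, add, push 2, multiply.
3 4 + 2 *  .        \ prints 14

\ Infix 3 + (4 * 2): the product is computed first, then the sum.
3 4 2 * +  .        \ prints 11
```

No parentheses or precedence rules are needed, because the order of the words alone determines the order of evaluation.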

New Forth words are defined as sequences of previously existing words. Subsequently, new words may be used to create still more words.

FCode is a byte-coded translation of a Forth program. Translating Forth source code to FCode involves replacing the Forth word names (stored as text strings) with their equivalent FCode numbers. The tokenized FCode takes up less space in PROM than the text form of the Forth program from which it was derived, and can be interpreted more easily and rapidly than the text form.

For purposes of this manual, the term FCode indicates both binary-coded FCode and the Forth programs written as ASCII text files for later conversion to binary-coded FCode.

Except where a distinction between the two forms is explicitly stated, the use of FCode in this manual can be assumed to apply equally to both FCode and Forth.

This chapter includes the following sections:

Colon Definitions

Stack Operations

Programming Style

Coding Style

A Minimal FCode Program

FCode Classes

Primitive FCode Functions

System FCode Functions

Interface FCode Functions

Local FCode Functions

Colon Definitions

Three concepts are critical to understanding FCode (or Forth):

A colon definition creates a new word with the same behavior as a sequence of existing words. A colon definition begins with a colon and ends with a semicolon.

Once a new word has been created, it is immediately available, either for direct execution or for use in future colon definitions.

Most parameter passing is done through a pushdown, last-in, first-out stack.
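
The three concepts above can be seen together in a short sketch. The word names square and cube are hypothetical, chosen only for illustration, and the code is standard Forth rather than anything device-specific.

```forth
: square  ( n -- n*n )    dup * ;          \ new word built from dup and *
5 square .                                 \ direct execution: prints 25

: cube    ( n -- n*n*n )  dup square * ;   \ square used in a later definition
3 cube .                                   \ prints 27
```

Both words take their argument from the stack and leave their result there, so no parameter list is declared.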

Normally, the action associated with an FCode function is performed when the FCode function is encountered. This is called interpret state. However, the state may switch between interpret state and compile state.

In interpret state, FCode functions are executed as they are encountered. Interpret state operates until encountering a ":". The word ":" does the following:

Allocates an FCode number and associates it with the name immediately following the colon

Switches to compile state

In compile state, FCodes are saved for later execution, rather than being executed immediately. The sequence thus compiled is installed in the action table as a new word, and can be used later in the same way as if it were a built-in word.

Compile state continues until a ";" is read. The word ";" does the following:

Compiles an end-of-procedure FCode word

Switches to interpret state

After compilation, the newly-assigned FCode word can be either interpreted or compiled as part of yet another new word.
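
The difference between the two states can be observed with a word that has a visible side effect. This is a minimal sketch in standard Forth; the variable hits and the words bump and bump-twice are hypothetical names used only here.

```forth
variable hits                       \ hypothetical counter for this sketch
0 hits !
: bump  ( -- )  1 hits +! ;         \ compiled now, not executed

bump                                \ interpret state: executes immediately
hits @ .                            \ prints 1

: bump-twice  ( -- )  bump bump ;   \ compile state: bump is recorded, not run
hits @ .                            \ still prints 1

bump-twice                          \ the compiled sequence runs now
hits @ .                            \ prints 3
```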

If you define a new word having the same spelling as an existing word, the new definition supersedes the older one(s), but only for subsequent usages of that word.
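
A brief sketch of this rule, in standard Forth with hypothetical word names: a definition compiled before the redefinition keeps its original behavior. Some systems print a warning when a name is redefined.

```forth
: unit       ( -- n )  1 ;
: two-units  ( -- n )  unit unit + ;   \ bound to the first unit
: unit       ( -- n )  10 ;            \ supersedes unit for later uses only

two-units .     \ prints 2: the earlier definition still uses the old unit
unit .          \ prints 10
```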

Here's an example of a colon definition for a new FCode word, dac!:

: dac!  ( data offset -- )  dac + rl! ;

Stack Operations

Each FCode word is specified by its effect on the stack and any side effects, such as accessing memory. Many FCode words affect the stack by removing arguments from it, performing some operation, and putting the result(s) back on the stack.

A stack comment, included in the colon definition, describes the effect on the stack of the execution of an FCode word.

In the previous example, the stack comment, beginning with "( " and ending with ")", shows that dac! takes two parameters from the stack, and doesn't replace them with anything when it's done.

Stack comments can be put anywhere in a colon definition. They should be included wherever their use will enhance the clarity of the definition.

The rightmost argument is on top of the stack, with any preceding arguments "under it." In other words, arguments are pushed onto the stack in left to right order, leaving the most recent argument (the rightmost one in the stack diagram) on the top.
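
Argument order matters for any operation that is not commutative. In the following sketch (standard Forth; the word diff is a hypothetical name), the second number typed is on top of the stack and is subtracted from the one under it.

```forth
: diff  ( n1 n2 -- difference )  - ;   \ n2 is on top; result is n1 minus n2

10 3 diff .     \ prints 7
3 10 diff .     \ prints -7
```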

In a stack diagram, parameters shown to the left of the double dashes are expected to be on the stack prior to the execution of the word. Parameters shown to the right of the double dashes are those which are left on the stack after execution of the word. Stack comments use the same convention but detail changes to the stack during execution of the word.

Stack comments and stack diagrams are essentially the same thing. Stack diagrams show the net effect to the stack of any Forth word. Stack comments are embedded in the definition of a word and are used to convey intermediate stack results or changes.
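
For instance, a definition might carry a stack diagram on its first line and stack comments after each step. This is only a sketch in standard Forth, and average is a hypothetical word.

```forth
: average  ( n1 n2 -- avg )   \ stack diagram: net effect of the word
   +        ( sum )           \ stack comment: intermediate result
   2 /      ( avg )
;

10 20 average .    \ prints 15
```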

A series of words that describe the behavior of dac! follow the stack comment in the preceding example. Executing dac! is the same as executing the list of words in its colon definition.

Note that FCode words are separated by spaces, tabs, or newlines, so in the previous example, "( data" is not the same as "(data". Any visible character is part of a word and not a separator.

Programming Style

Some people have described Forth as a write-only language. While it sometimes ends up that way because of poorly written or uncommented code, it is possible to write Forth (and FCode) programs that can be easily read and understood. Well-written programs follow the conventions described below. See Appendix C for detailed information about the style used in the existing OpenBoot FCode source base.

Although case is not significant, by convention FCode is written in lowercase.

Commenting Code

Comment code extravagantly, then consider adding more comments. The comments will help with maintenance of the code, and don't add to the final size of the resulting FCode PROM.

Adopt the useful convention of using "( )" for stack comments and "\" for other descriptive text and comments.

In comments, describe the purpose of the Forth words, any interface assumptions and requirements, and unusual aspects of the algorithm you used. Try to avoid simply translating low-level details of the code into English. Comments like, "increment the variable" are rarely helpful.
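
A short sketch of these conventions follows, using a hypothetical word named percent in standard Forth. The \ comments state the purpose and an assumption; the ( ) comments track the stack.

```forth
\ percent  returns n percent of total, truncated to an integer.
\ The multiplication is done first so that the division does not
\ discard precision early (assumes the product fits in one cell).
: percent  ( total n -- part )
   *          ( total*n )
   100 /      ( part )
;

200 15 percent .   \ prints 30
```

A comment such as "multiply, then divide by 100" would add nothing here, because the code already says that.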

Coding Style

By studying the examples in this book, you can note the indentation and phrasing style that is widely used in OpenBoot source code. Adoption of this style will allow your Forth code to be read more easily by the many programmers who are accustomed to the style.

Definition Length

Keep word definitions short. If your definition exceeds half a page, it should be rewritten as two or more smaller definitions. This will help to make each definition more readable. Readable code is easier to maintain.

A good size for a word definition is one or two lines of code. Keeping definitions short and limited in functionality improves readability, speeds debugging and increases the likelihood that the word will be reusable. Remember, reuse of Forth words is a principal contributor to compact PROM images.
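
As an illustration of short, single-purpose definitions, the sketch below splits one calculation into two reusable one-line words and a third that combines them. The names and the 4096-byte page size are assumptions made for this example only.

```forth
decimal
\ Each definition does one job and fits on one line.
: kb>bytes     ( kb -- bytes )     1024 * ;
: bytes>pages  ( bytes -- pages )  4095 + 4096 / ;   \ rounds up; 4096-byte pages assumed
: kb>pages     ( kb -- pages )     kb>bytes bytes>pages ;

10 kb>pages .  \ prints 3
```

Because kb>bytes and bytes>pages are separate words, each can be tested interactively and reused elsewhere.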

Stack Comments

Always include stack comments in word definitions. It can be useful to compare intended function with what the code really does. The following is an example of a word definition with acceptable style.

\ xyz-map  establishes a virtual-to-physical mapping for each of the

\ useful addressable regions on the board

0 value status-register

: xyz-map  ( -- )

\ Base-address Size create-mapping then save virtual address

   my-address 4 map-low                            ( virtaddr )

   to status-register                              ( )

   my-address 10.0000 + frame-buf-size map-low     ( virtaddr )

   to frame-buffer-adr                             ( )

;

Note the stack diagram following the word xyz-map, and the use of stack comments in the word's definition code.

Stack diagrams are generally written using descriptive parameter names to clarify correct usage. See the table below for stack parameter abbreviations used in this manual.

TABLE 3-1 Stack Parameter Abbreviations
Notation	Description
`|`	Alternate stack results shown with space, for example (`input -- addr len false | result true`).
`|`	Alternate stack items shown without space, for example (`input -- addr len|0 result`).
`???` or `?`	Unknown stack item(s)
`...`	Unknown stack item(s). If used on both sides of a stack comment, means the same stack items are present on both sides.
`< > <space>`	Space delimiter. Leading spaces are ignored.
`a-addr`	Variable-aligned address
`addr`	Memory address (generally a virtual address)
`addr len`	Address and length for memory region
`byte` or `b`xxx	8-bit value (low order byte in a 32-bit word)
`char`	7-bit value (low order byte), high bit unspecified
`cnt` or `len` or `size`	Count or length
`d`xxx	Double (extended-precision) numbers. Two stack items, hi quadlet on top of stack
`<eol>`	End-of-line delimiter
`false`	0 (false flag)
`ihandle`	Pointer for an instance of a package
`n` or `n1` or `n2` or `n3` or `n`xxx	Normal signed values (32-bit)
`nu` or `nu1`	Signed or unsigned values (32-bit)
`<nothing>`	Zero stack items
`phandle`	Pointer for a package
`phys`	Physical address (actual hardware address)
`phys.hi`	Upper cell of physical address
`phys.lo`	Lower cell of physical address
`pstr`	Packed string
`quad` or `q`xxx	Quadlet (32-bit value)
`qaddr`	Quadlet (32-bit) aligned address
`{text}`	Optional text. Causes default behavior if omitted.
`"text<delim>"`	Input buffer text, parsed when command is executed. Text delimiter is enclosed in <>.
`true`	-1 (true flag)
`u`xxx	Unsigned value, positive values (32-bit)
`virt`	Virtual address (address used by software)
`waddr`	Doublet (16-bit) aligned address
`word` or `w`xxx	Doublet (16-bit value, low order two bytes in a 32-bit word)
`x` or `x1`	Arbitrary stack item
`x.lo x.hi`	Low/high significant bits of a data item
`xt`	Execution token
`xxx?`	Flag. Name indicates usage (for example, `done?` `ok?` `error?`).
`xyz-str xyz-len`	Address and length for unpacked string
`xyz-sys`	Control-flow stack items, implementation-dependent
`( C: -- )`	Compilation stack diagram
`( -- )` or `( E: -- )`	Execution stack diagram
`( R: -- )`	Return stack diagram

A Minimal FCode Program

If a peripheral bus card is not needed during the boot process, a minimal FCode program that merely declares the name of the device and the location and size of on-board registers will often suffice.

An example of a minimal program for an SBus device is:

fcode-version1

" SUNW,bison"  encode-string  " name" property

my-address h# 20.0000 + my-space encode-phys

h# 100 encode-int encode+

" reg" property

end0

Note the following about this SBus example:

my-address and my-space each leave a single number on the stack. Together they form the phys.lo phys.hi address representation of an SBus node: my-address supplies phys.lo and my-space supplies phys.hi. (The value of #address-cells is 2 for SBus, which is reflected by this format.)

An offset of 0x200000 is being added to the value returned by my-address.

The size argument of reg is a single number (since #size-cells is 1 for SBus) reflecting the SBus 32-bit address space.

The example program above creates a name property, whose value is "SUNW,bison", that will be used to identify the device. Begin the name property's value with an identification of your company. The preferred form of this identification is an organizationally unique identifier (OUI), a sequence of six uppercase hexadecimal digits assigned by the IEEE Registration Authority Committee. OUIs are guaranteed to be unique worldwide. (For more information about obtaining an OUI, please see the glossary entry for name in IEEE Standard 1275-1994 Standard for Boot Firmware.)

As an alternative to the OUI, you may use a sequence of from one to five uppercase letters representing the stock symbol of your company on any stock exchange whose symbols do not conflict with the symbols of the New York Stock Exchange and the NASDAQ Exchange. All stock exchanges in the United States satisfy this requirement. If a non-U.S. company's stock is traded on U.S. stock exchanges by "depository equivalents," those symbols also satisfy this requirement.

The preceding SBus example program can also be written using the following shorthand form. The FCode program generated will be equivalent to the minimal SBus program given above.

fcode-version1

" SUNW,bison" name

my-address h# 20.0000 + my-space h# 100 reg

end0

FCode Classes

There are four general classes of FCode source words:

TABLE 3-2 FCode Source Word Classes
Class	Description
Primitives	These words generally correspond directly to conventional Forth words and implement functions such as addition, stack manipulation, and control structures.
System	These are extension words implemented in the boot PROMs and implement functions such as memory allocation and device property reporting.
Interface	These are specific to particular types of devices and implement functions such as `draw-character` for a display device.
Local	These are private words, implemented and used only by the device that created the definition.

Each FCode primitive is represented in a peripheral card's PROM as a single byte. Other FCodes are represented in the PROM as two consecutive bytes. The first byte, a value from 1 to 0x0f, may be thought of as an escape code.

One-byte FCode numbers range in value from 0x10 to 0xfe. Two-byte FCode numbers begin with a byte in the range 0x01 to 0x0f and end with a byte in the range 0x00 to 0xff. The single-byte values 0x00 and 0xff signify "end of program" (either value will do; conventionally, 0x00 is used).
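
The byte ranges above can be expressed as simple predicates. The following is a sketch in standard Forth, not part of any tokenizer or PROM; all word names are hypothetical. The last word assumes that the escape byte forms the high-order part of a two-byte FCode number, which matches the ranges given above but is stated here as an assumption.

```forth
hex
: end-byte?     ( byte -- flag )  dup 0=  swap 0ff =  or ;   \ 0x00 or 0xff
: prefix-byte?  ( byte -- flag )  01 10 within ;             \ 0x01..0x0f
: one-byte?     ( byte -- flag )  10 0ff within ;            \ 0x10..0xfe

\ Assumption: the escape byte is the high-order byte of the FCode number.
: wide-fcode#   ( prefix low -- fcode# )  swap 8 lshift or ;

10 one-byte? .        \ prints -1 (true)
0ff end-byte? .       \ prints -1 (true)
02 34 wide-fcode# .   \ prints 234 (hexadecimal)
decimal
```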

Currently-defined FCodes are listed in functional groups, in alphabetic order by name, and in numeric order by FCode value in Appendix A.

Primitive FCode Functions

There are more than 300 primitive FCode functions, most of which exactly parallel ANS Forth words. They are divided into three groups:

FCode words that generate a single FCode byte

tokenizer macros

tokenizer directives

Primitive FCode functions that have an exact parallel with standard ANS Forth words are given the same name as the equivalent ANS Forth word. Chapter 14 contains further descriptions of primitive FCodes.

There are also about 70 tokenizer macros, most of which have direct ANS Forth equivalents. These are convenient source code words translated by the tokenizer into short sequences of FCode primitives.

Tokenizer directives are words that generate no FCodes, but are used to control the interpretation process. Tokenizer directives include the following words:

decimal, hex, and octal

d#, h#, and o#

headers and headerless

\ and (

.(

alias
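
The effect of the number-base directives can be sketched with hex and decimal, which also exist in standard Forth and so can be tried on any Forth system. The constant names are hypothetical. In FCode source, h# 100 or d# 256 would select the base for a single number without changing the current base; those two forms are not used in the block because they are not standard Forth words.

```forth
decimal
256 constant size-a      \ read as decimal 256
hex
100 constant size-b      \ read as hexadecimal 100, which is also 256
decimal

size-a size-b = .        \ prints -1 (true): both constants hold the same value
```

Neither hex nor decimal contributes anything to the compiled result; each only changes how the numbers that follow are read.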

System FCode Functions

System FCode functions are used by all classes of FCode drivers for various system-related functions. System FCode functions can be either service words or configuration words.

Service words are available to the device's FCode driver when needed for functions such as memory mapping or diagnostic routines.

Configuration words are included in the driver to document characteristics of the driver itself. These properties are made available for use by the operating system.

Interface FCode Functions

Interface FCode functions are standard routines used by the workstation's CPU to perform the functions of the peripheral card's device. Different classes of devices will each use only the appropriate set of interface FCodes.

For example, to display a character on the screen, OpenBoot calls the interface FCode draw-character. Before that can happen, the FCode driver for the device controlling that screen must have assigned a device-specific implementation to draw-character. It does this as follows:

: my-draw ( char -- )  \ "local" word to draw a character.

    ...                  \ Definition contents.

;                      \ end of my-draw definition.

: my-install ( -- )    \ local word to install all interfaces.

...

    ['] my-draw to draw-character

...

;

When my-install executes, draw-character is assigned the behavior of my-draw.
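
The same install-time binding can be tried on a generic Forth system with a deferred word. The sketch below is an analogy only, not OpenBoot code: it uses the standard words defer and is in place of the to assignment shown above, and show-char, last-drawn, and the body of my-draw are hypothetical stand-ins for the real interface word and display hardware.

```forth
variable last-drawn                  \ hypothetical stand-in for display hardware
defer show-char  ( char -- )         \ hypothetical stand-in for draw-character

: my-draw     ( char -- )  last-drawn ! ;            \ "local" implementation
: my-install  ( -- )       ['] my-draw is show-char ;

my-install                           \ bind the interface to my-draw
char A show-char                     \ the caller uses only the interface name
last-drawn @ .                       \ prints 65, the character code of A
```

The caller never refers to my-draw by name, which is what allows each device driver to supply its own implementation.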

Local FCode Functions

Local FCode functions are assigned to words defined in the body of an FCode program. There are over 2000 FCode byte values allocated for local FCode functions. The byte values are meaningful only in the context of a particular driver. Different drivers reuse the same set of byte values.