Defining Words¶
The Pearl of Forth!
virginia.edu¶
Copied from: http://galileo.phys.virginia.edu/classes/551.jvn.fall01/primer.htm#create
Michael Ham has called the word pair CREATE…DOES> the “pearl of Forth”.
CREATE is a component of the compiler, whose function is to make a new dictionary entry with a given name.
DOES> specifies a run-time action for the “child” words of a defining word.
Note
The point of a create does> word is that when you execute it, it produces a child word, and when that child word executes, it executes some code.
Defining “Defining” Words¶
CREATE finds its most important use in extending the powerful class of Forth words called “defining” words. The colon compiler “:” is such a word, as are VARIABLE and CONSTANT.
The definition of VARIABLE in high-level Forth is simple:
: VARIABLE CREATE 1 CELLS ALLOT ;
We have already seen how VARIABLE is used in a program. An alternative definition, found in some Forths, initializes the variable to 0:
: VARIABLE CREATE 0 , ;
Forth lets us define words initialized to contain specific values: for example, we might want to define the number 17 to be a word. CREATE and “,” (“comma”) can do this:
17 CREATE SEVENTEEN , <cr> ok
Now test it via
SEVENTEEN @ . <cr> 17 ok .
Remarks:¶
The word , (“comma”) stores TOS in the next cell of the dictionary and advances the dictionary pointer by one cell (that is, by the number of bytes in a cell).
A word “C,” (“see-comma”) also exists: it puts a character into the next character-sized slot of the dictionary and advances the pointer by one such slot. (With ASCII characters the slots are 1 byte long; 16-bit Unicode characters require 2 bytes.)
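A small experiment makes the two remarks concrete. The sketch below assumes an ANS-style Forth (HERE, ALIGN) and uses a hypothetical word, SCRATCH, only as a place to lay down data; the sizes printed depend on the system.

```forth
CREATE SCRATCH                 \ hypothetical name: a header to hold test data
HERE  17 ,   HERE SWAP - .     \ bytes consumed by , (one cell, e.g. 4 or 8)
HERE  65 C,  HERE SWAP - .     \ bytes consumed by C, (one char, usually 1)
ALIGN                          \ realign the dictionary pointer after C,
```

The phrase HERE … HERE SWAP - simply subtracts the dictionary pointer before the store from the pointer after it.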
Run-time vs. compile-time actions¶
In the preceding example, we were able to initialize the variable SEVENTEEN to 17 when we CREATEd it, but we still have to fetch it to the stack via SEVENTEEN @ whenever we want it. This is not quite what we had in mind. We would like to find 17 in TOS when SEVENTEEN is named. The word DOES> gives us the tool to do this.
The function of DOES> is to specify a run-time action for the “child” words of a defining word. Consider the defining word CONSTANT, defined in high-level Forth (of course CONSTANT is usually defined in machine code for speed) by
: CONSTANT CREATE , DOES> @ ;
and used as
53 CONSTANT PRIME <cr> ok
Now test it:
PRIME . <cr> 53 ok .
What is happening here?
CREATE (hidden in CONSTANT) makes an entry named PRIME (the first word in the input stream following CONSTANT). Then “,” places the TOS (the number 53) in the next cell of the dictionary.
Then DOES> (inside CONSTANT) appends the actions of all words between it and “;” (the end of the definition) —in this case, “@”— to the child word(s) defined by CONSTANT.
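To watch both phases without redefining the system's own CONSTANT, the same body can be given a different name. This is only a sketch: MY-CONSTANT and PRIME2 are hypothetical names, and >BODY is the standard word that converts an execution token of a CREATEd word into its data-field address.

```forth
: MY-CONSTANT ( n -- )  CREATE ,  DOES> @ ;   \ same body as CONSTANT above
53 MY-CONSTANT PRIME2     \ defining time: header PRIME2 is made, 53 is stored
PRIME2 .                  \ run time: data-field address is pushed, @ fetches 53
' PRIME2 >BODY @ .        \ the same 53, read directly from the data field
```

Both lines that end in . print 53: the first goes through the DOES> action, the second bypasses it and reads the cell that “,” laid down.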
Dimensioned data (intrinsic units)¶
Here is an example of the power of defining words and of the distinction between compile-time and run-time behaviors.
Physical problems generally involve quantities that have dimensions, usually expressed as mass (M), length (L) and time (T) or products of powers of these. Sometimes there is more than one system of units in common use to describe the same phenomena.
For example, U.S. or English police reporting accidents might use inches, feet and yards; while Continental police would use centimeters and meters. Rather than write different versions of an accident analysis program it is simpler to write one program and make unit conversions part of the grammar. This is easy in Forth.
The simplest method is to keep all internal lengths in millimeters, say, and convert as follows:
: INCHES 254 10 */ ;
: FEET [ 254 12 * ] LITERAL 10 */ ;
: YARDS [ 254 36 * ] LITERAL 10 */ ;
: CENTIMETERS 10 * ;
: METERS 1000 * ;
Note
This example is based on integer arithmetic. The word */ means “multiply the third number on the stack by NOS, keeping a double-precision intermediate product, and divide by TOS”. That is, the stack comment for */ is ( a b c -- a*b/c).
The usage would be
10 FEET . <cr> 3048 ok
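The reason for multiplying before dividing shows up as soon as the ratio is not a whole number. A minimal sketch, assuming only the standard words */, * and / with integer cells:

```forth
7 254 10 */ .     \ 177 : 7*254 = 1778 is formed first, then divided by 10
7 254 10 / * .    \ 175 : 254/10 truncates to 25 before the multiply
```

Seven inches is 177.8 mm, so the */ form loses only the final fraction, while dividing first throws away precision before the multiplication happens.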
The word “[” switches from compile mode to interpret mode while compiling. (If the system is interpreting it changes nothing.) The word “]” switches from interpret to compile mode.
Barring some error-checking, the “definition” of the colon compiler “:” is just
: : CREATE ] DOES> doLIST ;
and that of “;” is just
: ; next [ ; IMMEDIATE
Another use for these switches is to perform arithmetic at compile time rather than at run-time, both for program clarity and for easy modification, as we did in the first try at dimensioned data (that is, phrases such as
[ 254 12 * ] LITERAL
and
[ 254 36 * ] LITERAL
which allowed us to incorporate in a clear manner the number of tenths of millimeters in a foot or a yard).
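The difference between the two styles can be seen side by side. In this sketch FT-RUNTIME and FT-COMPILED are hypothetical names; both give the same answer, but only the first redoes the multiplication 254 × 12 on every call.

```forth
: FT-RUNTIME  ( ft -- mm )  254 12 *  10 */ ;              \ 254*12 computed at each call
: FT-COMPILED ( ft -- mm )  [ 254 12 * ] LITERAL  10 */ ;  \ 3048 computed once, while compiling
10 FT-RUNTIME .      \ 3048
10 FT-COMPILED .     \ 3048
```

Inside FT-COMPILED the bracketed phrase runs in interpret mode and leaves 3048 on the stack; LITERAL then compiles that single number into the definition.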
The preceding method of dealing with units required unnecessarily many definitions and generated unnecessary code. A more compact approach uses a defining word, UNITS :
: D, ( hi lo --) SWAP , , ;
: D@ ( adr -- hi lo) DUP @ SWAP CELL+ @ ;
: UNITS CREATE D, DOES> D@ */ ;
Then we could make the table
254 10 UNITS INCHES
254 12 * 10 UNITS FEET
254 36 * 10 UNITS YARDS
10 1 UNITS CENTIMETERS
1000 1 UNITS METERS
\ Usage:
10 FEET . <cr> 3048 ok
3 METERS . <cr> 3000 ok
\ .......................
\ etc.
This is an improvement, but Forth permits a simple extension that allows conversion back to the input units, for use in output:
VARIABLE <AS> 0 <AS> !
: AS TRUE <AS> ! ;
: ~AS FALSE <AS> ! ;
: UNITS CREATE D, DOES> D@ <AS> @
IF SWAP THEN
*/ ~AS ;
\ UNIT DEFINITIONS REMAIN THE SAME.
\ Usage:
10 FEET . <cr> 3048 ok
3048 AS FEET . <cr> 10 ok
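Because everything is done in integers, conversions back to the input units truncate. The sketch below repeats the definitions from the text so that it stands alone, and assumes a system that provides TRUE and FALSE; the three test values are chosen only for illustration.

```forth
: D,  ( hi lo -- )      SWAP , , ;
: D@  ( adr -- hi lo )  DUP @ SWAP CELL+ @ ;
VARIABLE <AS>   0 <AS> !
: AS    TRUE  <AS> ! ;
: ~AS   FALSE <AS> ! ;
: UNITS  CREATE D,  DOES> D@  <AS> @ IF SWAP THEN  */  ~AS ;
254 10 UNITS INCHES

1 INCHES .         \ 25 : 25.4 mm truncated to a whole millimeter
254 AS INCHES .    \ 10 : 254*10/254
25 AS INCHES .     \ 0  : 25*10/254 truncates to zero
```

Note that AS affects only the next unit word, since every child of UNITS ends by calling ~AS.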
Advanced uses of the compiler¶
Suppose we have a series of push-buttons numbered 0-3, and a word WHAT to read them. That is, WHAT waits for input from a keypad: when button #3 is pushed, for example, WHAT leaves 3 on the stack.
We would like to define a word BUTTON to perform the action of pushing the n’th button, so we could just say:
WHAT BUTTON
In a conventional language BUTTON would look something like
: BUTTON DUP 0 = IF RING DROP EXIT THEN
DUP 1 = IF OPEN DROP EXIT THEN
DUP 2 = IF LAUGH DROP EXIT THEN
DUP 3 = IF CRY DROP EXIT THEN
ABORT" WRONG BUTTON!" ;
That is, we would have to go through about two and a half decisions on the average (assuming the four buttons are pushed equally often).
Forth makes possible a much neater algorithm, involving a “jump table”. The mechanism by which Forth executes a subroutine is to feed its “execution token” (often an address, but not necessarily) to the word EXECUTE. If we have a table of execution tokens we need only look up the one corresponding to an index (offset into the table), fetch it to the stack, and say EXECUTE.
One way to code this is
CREATE BUTTONS ' RING , ' OPEN , ' LAUGH , ' CRY ,
: BUTTON ( nth --) 0 MAX 3 MIN
CELLS BUTTONS + @ EXECUTE ;
Note how the phrase “0 MAX 3 MIN” protects against an out-of-range index. Although the Forth philosophy is not to slow the code with unnecessary error checking (because words are checked as they are defined), when programming a user interface some form of error handling is vital. It is usually easier to prevent errors as we just did, than to provide for recovery after they are made.
How does the action-table method work?
CREATE BUTTONS makes a dictionary entry BUTTONS.
The word ' (“tick”) finds the execution token (xt) of the following word, and the word , (“comma”) stores it in the data field of the new word BUTTONS. This is repeated until all the subroutines we want to select among have their xt’s stored in the table.
The table BUTTONS now contains xt’s of the various actions of BUTTON.
CELLS then multiplies the index by the appropriate number of bytes per cell, to get the offset into the table BUTTONS of the desired xt.
BUTTONS + then adds the base address of BUTTONS to get the absolute address where the xt is stored.
@ fetches the xt for EXECUTE to execute.
EXECUTE then executes the word corresponding to the button pushed.
Simple!
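The steps can be tried one at a time from the keyboard. In this sketch the four actions are stand-in stubs (the text does not define RING, OPEN, LAUGH or CRY), so the messages they print are assumptions made for the test.

```forth
: RING   ." ring "  ;     \ stub actions, for illustration only
: OPEN   ." open "  ;
: LAUGH  ." laugh " ;
: CRY    ." cry "   ;
CREATE BUTTONS  ' RING ,  ' OPEN ,  ' LAUGH ,  ' CRY ,

2 CELLS BUTTONS + @        \ offset, base address, fetch: xt of LAUGH is on the stack
EXECUTE                    \ prints: laugh
```

Splitting the phrase this way shows that the table holds nothing but execution tokens; BUTTON merely adds the range clipping in front.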
If a program needs but one action table the preceding method suffices. However, more complex programs may require many such. In that case it may pay to set up a system for defining action tables, including both error-preventing code and the code that executes the proper choice. One way to code this is
: ;CASE ; \ do-nothing word
: CASE:
CREATE HERE -1 >R 0 , \ place for length
BEGIN BL WORD FIND \ get next subroutine
0= IF CR COUNT TYPE ." not found" ABORT THEN
R> 1+ >R
DUP , ['] ;CASE =
UNTIL R> SWAP ! \ store highest index (the ;CASE slot)
DOES> DUP @ ROT ( -- base_adr len n)
MIN 0 MAX \ truncate index
CELLS + CELL+ @ EXECUTE ;
Note the two forms of error checking. At compile-time, CASE: aborts compilation of the new word if we ask it to point to an undefined subroutine:
case: test1 DUP SWAP X ;case
X not found
and we count how many subroutines are in the table (including the do-nothing one, ;case) so that we can force the index to lie in the range [0,n].
CASE: TEST * / + - ;CASE ok
15 3 0 TEST . 45 ok
15 3 1 TEST . 5 ok
15 3 2 TEST . 18 ok
15 3 3 TEST . 12 ok
15 3 4 TEST . . 3 15 ok
Just for a change of pace, here is another way to do it:
: jtab: ( Nmax --) \ starts compilation
CREATE \ make a new dictionary entry
1- , \ store Nmax-1 in its body
; \ for bounds clipping
: get_xt ( n base_adr -- xt_addr)
DUP @ ( -- n base_adr Nmax-1)
ROT ( -- base_adr Nmax-1 n)
MIN 0 MAX \ bounds-clip for safety
1+ CELLS + ( -- xt_addr = base + 1_cell + offset)
;
: | ' , ; \ get an xt and store it in next cell
: ;jtab DOES> ( n base_adr --) \ ends compilation
get_xt @ EXECUTE \ get token and execute it
; \ appends table lookup & execute code
\ Example:
: Snickers ." It's a Snickers Bar!" ; \ stub for test
\ more stubs
5 jtab: CandyMachine
| Snickers
| Payday
| M&Ms
| Hershey
| AlmondJoy
;jtab
3 CandyMachine It's a Hershey Bar! ok
1 CandyMachine It's a Payday! ok
7 CandyMachine It's an Almond Joy! ok
0 CandyMachine It's a Snickers Bar! ok
-1 CandyMachine It's a Snickers Bar! ok
forth.org¶
Copied from http://forth.org/svfig/Len/definwds.htm
It has been said that one does not write a program in Forth. Rather, one extends Forth to make a new language specifically designed for the application at hand.
An important part of this process is the DEFINING WORD, by which it is possible to combine a data structure with an action to create multiple instances that differ only in detail.
The basics of create … does>¶
Defining words are based on the Forth construct create … does>, which beginners can apply mechanically. The steps are:
Start a colon definition
Write create
Follow by words that lay down data or allot RAM, thus creating the body
Write does>
Follow by words that act on the body.
These steps are fairly simple, but understanding them is complex because there are three stages in the action of a defining word.
An example¶
Our example will be indexed-array, which allots an area of RAM. At run time, it takes an index, i, and returns the address of the ith cell. If i=0 the address of the first cell is returned because Forth conventionally starts numbering at 0.
Stage 1¶
: indexed-array ( n -- ) ( i -- a )
create cells allot
does> swap cells +
;
Stage 2¶
20 indexed-array foo \ Make a 1-dimensional array with 20 cells
Stage 3¶
3 foo \ Put addr of fourth element on the stack
Stage 1: Compiling the defining word¶
The first phase is in effect during the compilation of indexed-array, that is, between the colon and semicolon. The colon sets up a header. Then, execution tokens of ordinary Forth words are laid down, while Immediate Words are executed at once. The process is terminated by the semicolon.
The only Immediate word in indexed-array is “does>”. It lays down code that will act later in stage 2.
Stage 2: Creating a “child”¶
The second phase is in effect when “indexed-array” is used to create “foo”.
create sets up a header
cells allot reserves n cells, forming the “data field” (formerly called “body”) of foo.
The code that was laid down by “does>” now comes into action. It changes the execution of “foo” so that it will:
Put the address of its data field on the stack, and then
Execute the Forth words between “does>” and the semicolon.
Stage 3: Executing the child¶
In the third phase, we execute “foo”.
i is already on the stack, and the origin of the data field is put on top of that
“swap” rearranges the stack
“cells” multiplies i by the cell length
“+” adds the result to the origin of the data field.
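A short session shows the child doing its address arithmetic. This is a sketch that repeats the definition from Stage 1; the value 42 and the indices are arbitrary, and no range checking is attempted.

```forth
: indexed-array ( n -- ) ( i -- a )
   create cells allot
   does> swap cells + ;

20 indexed-array foo
42 3 foo !                    \ store 42 in the fourth cell
3 foo @ .                     \ prints 42
3 foo  0 foo -  1 cells / .   \ prints 3: the two addresses are three cells apart
```

The last line confirms that consecutive indices are exactly one cell apart, whatever the cell size of the system.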
Warning
Important issues such as range checking and multi-dimensional arrays are not discussed here.
Note
Why is there a right angle-bracket in “does>”? It originated in early Forths in which create was followed by “<builds … does>”. Later, the action of “<builds” was incorporated in “create”, but the spelling of “does>” was not changed.
Simple Example with LOOP¶
: gen-shc CREATE 5 0 DO 1 . LOOP DOES> @ ; ok.
gen-shc myc 1 1 1 1 1 ok.
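The loop in gen-shc runs while the child is being defined (Stage 2), not when the child is executed. A variant that prints a message in each phase makes the timing visible; noisy-constant and seven are hypothetical names, and unlike gen-shc this sketch stores a value with “,” so that the @ after does> has a cell to fetch.

```forth
: noisy-constant ( n -- )
   create  ." defining... "  ,       \ runs once, when the child is made
   does>   ." running... "   @ ;     \ runs every time the child executes

7 noisy-constant seven     \ prints: defining...
seven .                    \ prints: running... 7
seven .                    \ prints: running... 7
```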