Chapter 9

P21Forth Assembler


P21Forth 1.02 Assembler

The P21Forth system offers the programmer the ability to write executable routines in the ANS-compliant high level compiler. This is done using colon and other high level defining words. Alternatively, P21Forth also offers the programmer a built-in Forth assembler for the MuP21. Since the assembly language of the MuP21 is based on Forth, it is easy to learn and use.

To define a new word in assembler, one uses the Forth word CODE. Like colon, CODE takes a name from the input stream as the name of the new word being defined. Words in assembler normally end with the next function, which returns control to the next Forth word.

The following sequence:

            CODE MYCODE ( -- )
              next
            END-CODE

will define a new word in assembler called MYCODE. It shows how CODE, next, and END-CODE are used.

The MuP21 microprocessor has two small on-chip stacks in hardware. The data stack on MuP21 is 6 cells deep, and the return stack on MuP21 is 4 cells deep. There is a register that is used for memory addressing called the `A' register.

MuP21 accesses 20 bit wide cells of memory. These 20 bit wide cells can contain data or instructions. MuP21 only has 24 instructions, so these instructions may be represented with only 5 bits each. Thus a 20 bit cell in memory can contain up to four MuP21 instructions. It is thus possible for the CPU to execute these instructions up to four times faster than it can access memory. Assembler routines in MuP21 are normally written to show how the instructions are packed into words in memory for clarity.
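The packing of four instructions into one cell can be modelled outside the chip. The following Python sketch is only an illustration, not part of P21Forth: it copies the slot masks and a few instruction constants from the assembler source at the end of this chapter and mimics what the ,I word does with them. The function names pack and opcode_in_slot are hypothetical, and the assumption that each slot's bits read most significant bit first is inferred from the constants rather than stated in the source.

```python
# Slot masks copied from the MASK table in the assembler source (hex).
MASK = [0xAA800, 0x55400, 0x0032A, 0x000D5]

# Instruction constants copied from the INST definitions in the source.
INST = {"A": 0x5A569, "PUSH": 0x5695A, "A!": 0x56559, "@A+": 0x9A7E9,
        "NOP": 0x55956}

PATTERN = 0xAAAAA  # the source XORs patterns with AAAAA (see the word P,)


def pack(names):
    """Pack up to four instruction names into one 20-bit cell, as ,I does."""
    if len(names) > 4:
        raise ValueError("a cell holds at most four instructions")
    cell = 0
    for slot, name in enumerate(names):
        cell |= INST[name] & MASK[slot]   # keep only this slot's bits
    return cell


def opcode_in_slot(cell, slot):
    """Recover the 5-bit opcode in one slot (assumed bit order: MSB first)."""
    plain = cell ^ PATTERN
    code = 0
    for bit in range(19, -1, -1):
        if (MASK[slot] >> bit) & 1:
            code = (code << 1) | ((plain >> bit) & 1)
    return code


# The instructions a push a! @A+ packed into a single cell.
cell = pack(["A", "PUSH", "A!", "@A+"])
print(f"{cell:05X}", [f"{opcode_in_slot(cell, s):02X}" for s in range(4)])
```

Decoding the packed cell gives back the opcodes 19, 1C, 1D, and 09, which are the codes listed for A, PUSH, A!, and @A+ in the instruction table.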

Since P21Forth must support stacks larger than the hardware stacks provided on MuP21, it must maintain stacks in memory, as programs on more conventional processors do.

P21Forth Register Use

At the start and end of all words in P21Forth there are three registers on MuP21 which must be preserved. The ``A'' register will always hold the ``interpreter pointer'' (IP) in Forth. The top of the data stack register will always hold the ``data stack pointer'' (SP). And the top of the return stack register will always hold the ``return stack pointer'' (RP). These registers must contain these values on entry and exit; in between, they are manipulated by the internals of assembler words in P21Forth.

The data and return stacks in memory in P21Forth are designed to grow upward. Each time an item is added to a stack the pointer to memory is incremented. Each time an item is removed from a stack the pointer to memory is decremented.

The P21Forth word DUP does two things. It duplicates the top item on the Forth data stack in memory, incrementing the data stack pointer in the process, and then advances to the next word in Forth. Here is a definition to do this:

\ CODE DUP ( n -- n n )
\ at the start of this word A=IP  T=SP and  R=RP
\ ( n -- n n  ) is a stack diagram showing an item being duplicated

CODE DUP ( n -- n n ) \ create a new word in assembler called DUP
 a push a! @+ \ these four instructions assemble one MuP21 memory cell.
              \ a push gets the IP from the A register and stores it on the
              \ on-chip return stack. Then a! moves SP into the A register.
              \ @+ fetches the top item from the P21Forth stack in memory,
              \ places it on the top of the MuP21 hardware stack, and
              \ increments the A register (SP).
              \ (@+ and ! here behave as the @A+ and !A instructions in the
              \ table below.)
 ! a pop nop  \ the data stack pointer (SP) has been incremented, so the !
              \ instruction stores a copy of what was the top of the memory
              \ stack into the new top of that stack. The a instruction then
              \ gets a copy of the new data stack pointer and places it on the
              \ top of the MuP21 hardware data stack, where SP should be left.
              \ pop gets the IP from the hardware return stack.
 a! next      \ the a! instruction puts the IP back into the A register, and
              \ next goes to the next Forth word
END-CODE      \ end this assembler definition

Of course, the comments are not needed to make this definition work. This example is intended to show that three registers must hold certain things at the start and end of each word written in assembler.
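To check the register bookkeeping in the DUP definition, the instruction sequence can be traced on a toy model. The Python sketch below is an assumption-laden illustration, not the MuP21 itself: the stacks are plain lists with no depth limit, the addresses chosen for IP, SP, and RP are hypothetical, and next is left out so that only the body of DUP is traced.

```python
def run(program, a, data, ret, mem):
    """Execute a space-separated instruction sequence on a toy machine.

    a    -- value of the A register
    data -- on-chip data stack as a list (last element is T)
    ret  -- on-chip return stack as a list (last element is R)
    mem  -- dict mapping addresses to cell values
    Returns the final value of A; data, ret and mem are updated in place.
    """
    for op in program.split():
        if op == "a":          # copy A to the top of the data stack
            data.append(a)
        elif op == "a!":       # move the top of the data stack to A
            a = data.pop()
        elif op == "push":     # data stack -> return stack
            ret.append(data.pop())
        elif op == "pop":      # return stack -> data stack
            data.append(ret.pop())
        elif op == "@+":       # fetch from memory at A, then increment A
            data.append(mem[a])
            a += 1
        elif op == "!":        # store the top of the data stack at A
            mem[a] = data.pop()
        elif op == "nop":
            pass
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return a


# Hypothetical addresses: on entry A=IP, T=SP, R=RP.
IP, SP, RP = 0x200, 0x100, 0x180
mem = {SP: 42}                 # 42 is the top item of the stack in memory
data, ret = [SP], [RP]

a = run("a push a! @+ ! a pop nop a!", IP, data, ret, mem)
print(hex(a), [hex(x) for x in data], [hex(x) for x in ret],
      mem[SP], mem[SP + 1])
```

In this model the sequence ends with A holding IP again, T holding SP+1, R holding RP, and the value 42 present at both SP and SP+1 in memory.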

MuP21 Assembler Instructions

   CODE Name     Function

   Transfer Instructions
   00   JUMP     Jump to 10 bit address in the lower 10
                 bits of the current word.  Must be the
                 first or second instruction in a word
   01   ;'       Subroutine return.  (pop the address
                 from the top of the return stack and
                 jump to it)
   02   T=0      Jump if T=0
   03   C=0      Jump if carry is reset
   04   CALL     Subroutine call.  (push the address of
                 the next location in memory to the
                 return stack, and jump to the 10 bit
                 address in the lower 10 bits of the
                 current word.)
   05            reserved
   06            reserved
   07            reserved

   Memory Access Instructions
   08            reserved
   09   @A+      fetch a value from memory pointed to by
                 the A register, place it on the top of
                 the data stack, and increment A
   0A   #        fetch the next cell from memory as a
                 literal and place it on the top of the
                 data stack
   0B   @A       fetch a value from memory pointed to by
                 the A register and place it on the top
                 of the data stack
   0C            reserved
   0D   !A+      remove the item in the top of data stack
                 and store it into memory pointed to by
                 the A register, increment A
   0E            reserved
   0F   !A       remove the item in the top of data stack
                 and store it into memory pointed to by
                 the A register

   ALU Instructions
   10   COM      complement all 21 bits in T (top of data
                 stack)
   11   2*       shift T left 1 bit ( the bottom bit
                 becomes 0)
   12   2/       shift T right 1 bit ( the top two bits
                 remain unchanged)
   13   +*       Add the second item on the data stack to
                 the top item without removing the second
                 item, if the least significant bit of T
                 is 1
   14   XOR      remove the top two items from the data
                 stack and replace them with the result
                 of logically exclusively-oring them
                 together
   15   AND      remove the top two items from the data
                 stack and replace them with the result
                 of logically and-ing them together
   16            reserved
   17   +        remove the top two items from the data
                 stack and replace them with the result
                 of adding them together

   Register Instructions
   18   POP      move one item from the return stack to
                 the data stack
   19   A        copy the contents of the A register to
                 the top of stack
   1A   DUP      push a copy of the top of the data stack
                 onto the data stack
   1B   OVER     copy the second item on the data stack
                 and make it the new top of the data
                 stack
   1C   PUSH     move one item from the data stack to the
                 return stack
   1D   A!       move the top of stack to the A register
   1E   NOP      null operation (delay 10ns)
   1F   DROP     discard the item on the top of the data
                 stack

The P21Forth assembler provides structured flow control. IF, ELSE, and THEN can be used just as they would be in high level Forth code. However, it should be noted that IF in the assembler does not remove the flag from the data stack, as the standard high level IF does. Chuck has also introduced a similar operation, -IF. -IF compiles a C=0 instruction and therefore tests for carry. -IF will execute the code that follows if carry is set, or it will jump to the ELSE or THEN if carry is not set. BEGIN, WHILE, UNTIL, and REPEAT are also supported in the assembler. Chuck has also introduced -UNTIL, which compiles a C=0 and loops until there is carry.

The next word assembles three opcodes that perform the advance to the next Forth word. This is known as the Forth inner interpreter. next assembles @A+ PUSH ; in sequence. The @A+ fetches the address of the next Forth word, pointed to by the A register (IP), and increments the IP. Then the PUSH ; sequence pushes that address to the return stack and `returns' to it to execute the next word.
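The effect of next on the registers can be traced with a few lines of Python. This is a simplified sketch under stated assumptions: the stacks are unbounded lists, the thread and word addresses are hypothetical, and the function name next_step is made up for the illustration.

```python
def next_step(a, data, ret, mem):
    """Model of next (@A+ PUSH ;). Returns (address to execute, new A)."""
    data.append(mem[a])     # @A+  fetch the cell that A (IP) points to ...
    a += 1                  #      ... and advance the IP
    ret.append(data.pop())  # PUSH move the fetched address to the return stack
    pc = ret.pop()          # ;    return, that is, jump to that address
    return pc, a


# Hypothetical thread of two word addresses starting at 0x300.
SP, RP = 0x100, 0x180
mem = {0x300: 0x120, 0x301: 0x145}
data, ret = [SP], [RP]      # T=SP and R=RP, as required between words

pc1, ip = next_step(0x300, data, ret, mem)
pc2, ip = next_step(ip, data, ret, mem)
print(hex(pc1), hex(pc2), hex(ip), data == [SP], ret == [RP])
```

Each step leaves SP on top of the data stack and RP on top of the return stack, which is the condition every assembler word has to restore before it ends.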

MuP21 is designed to match the hardware on the DRAM chips, which have 1K sized pages. Two addresses are on the same page if they have the same upper ten bits. Jump and call instructions carry only a 10 bit address, so they cannot transfer control to a different page. Care should be taken to ensure that words written in assembler do not contain jumps or calls that are expected to go to a different page. A sequence like PUSH ; is needed to jump to an off page location.
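A quick way to see the page restriction is to compute where a 10 bit address field can reach. The Python sketch below assumes 20 bit addresses and assumes that the upper bits of the target come from the location of the jump itself; the text above implies this but does not spell out the hardware detail. The names same_page and jump_target are hypothetical.

```python
PAGE_BITS = 10                        # jumps and calls carry a 10 bit address
OFFSET_MASK = (1 << PAGE_BITS) - 1    # 0x3FF


def same_page(addr1, addr2):
    """True if both addresses share all bits above the 10 bit offset."""
    return (addr1 >> PAGE_BITS) == (addr2 >> PAGE_BITS)


def jump_target(current, wanted):
    """Where a jump lands if only the low 10 bits of 'wanted' are encoded.

    Assumption: the page bits are taken from the current location.
    """
    return (current & ~OFFSET_MASK) | (wanted & OFFSET_MASK)


print(same_page(0x400, 0x7FF))          # both in the page starting at 0x400
print(same_page(0x3FF, 0x400))          # neighbours, but on different pages
print(hex(jump_target(0x412, 0x805)))   # an off page target wraps into the page
```

Under these assumptions a jump assembled at 0x412 that is meant to reach 0x805 would land inside its own page instead, which is why the PUSH ; sequence is used for off page transfers.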

Math in Assembler on MuP21

There are several things to remember when coding math on the MuP21 microprocessor in assembler. Items read from memory are only 20 bits, but the CPU registers and math operations are 21 bits. The most significant bit is both the carry bit and a valid addressing bit to memory. If the most significant bit (carry) is set in an address used for a memory reference, then the SRAM memory will be addressed. Addresses 0-FFFFF are in DRAM, but addresses from 100000 up are in SRAM.

This means that if you load a 20 bit -1 (FFFFF) from memory and add it to 1 you will get 100000, which is not the same as 0. If you add 1 to a 21 bit -1 (1FFFFF) then the result will be 0 because carry will be reset. Since a 21 bit number cannot be stored in memory directly, it is stored by complementing the number with COM and then storing it into memory. When it is fetched, COM is used again to restore the lower 20 bits and to set the carry bit. Since -1 is often used to decrement numbers (MuP21 does not have any auto-decrement instructions), there is a faster way to generate a 21 bit -1 than to load a literal 0 and execute a COM instruction: the instruction sequence DUP DUP XOR COM. It is faster, but it also uses one extra location on the data stack.
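The arithmetic in this section can be checked with ordinary integers masked to 20 and 21 bits. The Python sketch below is only a model of the register and memory widths described above; the helper names add, com, and to_memory are hypothetical.

```python
CELL20 = 0xFFFFF     # memory cells hold 20 bits
REG21 = 0x1FFFFF     # T and the ALU hold 21 bits


def add(x, y):
    """21 bit addition, as performed in the registers."""
    return (x + y) & REG21


def com(x):
    """Complement all 21 bits."""
    return x ^ REG21


def to_memory(x):
    """Only the low 20 bits survive a store to memory."""
    return x & CELL20


# A 20 bit -1 loaded from memory is not a 21 bit -1.
print(hex(add(0xFFFFF, 1)))      # carry bit set, not zero
print(hex(add(0x1FFFFF, 1)))     # a true 21 bit -1 plus 1 gives 0

# Storing a 21 bit -1: COM before the store, COM again after the fetch.
stored = to_memory(com(0x1FFFFF))
print(hex(stored), hex(com(stored)))

# DUP DUP XOR COM: any value XORed with itself is 0, and COM of 0 is -1.
t = 0x12345
print(hex(com(t ^ t)))
```

The same round trip works for other 21 bit values with the top bit set, since COM clears the top bit before the store and sets it again after the fetch.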

The MuP21 uses a ripple carry mechanism on the + and +* instructions. The carry in the add will move upward through eight bits in the time of a single instruction. This means that the result of adding 1 to 1 is ready in one instruction time, as is the result of adding 127 to 127. But adding 1 to -1 would require carry to move through 20 bits in the process of the add, and this takes longer than one instruction time. To compensate for this, a NOP or two may be needed before the + or +* instruction. There will be no need for a NOP if the + or +* is the first instruction in a word of DRAM. The extra delay needed to fetch the word containing the + or +* in the first instruction from DRAM will ensure that there is sufficient time for a correct result from the addition.
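One rough way to judge whether particular operands stress the ripple carry is to measure how far a single carry has to travel. The Python sketch below is a simplified model and an assumption, not a description of the actual circuit: it counts the longest run of bit positions one carry moves through and compares it with the eight bits per instruction time quoted above. The names longest_carry_ripple and may_need_nop are hypothetical.

```python
WIDTH = 21        # width of T and the adder
FAST_BITS = 8     # carry moves through about eight bits per instruction time


def longest_carry_ripple(x, y):
    """Longest run of bit positions that one carry ripples through in x + y."""
    longest = run = 0
    for bit in range(WIDTH):
        xb, yb = (x >> bit) & 1, (y >> bit) & 1
        if xb & yb:              # this position generates a fresh carry
            run = 1
        elif (xb ^ yb) and run:  # an incoming carry is passed upward
            run += 1
        else:                    # no carry arrives, or it is absorbed here
            run = 0
        longest = max(longest, run)
    return longest


def may_need_nop(x, y):
    """True if the carry has further to go than one instruction time allows."""
    return longest_carry_ripple(x, y) > FAST_BITS


print(may_need_nop(1, 1), may_need_nop(127, 127), may_need_nop(1, 0x1FFFFF))
```

Since operands are usually not known when code is assembled, a model like this is mainly useful for reasoning about the worst case, which is what the extra NOP covers.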

Other Programming Considerations

The amount and nature of memory access will generally be the limiting factor in the speed of execution of MuP21 programs. DRAMs can access memory on the same page in about 55ns, but memory accesses to a different page will take 150ns. For this reason it is very important to try to keep critical routines to one page of memory and, if possible, to let them manipulate data on the same page as the code. This is why the default data and return stacks in P21Forth are on the same page of memory as the most frequently used words in the Forth kernel.
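The cost of leaving a page can be estimated from the two access times given above. The Python sketch below is a back-of-the-envelope model only: it assumes every access costs either 55ns or 150ns depending on whether it stays on the page of the previous access, and it assumes the first access always pays the full 150ns. The function name access_time_ns and the addresses are hypothetical.

```python
SAME_PAGE_NS = 55     # access on the same DRAM page as the previous one
NEW_PAGE_NS = 150     # access that moves to a different page
PAGE_BITS = 10        # 1K pages


def access_time_ns(addresses):
    """Estimated total time for a sequence of DRAM accesses."""
    total = 0
    page = None                       # assumption: first access opens a page
    for addr in addresses:
        this_page = addr >> PAGE_BITS
        total += SAME_PAGE_NS if this_page == page else NEW_PAGE_NS
        page = this_page
    return total


on_page = [0x400, 0x401, 0x402, 0x403]     # code and data on one page
off_page = [0x400, 0x800, 0x401, 0x801]    # code and data on different pages
print(access_time_ns(on_page), access_time_ns(off_page))
```

In this model four accesses that stay on one page take 315ns, while four accesses that alternate between two pages take 600ns.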

Chuck's code and Dr. Ting's code in the OK Operating System and the code in the OKAD application are very good examples of techniques to get the most speed from MuP21 assembler.

Source Code to the P21Forth Assembler

\ ASM.FOX Chuck Moore's 20 bit assembler for MuP21
\ modified for P21Forth Jeff Fox 10/6/94

HEX VOCABULARY ASM    \ create the wordlist for the assembler

: ASSEMBLER
 ALSO ASM ;           \ ASSEMBLER adds ASM to wordlist

ASSEMBLER DEFINITIONS

: END-CODE            \ get out of assembler
  PREVIOUS
  DEFINITIONS ;       \ and put definitions wherever that is

VARIABLE HI           \ pointer to current slot
VARIABLE HW           \ pointer to current word under assembly

: ALIGN ( -- )        \ 0 1 2 3 .. 4
  4 HI ! ;            \ force slot pointer to overflow

: ORG ( a -- )        \ ORG to an address
  DUP . CR DP !
  ALIGN ;             \ DP is the eForth CODE POINTER   H in OK

CREATE MASK ( -- a )  \ 4 masks for 4 slots scrambled bits
AA800 , 55400 ,
32A   ,    D5 ,       \ 1 CELL per mask on MuP21

                      \ compile pattern
: P, ( n -- )
 AAAAA XOR , ;        \ Patterns must be xored AAAAA

: #, ( n -- )         \ compile number
  , ;                 \ Numbers are normal on MuP21

: ,W ( mask -- )      \ or in masked bits into word
  HW @ @ OR HW @ ! ;

: ,I ( inst -- )      \ assemble instruction in one slot
  HI @                \ check slot pointer in HI
  4 AND               \ overflow?
  IF                  \ so align slot pointer
    0 HI !            \ clear HI
    DP @ HW !         \ point HW to current location of DP
    0 , THEN          \ move to next clear location
  HI @                \ HI points to current slot 0-3
  MASK +              \ add offset to start of MASK table
  @ AND               \ AND in the mask bits
  ,W                  \ assemble instruction to current slot
  1 HI +! ;           \ bump slot pointer by 1 CELL

: INST ( n -- )       \ defining word
  CREATE ,
  DOES> @ ,I ;        \ Chuck's CONSTANT DOES> is not ANSI

6A82A INST COM        \ com com com com
55956 INST NOP        \ nop nop nop nop

: JPI ( n -- )        \ assembler jump instruction
  CREATE ,            \ Chuck's CONSTANT DOES> isn't ANSI
  DOES>  @ ( -- a )
  BEGIN HI @ 2 AND    \ skip slots 2 and 3
  WHILE NOP           \ by assembling NOPs
  REPEAT              \ then
   ,I                 \ assemble the branch instruction
   3FF AND 3FF XOR
  ,W ALIGN ;          \ assemble 10 bit address in slots 2 & 3

: BEGIN ( -- n )      \ start a loop structure , leave addr
  BEGIN HI @ 4 AND    \ check for word boundary
  0= WHILE NOP        \ assemble NOPs if needed
  REPEAT DP @ ;

: # ( n -- )          \ assemble a literal
  99BE6. ,I , ;       \ assemble the # instruction and the literal n

: -# ( n -- )         \ assemble a 21bit negative
  FFFFF XOR # COM ;   \ complement then assemble lit & add COM

: P ( n -- )          \ assemble a pattern as literal
  AAAAA XOR # ;

: -P ( n -- )         \ assemble a complement pattern literal
  55555 XOR # ;

AAAAA JPI JUMP  A9AA6 JPI T=0
A96A5 JPI C=0   A6A9A JPI CALL
A9AA6 JPI UNTIL A96A5 JPI -UNTIL

: IF     3FF T=0  HW @ ;
: -IF    3FF C=0  HW @ ;
: SKIP   3FF JUMP  HW @ ;
: THEN   DUP >R >R
  BEGIN 3FF AND 3FF XOR  R> @ XOR  R> ! ;
: ELSE   SKIP  SWAP THEN ;
: WHILE  IF  SWAP ;
: REPEAT  JUMP  THEN ;

9A7E9 INST @A+  997E5 INST @A
967D9 INST !A+  957D5 INST !A
6A429 INST 2*   69826 INST 2/
69425 INST +*
6681A INST XOR  66419 INST AND
65415 INST +
5A96A INST POP  5A569 INST A
59966 INST DUP  59565 INST OVER
5695A INST PUSH 56559 INST A!
55555 INST DROP
AA6A9 INST ;'

: next                \ next macro in eForth assembler
  @A+ PUSH ;' ALIGN ; \ compiles @A+ PUSH ; and ALIGNs

PREVIOUS DEFINITIONS
ALSO ASM              \ CODE is a Forth word

: CODE ( -- )
 HERE HEAD, REVEAL    \ create header in eForth for HERE
 HERE HW ! ALIGN      \ start assembly at HERE
 ASSEMBLER
 DEFINITIONS ;        \ any more definitions go into ASM

PREVIOUS

End of Chapter 9
