ENTH Flux aha Color Forth

On November 30, and December 01, 03, and 04 of 2000 I exchanged some email with Sean Pringle. He and I discussed his ENTH and Flux projects and Machine Forth, Color Forth and aha.
With his permission I am posting most of the email from our ongoing discussions because I think it will be of interest to some other people.
On 12/07/00 Sean sent me the URL of his first online documentation on the State of Flux.
Subject:  Re: Colour / aha / Machine
Date:  Thu, 30 Nov 2000 23:09:29 -0800
From:  Jeff Fox 

Hi Sean,

Sean Pringle wrote:
>
> Hi Mr Fox.
> 
> My name is Sean Pringle. I'm a University student in Australia and (through
> the fault of my Uncle!) have been highly interested in Forth for almost two
> years.
> I've been rummaging through your plethora of articles on the UltraTechnology
> Website for a couple of weeks now. Especially the few about Chuck Moore's
> activities. They're very interesting! I like the ideas.
> 
> Having finished (sort of) my ANS Forth called ENTH (the n'th forth!) for my
> Pentium about a month ago I've tackled a Color Forth style system. I'm doing
> these systems mainly to learn but with higher interests in mind for down the
> track! I used the Pentium because it's all I have.
> 
> I was going to post on comp.lang.forth but I saw your reception in the
> 'compiler innovation' thread and wasn't brave enough :). I thought, with
> your experience and your new system (aha?) you could tell me if I got the
> point or missed anything?
>
> My new system is called Flux (state of!). It has things from Color Forth and
> Machine Forth I believe. I think I interpreted your articles (the fireside
> chats and your Thoughtful Programming mainly) correctly but I read a few
> critical sentences a lot!
> I've managed to get tail end recursion optimization going. The color tokens
> work, I have eight or nine so far. There are also a few compiler directive
> words. I've tried to keep the dictionary small and simple; it currently has
> just over 100 words. I have the A & R registers and the tail end recursion
> looping construct.

It sounds like a virtual machine modeled after Chuck's more recent
hardware designs.  He is using such a virtual machine for his Pentium based code
in OKAD and his Color Forth.

One of the main differences in Machine Forth style is that you use A a lot
to not only split traditional @ and ! opcodes and insert auto-increment
into code, but also sort of like one fast local.  It changes the way
you factor things.

Also the smaller stack sizes we have, and the A register, typically lead
to different looping constructs than those used in traditional Forth.
DO LOOPs are expensive in the extra stack positions they use.  After
working for a few years on P21 with only 5 or 6 deep data stack and
4 deep return stack we really got out of the habit of using DO LOOP
or +LOOP constructs.

The MISC chips have the carry bit on the stack and offer -IF -UNTIL and
-WHILE which are often a useful alternative to IF UNTIL and WHILE and
carry may be easier to generate than zero from many things.

Chuck often used left justified numbers and would use + or +* in a loop
to eventually generate the carry that terminates the loop.  The restrictions
are that +* will require adding an even number to an odd number so it will
never generate zero in these loops.  Incrementing or decrementing a counter
is an expensive way to control a loop.  Using autoincrement with a pointer in A
or R and testing whether it equals a termination value is quite common.
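
The pointer-based loop termination described above can be illustrated with a small Python model. This is only a sketch: the `ToyMachine` class, its `fetch_a_plus` method, and `sum_range` are hypothetical names, and the model ignores real MISC opcode encoding and stack depth.

```python
class ToyMachine:
    """Toy model with a memory array and an A (address) register."""

    def __init__(self, memory):
        self.memory = list(memory)
        self.a = 0  # the A register, used here as a pointer

    def fetch_a_plus(self):
        """Auto-incrementing fetch: read memory[A], then advance A by one."""
        value = self.memory[self.a]
        self.a += 1
        return value


def sum_range(machine, start, end):
    """Sum memory[start:end] with no separate loop counter.

    The loop ends when the pointer in A equals the termination address,
    so the only loop state is the pointer itself.
    """
    machine.a = start
    total = 0
    while machine.a != end:
        total += machine.fetch_a_plus()
    return total
```

The point of the sketch is that the pointer does double duty as address and loop control, so no extra stack positions are spent on an index and a limit.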

Much of Machine Forth is looking at the actual machine and possible 
variations in code and being able to make the choice for the one that
is obviously faster or smaller.  There is not the ambiguity there is
when you don't know what the code is going to translate into as there
is with multiple targets etc.

You are kind of in the middle, as is Chuck with his OKAD and Color Forth,
because you are implementing a MISC style virtual machine (with A register
at least) and implementing it on a Pentium machine.

> There aren't many control structures. My 'accept'
> function pretty much encompasses the editor requirements. The numbers are
> converted at edit time and the words made into counted strings. I'm still
> using Ascii characters for this.

Ascii is fine. I may use packed Ascii characters too in Aha.  I don't need
an optimized character set like Chuck built for OKAD.  I can see how and
why he went to using Huffman encoding for his character set but I
haven't seen a need to do that.

In looking for ways to compress source code I had already considered using
a technique from the traditional Forth compiler, a name dictionary, in
the source code.  It is used by the editor and compiler.  For one thing
I need only store one copy of any defined word's ASCII string.  All other
source code references to a definition do not need to be in string format
in Aha.  They are just pointers into the defined name dictionary.  So
I can compress a word of any length into ten or twenty bits, and have
an incredibly fast compiler that does not have to look up names, just
follow a pointer to a CFA.  It packs all the defined words (after they are
defined) into ten or twenty bits rather than five or six (average)
ascii characters and a space.  That's from 5/1 to 5/2 compression
and 1000x faster compilation of defined words in Aha.
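
The compression figures above can be checked with a little arithmetic. The sketch below assumes 8-bit ASCII characters and one trailing space per word; the function names are hypothetical, and the ten- and twenty-bit pointer sizes are the ones quoted in the email.

```python
def ascii_bits(avg_chars, bits_per_char=8):
    """Bits used by one word reference stored as ASCII text plus a space."""
    return (avg_chars + 1) * bits_per_char


def compression_ratio(avg_chars, pointer_bits):
    """Ratio of ASCII size to pointer size for one reference to a defined word."""
    return ascii_bits(avg_chars) / pointer_bits


if __name__ == "__main__":
    for chars in (5, 6):
        for bits in (10, 20):
            ratio = compression_ratio(chars, bits)
            print(f"{chars} chars + space vs {bits}-bit pointer: {ratio:.1f} to 1")
```

Under these assumptions the ratios fall between roughly 2.4 to 1 and 5.6 to 1, which is in line with the "5/1 to 5/2" estimate.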

But the other tricks from Color Forth help too.  The tokens
allow numbers and opcodes and other things to be picked up faster
by the compiler without parsing and searching a dictionary.

> The compiler loop is pretty darn quick!
> Both ENTH and Flux are native, stand alone systems written in NASM. I don't
> really like programming for Dos Windows or Linux, so I'm just making my own
> standards up as I go along!

How stand alone are these?  Completely stand alone like Chuck's new Color
Forth?  He has a boot sector and that is about it.  Then he reads complete
tracks into memory and that's about it.  Chuck not only wants to completely
avoid Linux, Windows, and DOS, but as much of BIOS as he can too.

Are you using some canned boot routines like from BSD or an RTOS or DOS or
did you write a primitive level boot routine like Chuck has done for the
Pentium and (of course) we do on MISC chips?  
 
> The resulting code is very nice. I was a bit apprehensive of the colors at
> first but using the color tokens is quite nice. The system is probably

One of the things about Aha is that it uses tokenized source.  This means
the compiler, editor, debugger, etc can do some new things.  I can do more
error checking at edit time because more of the traditional compilation
techniques happen then.  So errors can be caught when they are most
relevant and in the mind of the programmer and dealt with sooner.  Many
errors can be caught at edit time.  I can even implement a smart editor
that looks over your shoulder and helps write code.

But the editor or editors can be different than the compiler. I can
have a Machine Forth editor, a Color Forth editor, an ANSI Forth
editor, an HTML-ANSI editor, etc. that all use the same compiler
and same tokenized source.  The look and style of the source
code is factored separately from the compiler and internal
representation.

> around 9k and runs off a Floppy. It's this size because of the Pentium 32
> bit architecture and the setup tables you need for that processor to run in
> protected mode.

That sounds pretty good.  9K is very small compared to a Floppy and even
smaller compared to memory on Pentium Machines.

It sounds like you have a real stand alone boot system like Chuck.
It sounds nice.

> Some of the functions are probably not as atomic on the Pentium as ShBoom or
> your F21 but they are faster than ENTH was! My Uncle has been building small
> general purpose microcomputers currently with Motorola processors that run
> an ANS Forth of his. As soon as I can give him the outline of my Flux he
> wants us to implement it for the micros (ie - the 68HC11 model needs all the
> speed it can get!) and use the 80x86 system to talk to them.

I understand.  Sounds good to me.

> Anyway, thought you might like to know what your essays inspired! Also if
> there's anything you have found that would be interesting to put in that
> might make the system faster or simpler, I'm all ears! Thanks for your
> magnificent library of articles on the net.

If you have access to a relatively fast internet link Chuck's Color Forth
2000 presentation is currently playing in the streaming video theater.  I have not
transcribed it to HTML because it was long and had lots of writing on the
board.  It explains the changes in 2000 with Huffman Encoded characters
etc. that Chuck has been doing.  I am going to remove part 2 and upload
part 3 in a couple of hours.

I am excited about your project.  Will you make it PD or use it for commercial
work?  Will other people get to see the code?  Will you do a writeup and
explanation of it and post it anywhere so I can read it or post a link to it?

How much of the information you sent me on Flux are you willing to make public?  Chuck
would be happy to hear about it.  Other people would be interested.  But
you might not get too much positive feedback in c.l.f.  You might get
some interest from email if some information gets posted there however.

I am planning to do a presentation for FIG next month on Aha.  I will
emphasize compiler internals, show code, explain design decisions, and
talk about lots of stuff but not talk about Color Forth per se.  I will
want to show the tiny amount of code it will take to compile tiny source
into tiny object code at an incredible rate.   

I am going to be using about 20K for all the stuff I see coming in aha
and a few desktop applications. The upper limit compile speed for aha
will be about 100 million Forth source words per second.  For the board
and memories I will be using in the first test I figure the average speed
will be between 2 and 20 million Forth source words per second.  To compile
20K cells I might use 60K source words, just a guess at this point.  So far
the compiler looks like it will take between 100 and 200 cells.  So the
compiler should be able to compile itself in something like 30 microseconds.
The complete system (boot code, compiler, OS, GUI, editor, applications, etc.)?
I don't know, a few milliseconds.
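
The estimates above follow from simple arithmetic, sketched here in Python. The 3-to-1 ratio of source words to compiled cells comes from the "60K source words for 20K cells" guess in the email; the function name is hypothetical.

```python
# Ratio taken from the email's guess: 60K source words for 20K cells.
SOURCE_WORDS_PER_CELL = 60_000 / 20_000


def compile_time_seconds(cells, source_words_per_second):
    """Estimated time to compile `cells` cells of object code."""
    return cells * SOURCE_WORDS_PER_CELL / source_words_per_second


if __name__ == "__main__":
    cases = (("compiler, 100 cells", 100),
             ("compiler, 200 cells", 200),
             ("whole system, 20K cells", 20_000))
    for label, cells in cases:
        for rate in (2e6, 20e6):
            micros = compile_time_seconds(cells, rate) * 1e6
            print(f"{label} at {rate / 1e6:.0f}M words/s: {micros:,.0f} microseconds")
```

With these inputs a 200-cell compiler at 20 million source words per second comes out at 30 microseconds, and the 20K-cell system at 3 to 30 milliseconds depending on the rate.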

> Cheers,
> Sean Pringle

Jeff Fox

Subject: Re: Colour / aha / Machine (2)
Date: Fri, 01 Dec 2000 09:45:21 -0800
From: Jeff Fox 
To: Sean Pringle 

> Sorry, what does +* do exactly?

+* is a conditional-non-destructive add.  It is the multiply step instruction 
that Chuck wanted for multiplies in OKAD. It is the equivalent of:

: +* ( n1 n2 -- n1 n2 | n1 n1+n2 )
 DUP 1 AND IF OVER + THEN ;

If you wanted to multiply two 8-bit numbers to get a 16-bit result you would
shift N left eight times then execute a sequence of 8 copies of
+* 2/ nop nop

followed by push drop pop to clean up the stack.
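
The multiply sequence can be modeled in Python as a rough check of the idea. This is a sketch under assumptions: the multiplier is taken to start in T and the multiplicand in N (the email does not say which operand goes where), the function names are hypothetical, and the nops are left out because they do not change the stack.

```python
def plus_star(n, t):
    """Model of +* : if the low bit of T is set, add N to T. N is unchanged."""
    if t & 1:
        t += n
    return n, t


def multiply_8x8(a, b):
    """Multiply two 8-bit values using eight +* 2/ steps."""
    if not (0 <= a < 256 and 0 <= b < 256):
        raise ValueError("operands must fit in 8 bits")
    n = b << 8          # shift N left eight times
    t = a               # multiplier starts in T
    for _ in range(8):  # eight copies of: +* 2/
        n, t = plus_star(n, t)
        t >>= 1         # 2/ shifts T right one bit
    return t            # push drop pop: discard N, keep the product in T
```

For example, `multiply_8x8(3, 5)` steps T through 641, 960, 480, 240, 120, 60, 30 and ends at 15.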

If you put 1 in T and 2 in N then execute 

BEGIN .... +* -UNTIL
it will add 2 to the value in T a half million times until it
generates a carry.  DUP 1 AND IF OVER + THEN in 1 clock.  If
I used the same kind of advertising mips as other people
I might claim that F21 could thus perform seven ANS Forth
words in 1.2ns or 5600 MIPS. But I don't.
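
The "half million times" figure can be reproduced with a short Python model of the loop. This is a sketch that assumes a 20-bit cell with the carry as the next bit up; the cell width and the function name are assumptions, not something stated in the email.

```python
CELL_BITS = 20  # assumption: 20-bit data cell, carry is the bit above it


def adds_until_carry(t=1, n=2, bits=CELL_BITS):
    """Count +* steps until the sum carries out of a `bits`-wide cell.

    Because T starts odd and N is even, T stays odd, so every +* step adds.
    """
    limit = 1 << bits
    count = 0
    while True:
        if t & 1:        # the +* condition; always true here
            t += n
        count += 1
        if t >= limit:   # carry out of the top bit ends the loop (-UNTIL)
            return count
```

With the default arguments the function returns 524288, about half a million, and T never reaches zero along the way because an odd number plus an even number is always odd.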

> But yes. It was a bit of a toss up sometimes. I tried not to overuse the A
> and R registers in the Flux Kernel as this meant that certain words in the
> Kernel could be used without fear of their modifying those registers. In
> loops that were interacting with the user and time wasn't so critical, I did
> it the clumsy way!

I understand.  In P21Forth A was used as the IP.  It had to be saved and
restored in most Forth CODE words because of this.  We did something
similar at iTV.

I noticed that this was very different than Chuck's native code for
MuP21.  His most common sequence was  " ... # call" where a literal
load was in the same word as a subroutine call.  He tended to pass
arguments in A from word to word to word not as the IP but as
a parameter or pointer.  It was more assembler style than traditional Forth
style of coding.

I wrote 30 different compilers with lots of different optimization
techniques to explore the automated use of the A register.

> Impressive! How complex is this editor becoming? Do you feel it's more
> important to compress source code or interpret it more quickly? Realising
> that the first can sometimes lead to the second.

I haven't designed the editor (editors).  It will combine some of the features
of the machine Forth compilers and the simulator interface that uses code
decompile.  The editor is made more complicated and slower by moving
some compiler steps into edit time.  But editors are easy to do. And
moving things back, not forward in time is Forth style.

I once made a bet with a coworker when I worked at Pac Bell.  We had spent
a day installing an editor on 250 machines in our wing (7000 cubicles in
this office.) The employees were trained to use this editor to do certain
things in a class.  It was terrible overkill.  The product came in a
big box, needed 25 diskettes to do an install or upgrade, and was
as slow as you would expect for something so large.

I bet my coworker that I could write a knock off editor with all the
same keystrokes and functions that the users were trained to use
so that they could use my editor as an alternative to the big one.
I would leave out most of the advanced features that these users
were not trained to use and were not using and I would add a bunch
of useful features that the big product did not have but which
I would find useful.

Instead of 40 megabytes the editor I would write would be small and
fast.  Furthermore Pac Bell could give it to thousands of employees
rather than spending $300 per copy for thousands of copies of the
other editor (not that they wanted to save money on such things).

The bet was not whether I could write such an editor, the bet was
whether I could deliver it the next day.  That is the editor that
I have been using most of the time for the last seven years. I
too had been trained to use this other editor so it was easy 
for me to use and it had extra features I could use and yes I did it
that night and delivered it the next day.

The first editor in aha will not have a pop-up agent with my face in a window
and my voice speaking or holding the user's hand.  It will not have all
the bells and whistles that it will eventually have.  It will be a simple
source code editor that will also pack the source code tightly for the
compiler to use.  The initial one will be pretty small and simple.

I like the idea of making the editor slower and the compiler faster.  I only
type about sixty words a minute, or a few characters per second and often
there are gaps when I am thinking about something.  Editing code is not
just mindless typing without thinking.  So it makes sense to me that the
editor should be able to do more stuff at edit time including error
checking and various forms of code optimization, source code compression
etc.

I remember one of the iTV programmers telling me that he had added a new
hard disk to his system just to hold all the copies of Emacs that he needed.
He routinely used a half dozen different versions of the editor at different
times because they all had different features that he liked to use. So
he needed a gigabyte of disk space just for his editor(s).

I know that this is the way many people think.  They must use those tools.
They would not consider writing their own editor.  Writing your own editor
is very basic Forth thinking in my way of thinking.  It is a very personal
thing.  Chuck jokes about how basic and important an idea it is.
 
> Also do you do most of the work while the user types or whenever you store/save
> the source? 

Yes in the editor.  A translator could take ASCII source representation and
convert it into a tokenized representation also.  

I may make several editors including some with no keyboard.  For an F21
in a mouse I could make a mouse only editor using some unusual user interface techniques.
I also like the idea of making a completely visual editor that displays source
in a flowchart format.  But yes, even with a keyboard version there is time between
keystrokes.

Chuck executes or compiles each word individually rather than line by line.
In fact Chuck doesn't really have lines.  I will also go word by word rather
than line by line in aha.

> Mine 'prepares' each line of source as soon as you move to the
> next line. This means that I can point exactly the same 'accept' function at
> the editor or the terminal and there should be no noticeable delay at any
> time.

I know what you mean.  Traditional Forth design requires scanning
or parsing to the end of a line at compile time.  I will avoid all of that.

> I am not sure I can push my editing as far as making the words merely CFAs.
> The reason is (and I'll have a think about this!), Flux has a basic
> multitasker which I want to implement the 'Flash Tasking' on. That is to
> say, small transient tasks that get compiled and executed on the spot to do
> a simple job then erased. Having lots of small tasks that appear and
> disappear could mean I can't know exactly where or when they will be compiled
> at runtime and thus where the various CFAs are. I'm not sure. I haven't
> decided about the feasibility or use of this style of multitasking yet. It
> may be too complex a system and defeat Flux's purpose. It may end up too
> slow!

The CFAs are only set at compile time.  The slots for them, in the dictionary,
exist at edit time.  Direct pointers to CFA fields in the dictionary
as representation of defined words is fastest but requires that the
source code be at a known and fixed location in memory at edit and
compile times.  Relative pointers are a bit slower but would allow the
use of CFA field pointers in the source structure and still keep the source relocatable.

At compile time when a defined word is compiled the CFA field gets
set in the source and used to represent defined words from then on.
The editor will display these pointers as the strings in the defined
word record.  Since the CFA get set at compile time when you bind
to an actual compilation address I don't see that it would be any
problem for what you are doing either.

Following a direct or relative pointer to a CFA field that gets
set at compile time to compile a defined word is going to be so much faster
than any dictionary search mechanism that it is very attractive.  It
is also a form of compression at the same time, which is nice too.

One important concept in Forth is pushing things backwards in
time.  It is important to do things earlier and once rather than
later and many times.  This is what I am trying to do in the
refactoring of the editor and compiler.  It is what Chuck does.

> Everything is all my own code. And yes, some of it is fairly primitive! I
> have a simple boot sector which uses BIOS to do the initial single track
> read to get the system image. This is the only time BIOS gets used. Once the

Very nice.

> system 'lives' it has its own hardware interface routines though only the
> basics. (Much sweat, blood and tears given to get some going!) It knows
> nothing of any other Operating System! We were thinking about putting Flux
> or ENTH into a boot rom to replace BIOS on the PC but that hasn't happened
> yet :)

Very cool.  I made a Forth ROM for the first micro I ever owned.  It was the
first commercial microcomputer to have a boot ROM!  The models before that
used a single step circuit and a front panel to toggle in your boot 
code bit by bit whenever you started the computer. You could use this
to enter your loader and load more from a paper tape or cassette.

I have always felt that personal computers should boot up in a fraction of
a second.  It is rather pathetic that boot times have gotten longer and
longer as the processors have become faster and faster.  Hurry up and
wait.

I joke that the Windows interface is not a point and click interface.
Often I know what is going to happen when I click.  I point and click
then move the pointer to the next location I want which usually takes
a fraction of a second.  Then I wait, and wait, and wait for Windows
to respond.  The game I play is to try to have the arrow already
positioned over the next button I want or the next menu or whatever
so that when it eventually comes up I am already there.  I make it
a sort of game, it gives me something to do while I am sitting
there waiting for Windows to respond.  I am pretty good at being
way ahead of where Windows is in real time.   If I have to move
the arrow a little to actually position it over the new button
that I want I try to learn to anticipate Windows (batch like operation)
better next time.  It is kind of pathetic actually.

I remember the first time that I read that Microsoft said that
if you were running MS Word on your Mac you would need a 68030
because a 68020 was too slow to keep up with your keystrokes
if you were a fast typist!  I had thought that the 68020 was
a killer CPU and they didn't think it could keep up with my
typing!


> >Sometimes I forget that one still can only bring up these topics
> >in c.l.f at your own risk.  There is a list of people who will try
> >to make you look like an idiot should you mention any of these
> >"controversial" topics.
> 
> That's why I didn't even try.

I can understand. I liked the Poem that Chuck presented at Forth Day 2000
about coloring outside of the lines.  "Somedays I don't have the courage
for it at all."

> Yep, I've read the 1999 and 2000 Fireside chats. As I said, some sentences
> fairly carefully! I'll keep an eye on the aha pages.

Yes, you do seem to have not only grasped the concepts but dug in for
hands on experience and knowledge.  That is great.  Maybe Chuck and
I can learn something from what you discover in the process.  Maybe
other people will be able to learn something from it too.
  
> Both ENTH and Flux would be PD I imagine. I have no commercial application
> for them! Even so, Flux took about two weeks to write. I could just do
> another for commercial activities someday. Yes, I will do a writeup. If only
> for my own documentation purposes. I would certainly post it. You would be
> welcome to read and link to it! Yes others could see the code if they
> wanted. There's not that much code to see really!

cool.
 
> Wow, Chuck Moore hear about it? You know what that sounds like to 
> us mere mortals?
 
Chuck likes to hear about people who understand his work and try
their own experiments.  That is what Forth was like in the old
days before it became 'standardized.'  Chuck likes to hear about
other people experimenting with MISC code, Machine Forth,
Color Forth, and the associated concepts or other new ideas. 
So do I and so do a bunch of other people.

You may find that the MISC mail list is a more productive forum
for discussions of these kind of ideas than say c.l.f.  I know that there 
are people there who would be interested in your work on ENTH and Flux.
 
> It can all be public. Not worried on that score. It would be good to have
> others interested! I know personally only one other Forth Programmer and
> he's family! Mind you, I would rather people show absolutely no interest at
> all than stop to tell me why they have no interest, unless it's polite nasty
> comments!

"The teacher frowns.  The other kids call me dumb, or weird, or retarded."
I know what you mean.  Chuck knows what you mean.  But it goes with the
territory.  Not everyone wants to explore uncharted spaces.  You never
know if you will find something interesting or not.  There are no roads and no maps.

I guess I have tried to leave maps for other people and that Chuck has
also made an effort to do so.  But even so it is still a small group
of people who will want to go into the wilderness even with a map.
Even today, most people in the world never go very far from where
they were born.  Most people are not explorers or pioneers.  But
we just leave them behind and don't worry about them too much.
They have a right to stay home if they want to and think we are
crazy for wanting to explore new spaces.
 
> Those compile times you quoted are certainly 'before the finger leaves the
> key jobs!' I've not timed Flux yet but I don't expect anything like that.
> Maybe one day!

The biggest factor is completely eliminating dictionary searching at
compile time.  Look into what I said about the CFA mechanism above.
The upper limit I got for some code on an F21e in SDRAM is
compilation of about 120 million Forth words per second. (that
was NOT for compiling on 1000 nodes in parallel! hum, 120B Forth
words per second compilation speed...) The numbers I gave you 
were based on a low 2M estimate for F21d in DRAM on more average code.  

If you do the math on compiling a few hundred words of source 
when the upper limit is 120M Forth words per second you get very small numbers. 
Beating Windows by 100x or 1000x on a lot of things is not
very difficult.  Sometimes we can beat Windows apps by 10^6
or 10^9 on some things, that is more fun.

For this reason I often ask people to post units.  I see
the same thing expressed in seconds by some people, in
milliseconds by others, in microseconds by some, and
occasionally in nanoseconds.  When you talk to Chuck
about hardware, things move into picosecond and femtosecond
terms.  (Then there are the optical computing
and other exotic technology people who also use
sci-fi numbers, and some of them realize that Chuck's style of
designs would be a good fit to what they are doing:
single electron transistors, quantum tunneling transistors,
diamond semiconductors, optical transistors, etc.  Those folks
would not consider implementing a Pentium, but a MISC
chip is doable.  These were the futures that got me interested in
Forth machines fifteen years ago.)


> Once again, thanks for your interest. It's encouraging!
> 
> Cheers,
> Sean Pringle

You're welcome.  Thank you for your interest and your work on
similar ideas.  It is nice to exchange ideas with someone who
gets the concepts.  It is much better than dealing with the
people who have knee jerk defensive reactions or who put
their hands over their eyes.  

Jeff Fox

Subject: Re: Flux rewrite
Date:  Sun, 03 Dec 2000 22:10:02 -0800
From: Jeff Fox 
To:  Sean Pringle 

Hi Sean,

Sean Pringle wrote:
> 
> Hi Jeff.
> 
>  I wrote another version of Flux on Sunday. My coding productivity seems to
> have increased markedly using the single line definitions. I am having much
> more fun now! It's never been this fast! 

Good signs.  Impressive.  Another form of aha experience, and fun is important.

> Anyway, I implemented the CFAs in source code ideas you were explaining. I
> am not sure what format your source took exactly so I made mine up as I went
> along! I think what I have are what you called relative pointers. The actual
> source is a string of pointers (and color tokens, etc) that are offsets into
> a list of all words used in the source. The list entries look similar to the
> dictionary headers and contain both the word as a counted string and a CFA
> slot. CFA slots for kernel words are filled at edit time and CFA slots for
> newly defined words are left empty and patched at compile time as soon as
> the word is defined. It seems to be fairly quick.

That is exactly one of the ideas I considered and mentioned to you.  I plan
to implement a direct pointer compression structure, but who knows, I might
decide to change to a relative pointer to make it freely relocatable.
At the moment I am trying to get as the first target the fastest possible
approach and second the most compressed representation of source and
source/object combinations.
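
The source format Sean describes (offsets into a list of name entries, each with a CFA slot that is filled at edit time for kernel words and patched at compile time for new definitions) can be sketched in Python. Everything here is illustrative: the kernel addresses, the `NameTable` class, and the token layout are hypothetical and are not taken from Flux or aha.

```python
# Hypothetical kernel CFAs, known at edit time.
KERNEL = {"dup": 0x100, "+": 0x104, "*": 0x108, ";": 0x10C}


class NameTable:
    """One entry per distinct word: [name, cfa_slot]."""

    def __init__(self):
        self.entries = []
        self.index = {}  # name -> offset; consulted only at edit time

    def offset_of(self, name):
        if name not in self.index:
            self.index[name] = len(self.entries)
            # Kernel slots are filled now; new words start with an empty slot.
            self.entries.append([name, KERNEL.get(name)])
        return self.index[name]


def edit(text, table):
    """Edit time: replace each name with an offset into the name table."""
    words = text.split()
    tokens, i = [], 0
    while i < len(words):
        if words[i] == ":":
            tokens.append(("define", table.offset_of(words[i + 1])))
            i += 2
        else:
            tokens.append(("call", table.offset_of(words[i])))
            i += 1
    return tokens


def compile_tokens(tokens, table, here=0x200):
    """Compile time: follow offsets to CFA slots; no name search happens."""
    obj = []
    for kind, offset in tokens:
        entry = table.entries[offset]
        if kind == "define":
            entry[1] = here + len(obj)  # patch the slot once the word is defined
        elif entry[1] is None:
            raise ValueError(f"{entry[0]} used before it was defined")
        else:
            obj.append(entry[1])
    return obj
```

Because the tokens hold offsets into the table rather than absolute addresses, the token stream and table can be moved together without rewriting the source, which is the relocatability trade-off discussed above.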

So I plan to combine basically three main things.  First are compressed defined
words using CFA pointers to a dictionary as above.  Second are tokenized
words that are represented with only a few, five or six bits.  I would
prefer five and could do it but I think six is better, I will have
to profile real code.  Also real code will have a way of adapting 
to the best representation in the system as well so it is a little
hard to picture those details at this stage.  The tokenized Forth
words can be packed so they create smaller source and can be read
at about the same speed as CFA pointer words.  The third is the
packed opcode format.  This is for whatever percentage of the
code can be represented as packed opcodes without further
symbolic representation.  The rate for simply moving these
4 opcodes per word structures and/or removing them from
the source object is of course the simplest and fastest
thing the compiler can do.  It was the first code I wrote,
to recognize the opcode token records, get the count,
and transfer them from source to object.  That is what
sets the upper limit in aha on F21 at about 100,000,000
Forth words (opcodes) per second.  
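
The packed opcode path can be illustrated with a short Python sketch. It assumes 5-bit opcodes packed four to a 20-bit word, and a record layout of a tag, a count, and then that many packed words; the tag value, the layout, and the function names are hypothetical, since the email does not give the actual aha record format.

```python
OPCODE_BITS = 5        # assumption: 5-bit opcodes
OPCODES_PER_WORD = 4   # "4 opcodes per word" as stated in the email
OPCODE_RECORD = -1     # hypothetical tag marking an opcode token record


def pack_opcodes(opcodes):
    """Pack four opcodes into one word, first opcode in the highest bits."""
    if len(opcodes) != OPCODES_PER_WORD:
        raise ValueError("expected exactly four opcodes")
    word = 0
    for op in opcodes:
        if not 0 <= op < (1 << OPCODE_BITS):
            raise ValueError("opcode does not fit in five bits")
        word = (word << OPCODE_BITS) | op
    return word


def unpack_opcodes(word):
    """Inverse of pack_opcodes, used by an editor or decompiler for display."""
    mask = (1 << OPCODE_BITS) - 1
    return [(word >> (OPCODE_BITS * slot)) & mask
            for slot in reversed(range(OPCODES_PER_WORD))]


def copy_opcode_record(source, pos, obj):
    """Recognize a record, read its count, and move the words to object code.

    Returns the source position just past the record.
    """
    if source[pos] != OPCODE_RECORD:
        raise ValueError("not an opcode record")
    count = source[pos + 1]
    obj.extend(source[pos + 2:pos + 2 + count])
    return pos + 2 + count
```

The compile step for such a record is a plain block copy with no parsing or lookup, which is why it sets the upper bound on compile speed.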

The other thing that is nice about this 100M Forth word per second compilation
is that it only takes a half dozen words of code to do it.  That
sort of thing sort of has to be that way, the fastest
routines must be short ones.  Sometimes computed jump
code arrays are fairly big.  It is the technique
that Chuck and I have both used for inlined pixel transfer
code in the GUI.

I could leave out the function tokens and do aha with
just CFA pointers for everything.  It would be slower
but slightly simpler, and it would cause some of the source code to be 3 or 4 times larger.

BTW like code tokens the representation of everything
can be done without redundant representation after
compilation.  You can throw away the CFA pointers 
for defined words and only keep a copy of the
dictionary of CFA and counted strings so that
defined words can be displayed and edited by name
in the editor or debugger.  You need only keep
some marks to where things got moved for opcode
token records so you can further compress the
source representation in memory or on storage
once it is linked to object code.  The entire
thing, all Forth words, comments etc. are
still there and look the same in source view.
But once again this is a feature of the system,
that source and object code are incredibly small
for linked source level debugging and the tools
are also incredibly small and fast.

The functional tokens will use a simple jump table so
they only take a few memory accesses and are almost as
compressed in representation as the opcode tokens.  The
defined word pointer token records will also be able
to be processed very fast and compiled very fast.  I
don't have actual performance numbers because I haven't 
written the code yet.   You are ahead of me already in your
implementation.  That's nice.
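
A jump table for function tokens might look something like the following Python sketch. The three tokens, their handlers, and the tuple-based object code are all hypothetical; the only point illustrated is that a small token value indexes straight into a table of compile actions.

```python
def compile_begin(state):
    """Remember where the loop starts; emits no code."""
    state["marks"].append(len(state["obj"]))


def compile_until(state):
    """Emit a conditional backward jump to the matching loop start."""
    target = state["marks"].pop()
    state["obj"].append(("jump-if-zero", target))


def compile_exit(state):
    """Emit a return."""
    state["obj"].append(("return",))


# The token value is the index into the table.
T_BEGIN, T_UNTIL, T_EXIT = range(3)
JUMP_TABLE = [compile_begin, compile_until, compile_exit]


def compile_function_tokens(tokens):
    state = {"obj": [], "marks": []}
    for token in tokens:
        JUMP_TABLE[token](state)  # one indexed lookup, no name search
    return state["obj"]
```

A table of a few dozen entries keeps each token to five or six bits in the source while costing only an indexed fetch and a call at compile time.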
 
Speaking of GUI, Chuck has a background task to update the
graphic display.  It was running at something like 35 to 70 Hz.
That means the actual screen would be repainted 35 times a
second by the CAD software or anything else running in 
Chuck's Color Forth.  He recently got the new graphics board
programming working and said that OKAD (and therefore Color Forth
too I guess) will be in hardware assisted 3D.

> Does this seem to resemble your idea of things? I get the idea it might be
> still somewhat more basic than aha's plans as it still simply builds
> threaded code for the main part whereas aha would have a significant number
> of opcodes directly coded into the source and coded inline by the compiler?
> And more?

Yes those are my ideas too. CFA pointers for defined words, and function
and opcode tokens are some of the things I am doing and a few more.
aha is certainly about recognizing the match when you see it and saying
"aha!" I had the aha experience about the representation of defined words
with CFA pointers and a dictionary in the source code.  I had other aha
experiences regarding the way I have Forth opcodes that correspond
so simply to source code in some cases and that I can also use a
function table as a third compressed source/fast compile mechanism.

I have been working with Machine Forth in many different forms for years.
I have tried it many ways but have only been watching and
reporting what Chuck was doing with Color Forth and new ideas there.  
Machine Forth is pretty old stuff to me now after ten years.  I wanted to 
try new ideas and I knew the goals. 

Machine Forth programs on F21 are remarkably small and fast.  Compiling
them is relatively fast with a conventional compiler but memory limitations
are a factor when you start talking about source code/compiler on flash
on these potentially very small systems.  I had been working with a lot
of tools with debugger  and symbolic debugger interfaces but had not
done one with complete link to full source code including comments
on the target system mostly because of size.  ASCII string source
representation takes up too much space that you want for more programs
and compressed bitmap images, sounds etc.

I knew what Chuck was doing with tokens was potentially much faster
than traditional Forth compilation techniques so I examined very closely
what exactly Chuck was doing, what I was doing, what I wanted to do,
what all the possible pieces might be, and how they might fit together
and I said, "aha! I've got it. I can see how to compress the source
code way down and compile at fantastic speed at the same time and
several different ways too!"

Yes, I plan to do more than just the CFA pointer and dictionary in source
representation as a source code compression and fast compiler technique. 
That is part of aha.  I may get to something very much like what you are
now doing or that part might just be a little different in my implementation.
It will be most interesting to compare those kinds of details.  I'll bet
we sort of have to do a lot of things in a similar way, but there are
many details we have not discussed, and many design decisions one can
make about implementation.  And you are on a Pentium
and I am not.  That is a big difference in itself.  You are doing
something more like Color Forth that way and with an aha compiler
feature of using CFA/String pointers to represent source words.

It is clear to me that Chuck is looking for things that work well
on his chips, like F21, but is also making decisions about what
maps well and what works well on the Pentium.

> I think I am starting to run into the Pentium's non-MISCness.

I am sure you must have to confront a lot of that.  I know there are
a terrible number of details on a Pentium to deal with.  I can only
admit to Intel assembler up to 8086-80286, and after the 386 I stopped
looking at assembler details the way I had before.  I was very focused
on studying various Forth chips.  But I know enough about Intel chips
to follow people who work with them at that level for a living, or
as Chuck does, when they talk about all the things
that you have to consider to make wise choices in coding.

I tell people that I feel sort of spoiled that way.  I mean things
have just been so simple whenever I have done coding on these 
things.  I hear Chuck complain about Intel headaches and
other people brag about how they understand the complex details.

> My editing routines still form a line-editor as opposed to a word-editor.
> This allows me (and my habits!) to work on a single line of source using
> conventional editing styles like single character backspacing, etc. The line
> of source is converted into pointers and list entries as soon as I move to
> the next line. This could result in an editor something like the F83 line
> replacement style I suppose though I've yet to find out!

I haven't designed my editors yet for aha.  I have various user interfaces
for the various editors I have done.  Chuck has talked about the details
of his editor in Color Forth.  It is a full screen editor but the focus
is the center of the screen where you edit.  It only moves left and right
and only on word boundaries.  Changing a word means replacing it.  He
has a very simple and primitive interface.   I expect I will start
with something similar but a little more full and conventional. I
may not even make the first one a Color Forth interface.  As I say
I haven't decided on what editors I want.

I have thought about one for a mouse only.  Well two for mouse only.
Thumbwheels for opcodes always worked nicely. Function tokens might
work the same way.  I had thumbwheels, as it were, in the first simulators
and wrote them before I saw how Chuck did the same thing in the OKAD
chip code editor.  As he said it works nicely when you only have five
bit opcodes but turning an instruction wheel on a Pentium does not
work very well ;-) 

I have various editors for full screen text editing, but mostly have
thought of ways to do it without a keyboard as well.  Using graphics only
and thumbwheels and pull down menus and pop up virtual keyboards with smart
dictionary assisted word completion.  And all of that is independent
of whether you make it look like Machine Forth or Color Forth.  Well
an interface that looks like graphic flow charts and symbols would be
quite different and it is the other weird mouse only interface I have
thought about.

Maybe I shouldn't be thinking about all the editor variations I could
do at this phase.  I need a compiler, then I need source code for
the compiler.  That means a translator or editor and I am sure I
will just start with something based on something I have rather
than start with the final choice of editor.

I will first target compile the aha compiler using Machine Forth.  Then
produce an editor to pack aha source code.  Then I can
bootstrap an editor as aha source code and metacompile changes
to aha and the OS and GUI as originally intended.  So having the
ideal editor on the target is still down the line a ways.

> The resulting source code has no 'line' structure though, so the compiler
> just runs straight on through, word to word, without needing to scan for the
> ends of lines. I think though that I have enough information in the
> structure of the source to enable a fully fledged editor to reformat it back
> into the original lines. A compromise!

It was a requirement for my design, fully recoverable, no loss of anything
considered source.  The only "loss" is that you won't be able to edit
the source with the Ascii text editor that you are using now. But as I
say, you are not using an Ascii text editor on F21 right now.  So there
is no real loss at all.
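
To make the "fully recoverable" point concrete, here is a minimal sketch in Python.  It is not the Flux or aha format; the NEWLINE marker, the list based dictionary and the function names are assumptions made for illustration.  Words are stored as indices into a dictionary, a compiler can simply ignore the line markers, and an editor can rebuild the original lines from them.  This sketch normalizes runs of spaces to one space, which a real format would have to record if spacing counts as source.

```python
# Hypothetical packed-source format: each word becomes a dictionary index,
# and a reserved NEWLINE marker preserves the original line structure.
NEWLINE = -1

def pack(text, dictionary):
    """Convert text to a list of dictionary indices plus NEWLINE markers."""
    packed = []
    for line in text.split("\n"):
        for word in line.split():
            if word not in dictionary:
                dictionary.append(word)      # new word: add a dictionary entry
            packed.append(dictionary.index(word))
        packed.append(NEWLINE)
    packed.pop()                              # no marker after the final line
    return packed

def unpack(packed, dictionary):
    """Rebuild the text, one line per NEWLINE marker."""
    lines, current = [], []
    for token in packed:
        if token == NEWLINE:
            lines.append(" ".join(current))
            current = []
        else:
            current.append(dictionary[token])
    lines.append(" ".join(current))
    return "\n".join(lines)

dictionary = []
text = ": square dup * ;\n: cube dup square * ;"
packed = pack(text, dictionary)
restored = unpack(packed, dictionary)
```

Each distinct word is stored as a string only once, in the dictionary; every later use costs one index in the packed source.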

Of course people who must edit with Word or Emacs or some other thing
like that may complain that you are not using normal ascii source.
Right.  I can complain that they do.  But I don't need to, I can just
show them that I have all the features I want:  small size, high
compilation speed, simplicity, the ability to write an editor
that does what I want etc.   The fact that I can't use Emacs I
consider a design win. ;-)

> I think I'll keep this line-editor setup for a while to see if it works. It
> would be interesting to see a word-editor though, as well. Maybe I should do
> a Flux with one. Have you used a word-editor a lot? Is it comfortable?

I am used to either an ascii text editor or an opcode/hex editor.  I will not
lose anything in aha it will just be easier and better when I don't have
to deal with these other constraints and can pick the constraints that
don't feel too constraining to me. :-)

As a bare metal programmer I have been known to be comfortable with 
a one line block line editor.  I have used the one liner to write a
windowed editor with touch pad control etc. so I may do something
similar and write a series of editors with increasing sophistication.

> What I wanted to ask you was:
> If I write this up and put it on my website, do you mind if I quote yourself
> and Chuck as sources of inspiration, link to your website and explain how
> your ideas influenced mine or were directly used?

No, not at all.  Please do.   I am excited to hear about how much Flux
is like Color Forth and aha.  You know what they say about imitation
and flattery.  I was thinking of even putting some of our email conversations
onto a web page at my site because I think other people would be interested
and my site does get visited about a hundred times a day.  

If you don't mind the idea I will be sure to let you review the edited
version of the conversation before I would post links to it that would
allow other people to see it.

There are also people in the MISC mail list who would love to see
a copy of some of our conversations about the design of Color Forth,
aha, and Flux.

> Doesn't matter if not. I'm
> not really sure what to write yet. I don't really expect a lot of interest
> but I don't want to direct yet more flak onto you because I've expressed my
> ideas ineptly! Pointers?

I am sure I will be very happy with your explanations of the details of
what you are doing.  There is a lot of it that has to be different than
my implementation.  It is great that I will be able to look at what you
tried and what you did and how it worked.  More experience to learn
from.
 
> How do you come up with the times and figures for the F21 compiling? ie 120M
> words per second. Can you do this because the F21 is your processor and you
> know the clock cycles per instruction? 

Yes, and because I have already written the code and it is so easy to calculate
the timing of code on these simple machines.  You count the memory accesses
and add the memory access times, you add the time of any opcodes that
delay the prefetch of the next instruction (prefetch only applies to 
linear code and is delayed by any memory access instructions in the word).

So with the fastest memory possible I can pump words from source to
executable object at about 30M words per second.  Each word represents
up to four Forth language words, so compiling 30M cells means compiling
a maximum of 120M Forth source words represented by those opcode tokens.
Defined words are not compressed as much or as fast.  Function token
words are compressed almost as much but not as fast either.  
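
The arithmetic above can be sketched in Python as follows.  This is an illustration, not F21 or aha code: the slot order (first opcode in the high bits) and the zero padding of unused slots are assumptions, while the five bit opcode size, the four opcodes per memory word and the 30M cells per second figure come from this discussion.

```python
OPCODE_BITS = 5        # five-bit opcodes
SLOTS_PER_CELL = 4     # up to four opcodes per memory word

def pack_cell(opcodes):
    """Pack up to four 5-bit opcodes into one cell, first opcode in the high bits."""
    if len(opcodes) > SLOTS_PER_CELL:
        raise ValueError("too many opcodes for one cell")
    cell = 0
    for op in opcodes:
        if not 0 <= op < (1 << OPCODE_BITS):
            raise ValueError("opcode does not fit in five bits")
        cell = (cell << OPCODE_BITS) | op
    # Assumption: a short group is left-justified and padded with zero slots.
    cell <<= OPCODE_BITS * (SLOTS_PER_CELL - len(opcodes))
    return cell

def unpack_cell(cell):
    """Return the four opcode slots of a cell, high slot first."""
    mask = (1 << OPCODE_BITS) - 1
    return [(cell >> (OPCODE_BITS * i)) & mask
            for i in reversed(range(SLOTS_PER_CELL))]

# Burst rate: every cell copied can stand for up to four source words.
cells_per_second = 30_000_000
burst_words_per_second = cells_per_second * SLOTS_PER_CELL
```

Because the opcode tokens are already the object code, "compiling" a full cell of them is a copy, which is why the burst figure is simply four times the cell rate.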

So it would be truthful to say the maximum "burst rate" of the compiler 
is about 120M Forth source words per second.  Slower memories, and
more defined words and even comments will slow things down.  But
even if I use a single string of records for everything the compiler
just jumps over comment strings with a count.  It doesn't parse them by
character or anything that slow!  A source level debugger/editor also needs
to link, display and edit comment records.  But they don't slow the
compiler down much.  I don't know what the actual speed will be
but based on the fact that the object code is so damned small
and much of it often is unambiguous opcode sequences that can
be represented by the opcodes themselves as tokens and that
the other types of records like CFA pointers for defined words
will also be fast and the records are small and amount of
object code that does the job is small everything should
be ridiculously fast.  I figured 2M words per second for
an initial guesstimate overall.  Then figure how much source
code it takes me to generate real code.  The F21 in a Mouse
demo, that has been confused for Windows, is 600 words of
object code total.  The boot code, the OS, the GUI, the
graphics code and application is 600 cells total code.  And I 
didn't try to squeeze it or anything, just keep it simple, and
it is just a simple GUI desktop. ;-)  Now add a hundred
words for the aha compiler.  How long should it take
for aha to compile itself or one of these tiny
systems?  However you figure it the numbers are ridiculous.
At say 2M source words per second to get 100 object words?
It is not like I am compiling megabytes of code from
slow OS Files or dealing with brain damaged API or watching
virus protection software slow things down another 10x or anything
like that like other people have to deal with.  On boot some stuff 
comes from ROM or FLASH into memory.  Then bang, in a few microseconds
you compile whatever you want.
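
The remark about jumping over comment strings with a count can be sketched like this in Python.  The record layout (a type, a count, then the payload) and the three record types are hypothetical, chosen only to show that a counted comment record costs the compiler one addition and no per-character work.

```python
# Hypothetical record layout: [type, count, payload...], stored in one flat list.
OPCODES, DEFINED, COMMENT = 0, 1, 2

def compile_records(source):
    """Walk a flat list of records; comments are skipped using their count."""
    obj = []
    i = 0
    while i < len(source):
        rtype, count = source[i], source[i + 1]
        payload = source[i + 2 : i + 2 + count]
        if rtype == OPCODES:
            obj.extend(payload)                       # opcode cells copy straight through
        elif rtype == DEFINED:
            obj.extend(("call", addr) for addr in payload)
        # COMMENT: nothing is emitted and no character is examined
        i += 2 + count                                # one jump over the whole record
    return obj

source = [
    COMMENT, 5, *b"hello",          # five comment characters, never inspected
    OPCODES, 2, 0x12345, 0x0ABCD,   # two cells of packed opcodes
    DEFINED, 1, 0x200,              # one pointer to a defined word
]
obj = compile_records(source)
```

An editor or source level debugger would read the same records and display the comment payload; only the compiler treats it as dead weight.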

> I can't seem to find exact times for
> Pentium instructions, people say that I can't because of the unpredictable
> effect of this processor's pipelines and predictions etc. I suppose any
> figures would have to be 'on average'.

There is a tick timer on Pentium.  It gives very precise timings of code
sequences.  If you don't have any interrupts causing unpredictable timing
from the pipeline stalls and cache misses that they will cause, you can
get simple and very precise timing from the machine.  I know Chuck
does this.  I know other people do it.  It is Pentium only, not
386 or 486.  If you have interrupts or multitasking causing things
to vary then you need to take a bunch of timings and do an average
and worst case.  Phil Koopman wrote a great article for the Embedded
Systems journal about determinacy and embedded systems and Intel
chips.  It was very illuminating.  The real problem for real-time
is those 1% profiles that take 100x as long as normal!  So the
shorter the sequence being timed the more variation that is possible.
But you can get very precise cycle timing on Pentium.  Someone
gave me an example in email this week.
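
The "take a bunch of timings and do an average and worst case" advice can be sketched as follows.  This is Python using the standard library's `time.perf_counter_ns()` as a stand-in for the Pentium's cycle counter; the function names are made up for the example, and on a multitasking OS the numbers will vary in exactly the way described above.

```python
import time

def time_samples(fn, runs=1000):
    """Time fn() repeatedly and return the list of elapsed nanoseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        fn()
        samples.append(time.perf_counter_ns() - start)
    return samples

def summarize(samples):
    """Report best, average and worst case; the worst case matters for real-time."""
    ordered = sorted(samples)
    return {
        "best": ordered[0],
        "average": sum(ordered) / len(ordered),
        "worst": ordered[-1],
    }

stats = summarize(time_samples(lambda: sum(range(100))))
```

Looking at the gap between "best" and "worst" is a quick way to see how much interrupts and other tasks are disturbing a short code sequence.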

> I would like to compare ENTH and Flux compiling similar applications and see
> what the new techniques can do!

Me too.  I think other people would too.  The aha documents have been getting
about 30 hits a day since I posted them.  That is an indication that people
are curious and checking it out.  They would love to see charts and
graphs and real numbers and real code.  Some of them will want to try
it themselves too.  You might be surprised.  There have been a lot of
people asking about getting a stand alone Color Forth.  The problem
is that Chuck's isn't portable, isn't documented, and isn't available.
And it compiles itself so they can't compile it themselves.
I got a copy of the first stand alone Color Forth disk to try on my
machine because it happened to have an ATI graphics card. But it
would not boot.

So there might be quite a few people who would want to get a copy
of Flux.  You might be quite surprised by the response.  Now I know
the only other person who is working with F21 is interested in aha.
But there are a lot of people who have Pentium PCs who have
expressed interest in a stand alone Color Forth on diskette for
a PC.  They have wanted Chuck to deliver one, but Flux might
be better for several reasons.  It should be more portable.  
Chuck considered writing for generic VGA but went
for very hardware specific ATI graphics card details and then
upgraded to a newer 3D graphics card.  Flux might be something
more people could run, and it might easily be available before
Chuck gets around to releasing a version of his stand alone
Color Forth.  Then he has the Huffman encoded weird character
set designed for OKAD etc. and you are using ascii.
 
> I am aiming to have Flux with sufficient tools running off the floppy within
> a few weeks so I can use it comfortably and productively with my laptop
> _without_ Windows on the harddrive!

PD is nice.  I have put the aha design document out in public as I wrote the
first code.  I figure I will include actual code at some point.  But there
is no F21 audience like there would be for a Pentium version.  I am not
focused on Pentium so I am very pleased that you are using Pentium but
exploring some of the same ideas that Chuck and I are experimenting with.

One idea to consider is to make ENTH and Flux low cost
products in the UT store.  Just to cover the cost of duplicating disks
and producing a little more user level documentation.  People might
prefer that over the stuff where people only document for their own internal
use.  With the exception of his presentations to FIG Chuck is
only documenting for his use or internal use at iTV.  I am the
person who has presented the info on the Internet.

Just a thought.  You certainly don't need to put Flux out as
something available in a store, or in the UT store but I would
want to put it there whether it is free or not.  But you
might like the idea.  I think some other people would even
if it is still mostly an experiment in progress.  That is
the nature of everything at UT more or less anyway.

But as I say we might kind of be surprised by the number of
people who might be interested.  If nothing else I would
like to make an html page at my site (with your permission)
about our conversations about Color Forth, aha, and Flux.
(nice phrase, I liked it more the second time. ;-)

the Fox

Subject: Re: Flux rewrite
Date:  Mon, 04 Dec 2000 19:58:01+1000
From: Jeff Fox 
To:  Sean Pringle 

Hi Sean,

Sean Pringle wrote:
> 
> Yes, I will have to think some more about more ways of compressing the
> source for Flux. Though I might run with this for now. After all, if I make
> the source yet smaller, all that will happen is the only first few tracks of
> all my Floppies will get worn out while the rest remain pristine!

:-)
  
> Tell you what! Flux uses a small fraction of the opcodes available for the
> Pentium. The rest I either don't understand (often!) or can get better
> performance from a few simpler ones in the right pipelines. As yet, I'm not
> convinced I ever want to be able to claim: 'I know the Pentium!'
> 
> >As Chuck said it works nicely when you only have five bit
> >opcodes but turning an instruction wheel on a Pentium does not work very
> >well ;-)
> 
> I think he might have that bit right :) I assume you've used FPC. It had a
> nice idea in the little window you could access at the terminal (by hitting
> an arrow key I think) that let you pick from a list of the most recent
> words. I found it useful at times.
> 
> >I see the fact that I can't use Emacs as a design win! ;-)
> 
> Amen. I like to think that what I am trying to write is a _stand alone_
> system. If I am using somebody else's text editor then he is defining my
> system, my comfort, the format of my source, the speed of my compiler ...
> and taking all my fun away at the same time!
> 
> >There is a tick timer on Pentium.  It gives very precise timings of code
> >sequences.
> 
> Ah yes. I think I even wrote some code to detect and run it when ENTH was a
> lad. I'll have a look :) Thanks.
> 
> >But as I say we might kind of be surprised by the number of
> >people who might be interested.  If nothing else I would
> >like to make an html page at my site (with your permission)
> >about our conversations about Color Forth, aha, and Flux.
> >(nice phrase, I liked it more the second time. ;-)
> >
> >the Fox
> 
> To tell you the truth, I've not thought of much beyond my own escape from
> Windows! But yeah, ok! There would certainly be good exposure at UT.
> Feel free to make the HTML page on these conversations. You'll have them all
> confused as I'm sure few have heard my name before!
> It makes UT a central repository as well which makes these ideas easy to
> find all in one place. I like the idea of the UT store though you might want
> to see if my code is worth a jot first!

Escape from Windows.  Sounds like a John Carpenter movie.  Escape from New York.
Escape from LA.  Escape from Windows. ;-)
 
> Speaking of GUIs and stand alone systems before, I have some reasonably
> reliable code to drive the floppy, keyboard, DMA, timers, IO etc. My Uncle
> works at a University and carries a copy of ENTH in his short pocket
> cunningly testing it on any unused (unwary?) machine :)
> The video for Flux is running in 80x25 or 80x50 VGA text modes. Up to the
> present, this has been fine and I can get enough colors for tokens like
> this. Do you know where I might get some information on driving the generic
> VGA graphics modes? My searching on the net must have been in all the wrong
> places. I understand you do some of this stuff with F21 boards and Chuck
> with his system? I wouldn't mind a simple GUI.

That sounds good.  More on that later, I have to go now.

Jeff