NAME

H3sm - Hohensee's 3-stack machine

DOCDATE

19991208-->20000710, H3sm version 1.3

UNDERVIEW; virtual machine description

H3sm is descended from the Forth programming language. It is first and foremost a demonstration of a new machine design, but is rapidly taking on practical abilities. The initial H3sm implementation has shell-like properties, and provides access to much of the functionality of the host operating system (Linux) via keywords for system calls, while being small and having no library linking. This explanation takes a from-the-machine-up approach, which I believe is the best way to look at programming, particularly in this case.

Most digital central processing units and programming languages for them have one stack structure that is implicit to the instructions of the machine. This stack structure is usually called the return stack. A stack is simply a LIFO, last in-first out. The last item placed on the stack is the item implicitly available to be taken from the stack. The typical return stack uses this characteristic to maintain coherence in a system that makes nested subroutine calls. The return addresses each subroutine returns to are pushed onto and popped from the stack by subroutine-calling operators, and the LIFO aspect of the stack keeps the proper chronology of what routine returns to what calling routine. The requisite PUSH and POP operations are innate to the programming language in question, or are innate to the subroutine-related instructions of the CPU in question. For example, the x86 "call" and "ret" machine instructions do push and pop operations on a return stack the CPU maintains in memory, in addition to their effects on the instruction pointer. These instructions are arranged so that from the context of any particular subroutine, the item it will see on the top of the return stack is the return address of its caller, so it knows where to jump to when it's done, with very little overhead.
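The call/return discipline described above can be sketched in a few lines. This is a hypothetical toy machine in Python, not x86 and not H3sm; the opcode names "call", "ret", "say" and "halt" and the run function are invented for the illustration.

```python
def run(program, entry=0):
    """Run a list of (op, arg) tuples and return what the 'say' ops recorded."""
    return_stack = []            # LIFO of return addresses
    trace = []
    ip = entry
    while True:
        op, arg = program[ip]
        ip += 1                  # ip now names the following instruction
        if op == "call":
            return_stack.append(ip)   # push: where to resume afterwards
            ip = arg
        elif op == "ret":
            ip = return_stack.pop()   # pop: the most recent caller comes first
        elif op == "say":
            trace.append(arg)
        elif op == "halt":
            return trace

program = [
    ("call", 3),        # 0: main calls outer
    ("say", "main"),    # 1
    ("halt", None),     # 2
    ("call", 6),        # 3: outer calls inner
    ("say", "outer"),   # 4
    ("ret", None),      # 5
    ("say", "inner"),   # 6: inner
    ("ret", None),      # 7
]
print(run(program))     # ['inner', 'outer', 'main']
```

Each "ret" finds its own caller's address on top of the stack without any bookkeeping beyond the push done by the matching "call".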

A Forth, either in emulation on a one-stack machine or as a silicon Forth engine, has the same type of return stack for the same purpose, plus a separate data stack. Another example of a two-stack machine emulator is Postscript. Forths exist as emulators and as two-stack CPU chips. The author of Forth, Charles Moore, builds chips with two stacks plus an address register. H3sm has a return stack, a data stack, and a pointer stack.

One very confusing aspect of this to people familiar with arbitrary stack structures written in various languages is the fact that stack-machine stacks are not written in the language in question, they are its structure. This is as different as a hammer and a nail, but which is the hammer and which is the nail can be hard to keep track of. The three stacks in H3sm are implemented by, and are implicit to, the instructions of H3sm. Other stack structures may be created, but are not innate to the instructions of the system. For example, H3sm's + operates on the data stack. You don't need any other instructions or qualifiers for it to pick the data stack; it's hardwired to it. For a stack you create using a programming language, as opposed to as part of the structure of the language itself, you have to write all the stack operators also, and reference those operators to the correct stack somehow. Avoiding this confusion is why I did the initial H3sm in C instead of Forth.

One undeniable aspect of stack machines is excellent code-density. This is why Java uses an emulation of a parameter stack on its return stack, although Java is not a 2-stack machine. On a true 2-stack machine the data stack provides a general and implicit place to get and put things between routines. A stack gives a more compact means of passing parameters than an array of registers. With registers, for each value to be passed between routines you have to specify which register holds the value, in the actual instructions of the machine. In Java the code-compactness of a stack is limited to within an object. The data stack of a Forth also provides added flexibility in how routines can obtain their parameters, and from whom.

H3sm's return stack is basically the same as Forth's, which is not that different from most other languages. Forths don't usually do "stack frames" because having a parameter stack reduces the need to keep data on the return stack, at the cost of some "stack dancing". H3sm's pointer and data stacks are a division of the functionality of the Forth parameter stack. H3sm divides data into data and pointers. The pointer stack is composed of cells of the size of an address of the machine, 32 bits on x86.

The design of H3sm takes note of a fundamental difference between pointers and arbitrary data. Pointers are the same size as return stack cells, which are actually pointers also, but to executable machine code rather than arbitrary data. H3sm data items are variable-sized. The H3sm data stack has a "Size" register affiliated with it. The data stack is composed internally of bytes, but data stack items are manipulated in groups of bytes at the current Size. Most data stack operators are vectored by Size. For example, if the Size register holds 8, then "+" will add the top two data stack items of 8 bytes each, and leave an 8-byte result. Size may be from 1 to 256 bytes. An integer of some particular Size, i.e. a H3sm data stack item, is called a "pyte". I suspect there may also be advantages to the H3sm model vis-a-vis hardware designs with different sizes of data busses, and/or very wide data busses.
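A minimal Python sketch of a Size-vectored data stack may make the pyte idea concrete. It is an illustration under stated assumptions, not H3sm's C source: the class and method names are invented, items are stored little-endian as on x86, and any carry out of the top byte of a sum is assumed to be discarded.

```python
class DataStack:
    """Byte-addressed stack whose items are `size` bytes wide (one pyte)."""

    def __init__(self, size=4):
        self.bytes = bytearray()
        self.size = size          # stands in for the Size register, 1..256

    def push(self, value):
        # Keep only the low `size` bytes of value, stored little-endian.
        mask = (1 << (8 * self.size)) - 1
        self.bytes += (value & mask).to_bytes(self.size, "little")

    def pop(self):
        if len(self.bytes) < self.size:
            raise IndexError("data stack underflow")
        item = self.bytes[-self.size:]
        del self.bytes[-self.size:]
        return int.from_bytes(item, "little")

    def add(self):
        # "+": consume two pytes of the current Size, leave one pyte.
        b = self.pop()
        a = self.pop()
        self.push(a + b)          # assumption: overflow is simply dropped
```

The same add works unchanged at any width: with size=1, pushing 200 and 100 and adding leaves 44 (300 modulo 256), while with size=8 the result is an ordinary 8-byte sum.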

The H3sm pointer stack is actually more like the Forth data stack than the H3sm data stack is. This was a surprise to me as H3sm developed. However, logical flags for conditionals are typically kept on the H3sm data stack, and arithmetic operators for the pointer stack are simple address-arithmetic actions. This means that a Forth can not be implemented trivially with a subset of the H3sm instruction set. That means that H3sm is not a Forth, although I hope the family resemblance remains evident.

The sized pyte stack means arbitrary H3sm routines can be written without deciding until the invocation time of the particular routine how big an integer is. This also means that most routines can be used at 256 different integer sizes. A drawback of stack machines is what is known affectionately as "stack-dancing". Parameters are often not on the stack in the order a particular routine expects them, and stack manipulations simply to re-arrange may be required. This impinges on the programmer, and has adverse effects on low-level code. It is hoped that dividing the Forth data stack into addresses and data will reduce stack-dancing, or at least allow the programmer to get more done with the same gyrations. It bears mention that even given the low-level cost of stack-dancing, the information-hiding of a stack versus an array of registers is a win that well offsets the performance loss of stack-dancing.

INVOCATION

The default program name of H3sm is H3sm or H3m. Simply typing H3sm will put you in the interpreter. "quit" exits. H3sm takes all arguments passed to it by the operating system, i.e. its commandline arguments, and interprets them as normal H3sm input; there are no commandline-specific "switches". In other words, the H3sm commandline is interpreted as regular H3sm interpreter input. As an example, here's a shell/H3sm one-liner to get a simply formatted H3sm words listing in conjunction with the unix fmt command. In this example "words quit" is interpreted as if run from the H3sm interpreter...

:;H3sm 100 words quit | fmt
0-terminate  woof  words  ->C  'p&  '&  '  (O)  tab  to4  to3  to2  to1
to:  token  this  snug  step  spaces  ;  resolve  refill  .name  .chars
.5  .string  .r  .p  p2+s  p2dup  open  name  -  link  ifbyte$=  if$=
ifpending  ifthreader  ifhexnumber  ifatom  ifabbreviation  ifword
interpret  ifchar  hexnumber  H3m  <-code  finish  dpdump  dump  cr
:  blank  bench  beep  ascii->digit  allot  abs  &p  &pliteral  &word
&literal  &  tb  toklen  thispointer  source  latest  hexdigits  epa  dp
digits  delimiter  buffer  block  <--THREADS  waitpid  write  unlink
truncate  symlink  signal  special  seek  stat  todir  resize  rename
read  quit  parent  pipe  pause  permit  nanosleep  mount  makedir  ioperm
ioctl  link  fork  dupFD  close  <--SYSCALLS  zero  yes  XOR  x  value
upshift  2pdrop  true  ->p  ->link  ->code  ->r  *  sub  !  s->p  s->r
setSize  swap  r->s  r->p  r->  rpcopy  rdrop  return  pup  pswap  p!
p->s  p->r  p->  p-1  p-c  p+c  p+1  p+s  p+  pover  +  pliteral  pdup  p@
pdrop  OR  one  onbits  over  no  negate  NOT  -?r  move  maskbyte  min
max  literal  ip+  ifpositive  if=  ifcontents=  if  H3sm  halve  gap
getSize  @+  @  false  ell  emit  doHNC  double  downshift  dup  drop
cells  bcellsd  bbytesd  byte/mod  bytes  bump  align  aint  AND  address
:; cLIeNUX /dev/tty3 r 15:19:53   /source/core/H3sm
:;

WORDS

Programming is done by defining new words, i.e. extending the dictionary using "defining words". The words keyword lists the dictionary from most recent word to "oldest". Words in the default H3sm dictionary are in four groups, and have no ordering dependencies within those groups. "Atoms" are the first group to be coded, and define the H3sm virtual CPU. Internally, the atoms are coded first, then the syscalls, and then the rest of H3sm is coded in terms of those words. Although the outer interpreter doesn't exist until H3sm is built, the inner interpreter linking of all words in H3sm is the same internally as it is to words the user may define later via the complete system. The main defining words in H3sm have the same names as their Forth predecessors, : and ;.

atoms

primitives

Atoms are the atomic machine instructions of H3m the virtual machine. There are two kinds of atoms; primitives and syscalls. The primitives implement the H3sm CPU, and the syscalls provide peripheral services via the host OS. This is vaguely analogous to the distinction between C itself and libc, although H3sm doesn't use libc.

The initial H3sm implementation is on top of GNU cc, GNU assembler directives in C asm statements, and Linux, with some few x86-specific assembly language instructions where they can not be avoided. That is, the machine this H3sm runs in emulation on is the virtual machine implemented by GNU cc with GNU extensions to C, assembler directives, on Linux, on x86. The file "<C" in the H3sm source docs discusses the C/asm idiom used to be as un-C-like as possible.

H3sm abstraction
GNU cc---Linux abstraction
x86-hardware

There are five virtual registers in H3sm; ip, rsi, Size, dsi and psi. H3sm instructions, "words" in Forth parlance, operate on those virtual registers implicitly, explicitly, or implicitly on the three stacks they implement. ip is the instruction pointer, Size is the current effective size of data stack items, and dsi, psi and rsi are the data, pointer and return stack indexes. Explicit access to the stack indices is not provided. There are therefore no DEPTH equivalents. I use "stack index" rather than "stack pointer" to refer to the three main ones, since H3sm has an overabundance of things that start with p.

ip, rsi, dsi and psi are address-cell sized ints. Size is a byte. Explicit operators are limited to the Size register... setSize, getSize, double, and halve. The rest of the words in H3sm have various implicit effects on the H3sm virtual registers, and on the contents of the stacks.
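The glossary entries for double and halve describe Size wrapping around at both ends (256 doubles to 1, and 1 halves to 256). The Python sketch below models only that wraparound, under the assumption that Size is one of the powers of two from 1 to 256; what the real atoms do for other Size values is not stated here, so the sketch rejects them. The function names mirror the H3sm words but the code is illustrative only.

```python
VALID_SIZES = [1, 2, 4, 8, 16, 32, 64, 128, 256]

def double(size):
    """Return the next Size up; 256 wraps around to 1."""
    if size not in VALID_SIZES:
        raise ValueError("sketch only covers power-of-two sizes 1..256")
    return 1 if size == 256 else size * 2

def halve(size):
    """Return the next Size down; 1 wraps around to 256, so 0 never occurs."""
    if size not in VALID_SIZES:
        raise ValueError("sketch only covers power-of-two sizes 1..256")
    return 256 if size == 1 else size // 2
```

Under these assumptions the two operators are exact inverses, and nine doubles starting from any valid Size return to that Size.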

syscalls

I assemble a Linux/GNU/unix "distribution", cLIeNUX. The H3sm demo implementation assumes a Linux host. H3sm has a number of Linux syscalls implemented as atoms, but they are sequestered somewhat in the dictionary. Host services that are necessary to have a H3sm at all are the ability to read and write to the console display and keyboard, and to exit H3sm. That is accomplished with syscalls in the H3sm atoms emit, readstdins and quit. Porting H3sm to other than x86 Linux would involve modifying at least those atoms.
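As a rough picture of an atom that is nothing more than a host service, here is a Python sketch of an emit-like function. It assumes a POSIX host and uses os.write, which is a thin wrapper over the write system call. Unlike H3sm, Python reaches that call through the C library, so this shows only the shape of the atom, not how H3sm issues the syscall. The fd parameter is an addition for the illustration.

```python
import os

def emit(pyte, fd=1):
    """Write the least significant byte of pyte to fd (1 is standard output)."""
    os.write(fd, bytes([pyte & 0xFF]))
```

Calling emit(0x44) would print a D on the console; only the low byte of the value is used, matching the glossary's "ascii print lsB".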

H3sm does not use libc. H3sm is intended to be distinct from C, so libc is avoided, so that H3sm can be as self-sufficient as possible. Syscalls are hardcoded into atoms and syscall words. The syscalls I will include tend to be the ones of use on a single-user Linux box, and may not include all the fancy access controls available from a unix. Otherwise, H3sm provides a powerful set of host functions, and does so in a very small executable for a stand-alone written-in-C program.

threads

general

The atoms define the fundamental H3sm virtual machine; the H3sm CPU, so to speak. The syscalls extend that definition to include Linux host facilities, providing the H3sm CPU with peripherals needed for a useable system. Above the syscalls in the dictionary are thread words. Thread words are implemented using the virtual machine thus defined, entirely in terms of the instructions and level of abstraction thus defined. Thread words are strands of instructions and data for the H3sm virtual machine, and the C level knows nothing about them other than that they take up space. The main jobs of the H3sm built-in threads are to implement an interpreter, an interpretive thread-creation mechanism, i.e. a Forth-like "compiler", and some H3sm self-investigation tools such as words, dump and dpdump. To the extent that a Forth definition threader is a `compiler' the H3sm threader is more like an assembler.
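How a thread of cells gets executed can be sketched as follows. This Python toy borrows the idea from the glossary entry for sub, where a sub cell precedes the address of the thread being entered, and from return. Everything else is an assumption made for the illustration: memory is a Python list, atoms are Python functions, say is a hypothetical atom factory, and a None sentinel on the return stack stops the machine.

```python
def run(memory, start):
    """Execute cells from `start` until the outermost thread returns."""
    vm = {"ip": start, "rs": [None], "out": []}   # None marks "stop"
    while vm["ip"] is not None:
        atom = memory[vm["ip"]]
        vm["ip"] += 1
        atom(vm, memory)
    return vm["out"]

def sub(vm, memory):
    target = memory[vm["ip"]]         # the cell after sub holds a thread address
    vm["rs"].append(vm["ip"] + 1)     # resume just past that address cell
    vm["ip"] = target

def ret(vm, memory):
    vm["ip"] = vm["rs"].pop()         # back to the calling thread

def say(text):
    """Hypothetical atom factory: the atom records text when executed."""
    def atom(vm, memory):
        vm["out"].append(text)
    return atom

memory = [
    say("a"), sub, 5, say("c"), ret,  # cells 0..4: outer thread
    say("b"), ret,                    # cells 5..6: inner thread
]
print(run(memory, 0))                 # ['a', 'b', 'c']
```

The interpreter loop itself never inspects what kind of word it is running; entering a thread is just another atom followed by one cell of data.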

data

The thread words are divided into action threads and data words, but their implementation is similar. Since threads are written with H3sm atoms, the sourcecode for the built-in threads in the H3sm C source is similar to using atoms and threads interpretively. Have a look. The sources are heavily commented. The style of some of the built-in threads is very bad, due to the fact that the code maintenance features of a complete H3sm don't exist at that point, but the use of H3sm atoms is well illustrated. Threads internal to this H3sm implementation are written entirely in terms of the H3sm atoms and syscalls, to illustrate the viability of the VM. In general, the H3sm sources are very tricky as to the processing passes made to build H3sm and what's going on at a particular time in the build process, but the process of H3sm building itself from C/asm to primitives to threads is quite clear, since the <C coding techniques used minimize unnecessary C code.

FLOW CONTROL

H3sm allows a new word to implement flow-of-execution controls for itself using simple branches and labels. The provided flow-controls are rustic, but complete. To the extent that the Forth threader is a compiler, the H3sm threader is an assembler. There is a resolver for branches in ;, which wraps up a new definition. The branch words are to:, to1, to2, to3 and to4. Branch targets are labeled by :, ;, and (O). : is target 0, the first (O) in a definition will be target 1, and so on. Then there is an implicit target also at ;, which will be the target number one greater than the number of (O)'s in a definition.

That's right. Just Goto, and a tiny number of branch targets to go with it. I'm more curious about what can be done with simple constructs and utilities like "see" than I am about flow control abstractions, and I find a goto and a target to be unambiguous. H3sm provides the tools and materials to build other constructs. There's also an execution array primitive, but I haven't used it.

Other stunts are possible, actually. In H3sm you can manipulate the return stack, and you can therefore use sub and return in ways not implied by their use in the inner interpreter.

The to# words are the gotos. An endless loop would be something like...

		: eloop tab ell to: ;
	
to: in the above will resolve to the address just after the header, so eloop when invoked will produce tabs endlessly. Conditionals can be constructed by following "yes" and "no" with to#'s.

This one is a decremented loop, 0x10 (decimal 16) iterations...

		: do10 10 ->r  
			(O) 
				44 emit -?r
			yes to1 
		rdrop ;

	
The branch address cell is specified by to1 as the address of the word following (O) . That is, a to# gets resolved to an address cell, a `target' (O) doesn't represent any space in the resulting thread, but the address of the target gets resolved to the contents of the corresponding to# cells. You can use 4 targets, not counting target 0 set by :, and sixteen to0's, to1's etc. ; sets a target also.
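The label and branch resolution just described can be sketched in Python. This is a simplified model, not the resolver in H3sm's ; word. It assumes every ordinary token occupies exactly one cell (real literals likely take more), that addresses are list indices counted from the first cell after the header, and that the target set by ; is the cell holding return. The function name thread_cells is invented.

```python
def thread_cells(tokens):
    """Turn a definition body (the tokens between the name and ;) into cells."""
    cells = []
    targets = [0]            # target 0 is set by ":" at the first body cell
    pending = []             # (cell index, target number) awaiting resolution
    for tok in tokens:
        if tok == "(O)":
            targets.append(len(cells))        # a label occupies no space
        elif tok == "to:":
            pending.append((len(cells), 0))
            cells.append(None)                # address cell, filled in later
        elif tok in ("to1", "to2", "to3", "to4"):
            pending.append((len(cells), int(tok[2])))
            cells.append(None)
        else:
            cells.append(tok)
    targets.append(len(cells))                # implicit target at ";"
    cells.append("return")                    # what ";" leaves in the thread
    for index, number in pending:             # resolver pass, as done by ";"
        cells[index] = targets[number]
    return cells

print(thread_cells("10 ->r (O) 44 emit -?r yes to1 rdrop".split()))
# ['10', '->r', '44', 'emit', '-?r', 'yes', 2, 'rdrop', 'return']
```

Because resolution happens in a second pass, a to# may appear before the (O) it refers to, which is how forward branches such as a jump to the ; target would be handled in this model.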

Using to: is not recursion. It's just a branch to the front of the word. You can call yourself if you want though, since a header is complete by the time the rest of the threading process gets under way.

		: recurse recurse ;
		recurse
	
should give you a nice quick segfault, for example.

ABBREVIATIONS

H3sm allows use of a period to abbreviate keywords, interpretively or in a colon definition. The H3sm word to quit H3sm is quit. There are no other words starting with "qu" in the base system, so q. is valid for quit also. I've left an unabbreviator definition as an exercise to the user. ' [some abbreviation] dumpline is one way to see what a particular abbreviation will interpret as, if anything.

ifabbreviation is always called after ifword, so actual definitions ending in ., if any, take precedence. The search ifabbreviation uses always runs to the base of the dictionary, so that ifabbreviation always finds the oldest word that the given abbreviation is a possible alias for. This means that core words will have the short abbreviations, and they won't be superseded by later definitions. (Thanks JET.)
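The lookup order described above (exact names first, then the oldest word an abbreviation could stand for) can be sketched in Python. The function and the list-based dictionary are hypothetical. H3sm walks its linked dictionary from the newest word down to the base, whereas this sketch scans a list ordered oldest first, which yields the same answer.

```python
def lookup(token, dictionary):
    """dictionary: word names, oldest first. Returns the matching name or None."""
    if token in dictionary:                  # like ifword: exact names win
        return token
    if token.endswith(".") and len(token) > 1:
        prefix = token[:-1]
        for name in dictionary:              # oldest word is tried first
            if name.startswith(prefix):
                return name
    return None

words = ["quit", "read", "rename", "quitter"]    # "quitter" is a later addition
print(lookup("q.", words))                       # quit
```

A later definition such as the made-up "quitter" does not steal the abbreviation q. from the older core word, while a word literally named "q." would be found by the exact match before any abbreviation is considered.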

GLOSSARY

The Glossary script in the H3sm sources generates a glossary of H3sm words from the comments in the main H3sm sourcecode file. Forth-style stack-effect comments have been adapted for three stacks. This is an annotated version of what the Glossary script produces. Note that H3sm, like Forth, makes no attempt to distinguish between keywords and "punctuation".

 	glossary includes all words, including NON_USER words "words" hides.

	H3sm abbreviation pronunciation conventions:

	@       !       ->      .       ?       &       '       $
	fetch   store   to      print   ask     append	tick    segment

It is hoped that this will help for derived names like, uh, .@p+c . H3sm is case-sensitive. Booleans OR, AND, XOR and NOT are uppercase. You fetch/store memory, you get/set a virtual register. The size Register is "Size" or s in an acronym. "fetch" and "get" are copies, "->" is a pop usually, source is dropped. & in stack comments, e.g. &HNC, sometimes means "address of".

ATOMS

  • address
    (P --- ptr ) !ip !TOR
    Basic action of a data word, a noun. `address' does an in-line return to its calling thread, making nouns threads in terms of doHNC/VMST/call-threading. A data word can be defined thusly... : mydata address ; The execution token ; stores for `return' constitutes a default allotment of one cell for data, and is otherwise not used, since address does a return itself. The word `mydata' has an action analogous to a Forth word defined by VARIABLE.
  • AND
    ( pytea pyteb --- aANDb )
    The usual bitwise Boolean.
  • aint
    ( flagp --- !flagp )
    Casual logical NOT, done bitwise on lsB, which in effect quickly inverts a flagpyte.
  • align
    (P a --- a_rounded_up_to_ACELL)
    round an address up to the next even multiple of system cell size.
  • ?contents=
    (P &a &b ||| ) ( --- flag )
    `ask contents equal'. equality comparison of the contents of two addresses.
  • if=
    ( a b --- flagpyte )
    Compare two pytes. Produces a valid flag, a "flagpyte", a one-byte true/false in the lsB of the pyte. Faster than - .
  • -?r
    ( --- flag ) (R i --- i-1)
    "minus ask return". Decrement TOR and then see if it's zero. This is for fast 1-decremented loops, and came about while tweaking "bench" to at least be faster than Perl. It seems valid, design-wise, since at that time there were about 5 occurrences of "?r" in the interpreter, and they were all next to occurrences of "-r".
  • ?r=
    ( val --- flag ) (R count |||)
    "ask-r-equal" is TOR = val?. For generally additive or subtractive loops using top of return stack for their counter, but a count increment other than one.
  • ?positive
    ( a --- flag )
    "ask-positive" consumes a, 0 is considered positive.
  • bump
    ( --- junk )
    undrop. Much annoyance in the implementation of the interpreter to let this work. Being able to sensibly access things above TOS comes in handy with pytes, but may be un-tenable as simple hardware. See "*".
  • bytes
    1 -> Size
    Set Size to one, bytes.
  • byte/mod
    ( dividend lsB-only_divisor --- quotient modulo )
    "byte-slash-mod" For multi-base number input/display, which doesn't exist as of this edit. Submissions welcome :o)
  • bytes_bd
    bump 1->Size drop
    You have to wrap a Size change with a bump and a drop if you want a new section of stack that doesn't damage the underlying stack items. This does that for you as an atom, because it's hairy, or perhaps impossible, as a thread. The bump/drop format is because the data stack is increment-before-push, i.e. dsi is left pointing at an occupied cell.
  • cells
    sizeof(int) -> Size
    sizeof(int) is 4 on x86.
  • bcellsd
    bump sizeof(int)->Size drop
    You have to wrap a Size change with a bump and a drop if you want a new section of stack that doesn't damage the underlying stack items. This is because the H3sm data stack is increment-before-push; the current value of the data stack index points at an existing TOS, not a vacancy. This does that for you as an atom. This is a complication of variable-sized data that I didn't foresee, but it seems liveable with the right wrappers. See bytes_bd.
  • doHNC
    (P &HNC --- ??? ) (R; --- RET|null) ( --- ???) !????
    `do H N C' Do a word given address of HNC. HNC, Head Name Cell, is a namelength byte, an atomic flag, and other flags. It's the first cell in a definition header, and it's what the previous word in the dictionary's link field points at. doHNC is Forth EXECUTE, with an in-line ->code and atom/thread differentiation. It's ugly, but it's entirely in the outer interpreter. The namelength byte is the least significant byte of the HNC, which has the same address as the HNC on the little-endian x86.
  • double
    up-roll Size by 1, > 256 wraps
    Double the value of Size, or 256 wraps to 1. No sign-extend or anything. Normal effect is to double Size.
  • drop
    ( pyte --- )
    Drop a data stack pyte.
  • dup
    ( pytea ||| --- pytea ) i.e. ( a --- a a )
    `dupe' Duplicate a data stack pyte. Unix/Linux "dup" is dupFD (file descriptor).
  • ell
    unconditional branch to contents of next thread cell
  • emit
    (pyte --- )
    ascii print lsB. Could be Unicode without much bloodshed.
  • false
    ( --- 0flagpyte ) fast false, on lsB of pyte
    Push a pyte onto the data stack with an lsB of 0.
  • @
    (P ptr ||| --- ) ( --- pyte ) fetch pyte at ptr at Size
    "fetch" a pyte from an address pointed at by TOPS.
  • @+
    (P ptr --- ptr+Size ) ( --- pyte )
    "fetch-plus" Chuck recommends @+. I use it in "dump" and friends.
  • getSize
    (P --- Size )
    sizeof(int) bytes moved, extra 0-filled
  • flag
    ( pyte --- flagpyte )
    Bytewise-OR a pyte into its lsB. Conditional branches only look at the least significant byte of TOS, so if you want to use an arbitrary pyte as a condition when Size is greater than 1 you have to do this to the pyte first. "?" words, "ask" words, like "?positive" produce flagpytes, which you don't have to flag.
  • gap
    ( --- a-b ) (P a b ||| --- ) bytes between addresses
    Address subtraction; operands on pointer stack, result on data stack.
  • halve
    Size is halved, or 256. !SIZE
    Reduce Size by a factor of 2, or wrap 1 around to 256. 0 Size is impossible.
  • ip+
    ( ip_index --- ) case-construct mechanism, computed goto
    This is for execution arrays. I haven't used it yet. When I do it may well be more of an indexed gosub than an indexed goto.
  • literal
    ( --- pyte ) in-thread literal number
    This is the action of a number in an input stream, which makes H3sm RPN and allows it to not have a default "syntax". This IS the syntax, "Words happen as soon as they are tokenized, numbers go on the stack (, files get opened. RSN.)"
  • maskbyte
    ( a --- a&0xff ) mask off all but lsB
    Much used.
  • max
    ( a b --- maxab )
  • min
    ( a b --- min_ab )
  • move
    (P from to ||| ) ( cnt --- ) move cnt bytes
    semi-smart. Doesn't clobber overlaps regardless of which of "to" or "from" is higher.
  • NOT
    ( pyte --- !pyte ) bitwise NOT of all bits
    For some reason I can't really express I like the basic Booleans in all-caps.
  • negate
    ( a --- 2's_complement_negative_a )
  • no
    ( flagpyte --- ) branch if 0 flagbyte
  • H3sm
    no action other than NEXT. H3sm no-op. Simplifies using the commandline.
  • onbits
    ( --- -1 ) all_ones pyte constant
    pyte constants are faster than literals, and a bit less bug-prone during development, it seems. (pyte literals aren't simple.)
  • one
    ( --- 1 ) pyte constant
  • OR
    ( pytea pyteb --- pyteaORb ) bitwise Boolean OR
  • over
    ( a b --- a b a )
  • pdrop
    (P ptr --- ) decrement pointer stack index
    H3sm's pointer/address words typically don't do their own pdrops, so this is a VERY common atom. This is a design issue that's still in flux. Aren't we all?

    This brings us to a raft of p-words. I'm not real uncomfortable with the number of atoms in H3sm, given that they do a lot, and given that I got "bench" down to (very non-rigorous remark-->) twice the speed of Perl.

  • pdup
    (P a --- a a ) duplicate a pointer stack value.
  • p@
    (P ptr1 --- ptr2 ) ptr1 overwritten
    This is one of the few pointer ops that consumes a pointer by itself. In this case there's no net change in the pointer stack index.
  • +
    ( a b --- a+b ) 2's complement addition.
  • p-
    ( pyte --- ) (P a --- a-intpartofpyte )
  • p+
    ( pyte --- ) (P a --- a+(int_part_or_less_of)pyte )
    "pea plus" These, p+, p+s, p+1, p+c, p-s, p-c, p-1... are address arithmetic, not contents arithmetic. "gap" does an address difference calc.
  • p+s
    (P a --- a+Size ) "pea plus ess"
  • p+1
    (P a --- a+1 ) "pea plus one"
  • p+c
    (P a --- a+cellsize ) "pea plus sea"
  • p-s
    (P ptr --- ptr-Size ) "pea minus ess"
  • p-c
    (P a --- a-cellsize ) "pea minus see"
  • p-1
    (P ptr --- ptr-1 ) "pea minus one"
  • p->
    (P a ||| ) ( --- a_low[a_hi] ) 4 bytes moved regardless "pea to"
  • p->r
    (P a ||| ) (R; --- a ) "pea to are"
  • p->s
    (P Size --- ) !Size superseded? "pea to ess"
  • p->s_bd
    (P Size --- ) bump p->Size drop pdrop. si = AboveTOS-newSize "pea to ess bee dee"
  • p!
    (P p store ||| ) p goes in store, no pdrops "pea store"
  • pswap
    (P a b --- b a ) "pea swap"
  • pup
    (P --- oldptr )
    "pea up" pointer stack undrop. Not known to work.
  • .
    ( pyte --- ) hex print pyte
    Drop TOS and display it in hexadecimal.
  • rdrop
    (R; a --- ) Don't forget this one after a ->R.
  • return
    (R; xt --- ) VMST rts to calling thread
    reverse of `sub'. `address' does a return internally.
  • r@p
    (R; a ||| ) (P --- a ) dup r to p "are fetch pea"
  • r-1
    (R: val --- val-1 ) decrement rsi "are minus one"
  • r->
    (R; val --- ) ( --- val ) Forth r>, 4 bytes written to ds "are to"
  • r->p
    (R; ptr --- ) (P --- ptr ) symmetrical with p->r.
  • r->s
    (R; size --- ) !SIZE "are to ess"
  • rup
    (R; --- previous ) undrop rs "are up"
  • setSize
    (P Size --- ) set Size
  • s->r
    (R; --- Size ) "ess to are"
  • s->p
    (P --- Size ) "ess to pea"
  • !
    (P ptr ||| ) ( pyte --- ) pointer is not dropped "store" a pyte. Memory is clobbered at Size.
  • sub
    (R; --- xt ) enter a thread, VMST jsr
    This is the atom cell that must precede a thread's address in a calling thread. This is the difference between H3sm's Virtual Machine Subroutine Threading and other better-known schemes. "sub" is a GOSUB or JSR basically. The dictionary is thus larger, but NEXT doesn't need to process a "working variable". EXECUTE (doHNC) is thereby complicated, but mostly in the outer interpreter. I find this easier to follow than more compact address-interpreter schemes.
  • swap
    ( pytea pyteb --- pyteb pytea )
  • *
    ( a b --- lo_a*b [hi_a*b] ) hi a*b is Above TOS
    `times'. Overflow is available (immediately after *) in the pyte above TOS via bump, etc.
  • ->r
    (R; --- val ) ( val --- ) Forth >r, ish. 4 bytes moved. `to are'
  • ->code
    (P &HNC --- Code_Body_Cell )
    `to code', traverse a word header from HNC to the code cell of the word.
  • ->link
    (P &HNC --- Link_Cell )
    `to link', traverse a word header from HNC to the link cell of the word.
  • ->p
    (P --- ACELL ) ( a [a'] --- )
    `to pea'. cellsize snagged regardless.
  • true
    ( --- true_flagpyte ) flagwise -1, xxxff Fast true flag.
  • ushift
    ( shiftee amount --- shifted ) up-significance bitshift shift, not roll.
  • XOR
    ( pytea pyteb --- pyteaXORb ) Boolean bitwise exclusive-OR.
  • yes
    ( flagpyte --- ) conditional branch if non-0 see `no'.
  • zero
    ( --- 0 ) 0 as a pyte constant fast 0-pyte.
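The entries for flag, yes and no interact in a way a small example may clarify. The Python sketch below models a pyte as little-endian bytes. The function names are invented, and leaving the upper bytes untouched after flagging is an assumption, since the entry only says the pyte is OR-ed bytewise into its lsB.

```python
def flag(pyte_bytes):
    """Return a copy of the pyte whose lsB is the OR of all of its bytes."""
    combined = 0
    for b in pyte_bytes:
        combined |= b
    out = bytearray(pyte_bytes)
    out[0] = combined                 # index 0 is the lsB in little-endian order
    return bytes(out)

def yes_would_branch(pyte_bytes):
    """Conditional branches look only at the least significant byte."""
    return pyte_bytes[0] != 0

pyte = (0x0100).to_bytes(2, "little")    # non-zero value, but its lsB is 0
print(yes_would_branch(pyte))            # False
print(yes_would_branch(flag(pyte)))      # True
```

At Size 2 the value 0x0100 is non-zero yet looks false to a conditional branch until it has been flagged, which is the case the flag entry warns about.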

    syscalls

    See the cLIeNUX seedocs/Linux manpages for more
  • become
    (P path argv envp ||| ) ( --- GONE|fail=-1) unix execve
    Become another process. A unix process runs another process by becoming the other process. If you want the parent process to continue you have to fork the parent first. "become" and "fork" are fundamentals of process invocation, derived variants exist also.
  • chroot
    (P path |||) ( --- ret)
    Change the apparent root directory of a child process.
  • close
    ( --- ret ) (P name ||| ) lose a FD
    Relinquish access to a file by removing it from the process's file table, thus releasing its file descriptor. The FD is then vacant. Open files are a finite resource in Linux.
  • currentdir
    (P path --- ) (--- ret) unix chdir, cd
    name still in flux.
  • dirents
    ( FD count --- red|-1|0 ) (P buff ||| ) unix getdents
    Get some directory entries from a FD for a directory. Linux ext2 directories are not files.
  • dupFD
    (old_FD --- new_FD|-1) unix dup
    make 2 FDs represent the same open file. Can be used with "close" to perform file IO redirection internally to a process, using the knowledge that the next FD used is always the lowest one available.
  • FDcontrol
    ( FD CMD --- ret ) (P [arg] |||) unix fcntl
  • fork
    parent..( ---PID_of_child_or-1) child..(--- 0)
    Process meiosis or mitosis or whatever. A process clones itself. The return value is the only difference between the two new processes, and is how the parent and child each figure out which one they are, if necessary.
  • fflush
    ( FD --- flag ) unix fsync
    flush a file's changes to non-volatile store.
  • flush
    ( --- 0 ) unix sync
    flush the entire unix buffer cache to non-volatile store.
  • gettimeofday
    (P timeval |||) ( --- flag)
  • grow
    (P end_data_segment ) ( --- ret ) unix brk, basis of malloc
    request a change in the size of a process's memory allocation.
  • hardlink
    (P old new ||| ) ( --- flag ) unix link
    Make another name for a file. The new name is co-equal to pre-existing names. In other words, a file with one name has one hardlink.
  • ioctl
    ( FD ioctl --- ret) (P argp --- )
    IO control. The great unix wart, for every file that isn't really a file, especially your terminal, there are loads of special ioctls.
  • ioperm
    ( level --- flag ) Linux iopl
    64k IO-space access control, for Microchannel XGA video, e.g.
  • makedir
    (P name ||| ) ( --- flag) unix mkdir, perms=0
    "permit" your new directory separately. This is a simplification of the unix version, which does take a permissions argument.
  • mount
    (P special dir fstype|NULL data |||) (rwflags --- flag)
    include a filesystem into the one unix filesystem on a mountpoint directory. The loopback switch is, uh, uh,,,
  • nanosleep
    (P timespec remaining ||| ) ( --- flag )
  • nice
    ( int --- flag ) request a priority
  • open
    (P name |||) ( flags perms --- FD ) get a FD for a file
    establish access to a file by name, and obtain a file descriptor for your interface to it.
  • owner
    (P path |||) ( UID GID --- ret) unix chown
    change the owner of a file.
  • pause
    ( --- -1 ) wait for a signal
  • pipe
    (P ptr_to_int[2] |||) ( --- flag) make in/out FDs and 10k buffer
    This is the underlying mechanism of an unnamed pipe, such as in a shell pipe one-liner. The actual pipe buffer is in the buffer cache. Example: cat bla | sort | uniq
  • permit
    (P path |||) ( mode --- ret) unix chmod
    change the permissions bits of a file.
  • parent
    ( --- parent_PID ) unix getppid
    Get the Process ID of this process's parent, the one that forked this one off.
  • quit
    unix exit(), Forth bye, "q." works in H3sm.
  • read
    (P buffer |||) ( (byte)FD --- #read )
    H3sm "read" always requests BUFFERSIZE bytes.
  • remove
    (P name ||| ) ( --- flag) unix unlink, rm
    remove a filename. If it's the last name for a file, remove the actual file (i.e. de-allocate its inode).
  • rename
    (P old new |||) ( --- flag ) mv
    mv is actually a rather inaccurate name for this.
  • rmdir
    (P name ||| ) ( --- flag ) remove empty dir
    The rmdir call only works if the dir is empty.
  • seek
    ( FD o/s whence --- o/s0 ) unix lseek, Forth reposition-file
    Move a file position pointer. "whence" can be relative to the start of the file, the current position, or the end of the file (unix SEEK_SET, SEEK_CUR, SEEK_END).
  • settimeofday
    (P timeval |||) ( --- flag)
  • sigaction
    ( signum --- flag ) (P action old |||) install signal handler
    "action" is an address of machine code to run on arrival of signal "signum". I haven't figured out how that pertains to H3sm threading yet. I want to, because handling SIGCHLD is how unix init works, which H3sm might make an interesting alternative to.
  • socketcall
    ( call --- flag? ) (P args ||| ) entry for all socket "calls".
    This is the only actual Linux syscall for socket ops. "call" is how you get the actual action of "connect" and so on.
  • stat
    (P name buff ||| ) ( --- flag) get a file's stat struct
    Get a bunch of info on a file.
  • statfs
    (P name buff ||| ) ( --- flag) get a filesys stat struct
    Get a bunch of info on a filesystem.
  • settime
    (P timestruct ||| ) ( --- flag) set sec. elapsed in epoch
  • swapoff
    (P name ||| ) ( --- flag )
  • swapon
    (P name ||| ) ( --- flag )
  • special
    (P name |||) ( mode dev --- flag ) unix mknod
    Make a device special file.
  • signal
    ( PID signal --- ret) unix kill
    Send a signal to the process with process ID "PID".
  • symlinks
    (P name buff ||| ) ( bufsize --- flag ) unix readlink
    Get the name this name is an alias for.
  • symlink
    (P exists new |||) ( --- flag )
    Create a filename alias for a file. Symlinks can be on separate filesystems. They are just a text alias, no inode linkage at all. That's why there's "symlinks"; reading the alias text is the only way to find the real file.
  • syslog
    ( type len --- flag ) (P buf ||| ) read/tweak kbuffer
    The kernel has a text buffer. That's what the boot messages are from. This call accesses that buffer.
  • truncate
    (P fname ||| ) ( len --- flag ) truncate fname to len
    This is mostly used to wipe a file to zero size on a clobber, but it may be useful as a stand-alone.
  • trace
    ( request PID addr data --- flag) unix ptrace
    basis of gdb etc.
  • unmount
    (P name ||| ) ( --- flag) unix umount.
    Unmount a mounted filesystem by mountpoint directory name or device special file name.
  • waitpid
    (pid options --- ret ) simplified
    Sleep until child "pid" exits or some other special circumstance wakes us up.
  • write
    ( FDbyte request --- red ) (P name ||| )
    append data to a file from its current position pointer.

    threads

    named data

  • this
    (P --- &bpa )
    points at first un-parsed character in parse area.
  • buffer
    (P --- &buffer)
    H3sm 1.3 interpreter input is buffered here.
  • block
    (P --- &block ) general purpose buffer(s)
  • branchstack
    space for 16 branches to resolve
    used by the threader for branch resolution
  • digits
    (P --- address_of_array ) ascii vals to hex vals map
  • dp
    (P --- dictionary free point pointer )
    contains address of next free heap byte. This is where new words go. You can dpdump it.
  • epa
    (P --- &epa )End Parse Area. Set up by refill.
    converse of this
  • hexdigits
    (P --- &array ) array of "ascii val is/not a hex digit" flags
  • tb
    (P --- &tbuf) interpreter token buffer
    See also, this, epa, refill
  • latest
    (P --- &HNC ) &HNC of most recent H3sm word definition
    All word lookups start here.
  • pvar
    (P --- &pvar)
    a scratch pointer variable.
  • toklen
    token length
    Contains an int giving the length of the current interpreter token.
  • tsi
    counter for branches to resolve
    Used by the threader for branch resolution.
  • USize
    (P --- &USize ) offstack Size store for interpreter

    action threads

  • abs
    ( a --- |A| ) absolute value
  • allot
    ( o/s --- ) move dp up o/s bytes.
    iffy > 256
    Forth ALLOT or GNU as .org basically.
  • &
    `append' ( a --- ) append a into dictionary at &dp at Size
    All the `append' words do some kind of dictionary appending. They all change dp. I have chosen & instead of Forth's , (comma) because that implies and is pronounced `compile', which is misleading. Plain & appends a pyte. That does not by itself create a working literal. See the following.
  • &lit
    ( # --- ) !dp
  • &p
    (P bla ||| ) append bla to dictionary !dp
  • &word
    (P HNC ||| )
    used by the threader.
  • ascii->digit
    ( --- nybble ) (P char ||| )
  • ?abbreviation
    ( --- flag )
    This happens after full word lookup failure if a token ends with a period.
  • ?file
    ( --- flag )
    this is a stub pursuant to the as yet unimplemented dofile
  • ?atom
    (P HNC |||) ( --- flag ) true if HNC is an atom
    needed for H3sm's current threading scheme.
  • ?char
    (P str --- ) ( --- flag ) true if char > 32
    This implements a unixy general whitespace, so that argv[] can be passed straight to the interpreter, for example. Mitch Bradley recommended this for Forth on unix in 1985.
  • ?hex
    ( --- flag ) (P char ||| )
    used by the interpreter. Not as useful if we get real number bases.
  • ?hexnum
    ( --- flag ) @'s toklen and itoken
    see above
  • ?segments=
    (P beg. beg. ||| --- ) ( len ||| --- flag )
    string-compare. But strings can be anything, i.e. "segments" of memory. Note that these are sized strings.
  • ?word
    (P --- HNC|null) ( --- flag ) Forth FIND
  • beep
    emit a 7
  • bench
    bogoloop, 100000 empty iterations.
  • blank
    emit a blank character, a space
  • :
    ": newname [tokens...] ; " define a new thread word
    `colon'. Analogous to the Forth :. Create a new word header, begin the threading process, to be terminated by ; . It goes like this...
    
    		A word header data structure is built
    		it is linked into the dictionary, i.e. it is made `latest'
    		the branch resolution data structures are wiped/initialized
    		branch target 0	is set to the following cell,
    			i.e. the CFA of the new word
    		enter threading mode, and continue parsing and
    			threading until a threading word (like a Forth
    			immediate word) says to stop. Normally this
    			will be  ;.
    	
    That setup means the new word is live when : itself is finished, so recursion is trivial. Setting target 0 means to: will resolve to the CFA of the word. target, to1, to2, ; and so on have a bit set in their headers saying "I am a threading word". This is pretty much the same as Forth "immediate" words, but there's no STATE variable; threading words are assumed to produce a "keep going" flag, which is false for ;. H3sm threading words all have actual actions in any context, although the action of a threading word when not threading is usually degenerate.

    I haven't looked at equivalents of Forth's CREATE and DOES> yet, or whether they are needed given direct access to `address'.

  • cr
    emit a linefeed/carriage-return
  • dofile
    cat or execute a file
    not implemented yet. This is not LOAD or similar. This will be something like regular unix shell activity when a filename is encountered. For example, the default action of a token that resolves to a directory name will probably be to cd to that directory.
  • doword
    (P HNC --- ???) (???) (R ?????)
  • dumpline
    (P a --- a+16 ) display 16 bytes from a
    part of dump and dpdump
  • dump
    (P start --- start+256 )
    Ye Olde Be-Lov'd Hexdump, hardwired to 256 bytes. It leaves the address of the next page for continuous up-addresses dumping.
  • <-code
    (P CFA --- HNC) find HNC from the word's CFA
    used by "see", which doesn't exist as I write this.
  • to:
    ( --- keep_threading) address of CFA of current def. !dp
    Threading word, is resolved to a cell containing the address of the CFA of the current word being threaded. This resolves a backward branch to the start of the word. This is a branch, not a re-entry/recursion.
  • to1
    ( --- keep_threading) address of first `target' in defin.
    Threading word, is resolved to a cell containing the address of the first branch `target' marker (O) in the word currently being threaded.
  • to2
    ( --- keep_threading) second target !dp
    Threading word, is resolved to a cell containing the address of the second branch `target' marker (O) in the word currently being threaded.
  • to3
    ( --- keep_threading) third target gets ,'ed !dp
    Threading word, is resolved to a cell containing the address of the third branch target marker (O) in the word currently being threaded.
  • to4
    ( --- keep_threading) "to;" is to(max( (O) ) + 1)
    Threading word, is resolved to a cell containing the address of the fourth branch target marker (O) in the word currently being threaded.

  • header
    "header newname " make thread header for newname
    part of :
  • interpret
    evaluate one token
  • hexnumber
    ( --- # )
  • ihexnumber
    ( --- # )
  • initresolve
    wipe threading branch-resolver stack indices
    : does this.
  • link
    !dp !latest link a word to the dictionary
    link'ed word is now "latest"
  • -
    ( a b --- a-b ) minus
  • dpdump
    (P --- dp+0xc0 ) dump from dp - 0x80
    handy dump variant that frames dp.
  • p2dup
    (P a b ||| --- a b ) dup 2 pointers in phase
  • p2+s
    (P ptr ptr --- ptr+s ptr+s )
  • .p
    print pointer non-destructively
  • .ps
    hex print pointer stack non-destructively
  • .r
    print TOR nondestructively
  • .s
    (P begin ||| ) ( length --- ) print a string
  • .chars
    (P begin ||| ) ( length --- ) print bytes=126 > chars > 32
    print "printable" text glyph chars only.
  • .name
    (P HNC ||| ) print a word's H3sm name
  • readblock
    ( FD --- red ) get 1k of file into block
  • resolve_one
    bsi is double-decremented. poke a "there" into a pokeloc
    part of the following. This ties a target to a go#.
  • resolve
    (---keep_threading) @bs @ts, fix pending flow-control branches
    used by ;, see the following. As Forth has a "compiler", this is H3sm's "assembler".
  • ;
    ( --- false) end a new thread
    Finish a : definition. This does the flow-control branch resolution and appends a return. The false flag tells : it's done.
  • tab
    emit a tab
  • target
    set a point in a : definition as a branch target.
    There are 5 available, 0 thru 4. Target 0 is set to the word's beginning by : . to0 , to1 , to3 etc. will be resolved to the addresses of the various targets by ; . go0 etc. are just the addresses; you still need ell, yes, no etc.
  • thread
    parse following token, make a thread header for it
    Used by :.
  • threading
    (P HNC --- ) mark word as threading word
    not implemented yet. Not much different than IMMEDIATE but I prefer this name.
  • threadtoken
    ( --- keep_going? )
    thread a word or number.
  • '
    (P --- HNC|null) (--- flag) Forth "tick", parses next word
    gets the HNC of the word following it in the input. If ' is used in a definition, it gets the next word at runtime, NOT the next word in the definition. There are variants for that. See the following.
  • '&
    "'& word " `tick append' threading-time HNC getter.
    Like Forth "[']" I believe. STUB.
  • token
    ( --- End_Of_Input ) !tb !toklen
    get a unix-whitespace-delimited sequence of characters from the input buffer. Flag true if input is exhausted.
  • words
    ( count --- ) print count wordnames from latest
    Note that unlike Forth WORDS, H3sm `words' takes a count. Overflow is OK. I will be doing a "hide" word that will cause words to skip hide'd words.
  • The glossary your sources produce will be more accurate. Some of the words in this glossary are not visible in the output of words. H3sm allows hiding words from words so that words of most interest to the particular user can be emphasized. This also means that the difference between the number you give to `words' and the number of wordnames it prints is the number of hidden words in that range of the dictionary.