Sather FAQ : Developing the Sather language and libraries : Waikato University Efforts : Highlights
From: Keith Hopper <asgard@wave.co.nz>
Newsgroups: comp.lang.sather
Subject: Progress down under
Date: Thu, 27 May 1999 19:46:34 +1200
Message-ID: <4908aa6d48asgard@waikato.ac.nz>

I must apologise to those to whom I indicated that we would be putting 'pen
to paper' to let you into what will be/is/has been going on down here -
other teaching commitments have made life a bit tricky.  However I have
now produced three articles (and a couple of spreadsheets) which will appear
separately.

To those who are unaware of what we have been doing I have written a 'New
Image' article to give you background to the scheme of work here, why and
for what reason, etc.

Our work has almost got past the critical stage, after which we can release
some beta-test code - we intend to be very thorough, however, and test
everything until we all have the proverbial blue faces!  Please don't,
therefore, ask for code until we can give a better forecast.  The new
Required Library, Required Resources and boot compiler are unlikely to be
available before August - given the size of the alpha testing task.

Irrespective of what we have done here there has been absolutely no change
to the Sather language as specified in the existing documentation - except
for making visible the 'bit' class which was discussed in the original
documentation - and which we have found to be essential to meet our
objectives.

Have fun reading!

kh

-----------------------------------------------------------------------------
From: Keith Hopper <asgard@wave.co.nz>
Subject: Sather - a New Image?
Date: Thu, 27 May 1999 19:50:56 +1200

Original Ideas
--------------
In order to carry out some research into the problems of
internationalisation and localisation of software we required a high
quality clean, safe O-O language.  We found Sather which, at first sight,
seemed adequate to our needs.
When we began serious evaluation we discovered a number of significant
short-comings in the confusion between language specification and
implementation.  Study of some of the early notes revealed that the
original concept was to define immutable objects using AVAL - specifically
AVAL{BIT}.
Since we required a program which could handle multiple character encodings
in the same program - encodings which could vary from 8 to 32 bits for one
encoding and from one to six encodings per character, we obviously had to
carry out some re-design work on both compiler and library. We therefore
needed to look clearly at every form of conversion between internal values
and external representations (whether text or binary).
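
As a minimal sketch of why a single fixed 8-bit character no longer
suffices, the following Python fragment (Python is used purely for
illustration; the helper names are hypothetical and are not part of the
Sather libraries) counts how many code units the same character needs in
three Unicode encoding forms.  Note that the original UTF-8 definition
allowed up to six octets per character, whereas Python's codec follows the
current four-octet limit.

```python
# Encoding forms paired with the size in octets of one code unit.
FORMS = [("utf-8", 1), ("utf-16-le", 2), ("utf-32-le", 4)]


def code_units(ch, encoding, unit_octets):
    """Return the number of code units used to encode one character."""
    data = ch.encode(encoding)
    # Every encoded character occupies a whole number of code units.
    assert len(data) % unit_octets == 0
    return len(data) // unit_octets


def report(ch):
    """Map each encoding form to the code-unit count for character ch."""
    return {enc: code_units(ch, enc, size) for enc, size in FORMS}


if __name__ == "__main__":
    for ch in ("A", "\u20ac", "\U0001F600"):
        print(repr(ch), report(ch))
```

The code-unit width varies between encodings (8, 16 or 32 bits) and the
number of units per character varies within an encoding, which is the
combination the re-design had to cope with.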

The Plan
--------
In the course of doing this we discovered that there were no facilities for
input/output of binary data except as pseudo-characters which, because of
our requirements, were no longer 'simple' 8-bit codes - there are even
bit-patterns which are not valid character encodings!
We further realised that the existing class INT actually incorporated four
incompatible groups of semantics - those for integers (the abstract number
set Z), those for cardinals (the abstract number set N), those for closed
wrap-around arithmetic (arithmetic modulo 2^n, as used for unsigned values
in the C programming language) and - believe it or not - bit-pattern
manipulation!
After assessing all of these (and a few other) factors we realised that, in
order to use Sather in the way we desired, a considerable part of the
distribution library would need revision and that it would need
considerable extension in relation to strings, binary (bit-pattern)
handling and those features already in the course of being standardised
(such things as money, time and dates). Naturally some modification to the
compiler would be necessary to modify/add built-in features in accordance
with the altered requirements - in practice of course the compiler will
eventually need to be re-written in order to take account of the new
internationalisation features and, itself, make use of the new library.
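
The four groups of semantics found in INT can be illustrated with a small
Python sketch (illustrative only; the 32-bit width and all function names
are assumptions, not taken from the Sather library).  The same stored
bit-pattern gives a different answer depending on which reading is
intended, which is why one class cannot serve all four purposes safely.

```python
BITS = 32                      # assumed word width for this illustration
MASK = (1 << BITS) - 1


def as_cardinal(pattern):
    """Read the pattern as a natural number (set N)."""
    return pattern & MASK


def as_integer(pattern):
    """Read the pattern as a two's-complement signed integer (set Z)."""
    pattern &= MASK
    if pattern >> (BITS - 1):
        return pattern - (1 << BITS)
    return pattern


def add_modular(a, b):
    """Closed (wrap-around) addition: the result never leaves BITS bits."""
    return (a + b) & MASK


def rotate_left(pattern, n):
    """Pure bit-pattern manipulation: rotate left by n places."""
    n %= BITS
    pattern &= MASK
    return ((pattern << n) | (pattern >> (BITS - n))) & MASK


if __name__ == "__main__":
    p = 0xFFFFFFFF
    print(as_cardinal(p), as_integer(p), add_modular(p, 1))
    print(hex(rotate_left(0x80000001, 1)))
```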

A New Compiler
--------------
It was realised that the compiler back end, generating C source text, was
not the most effective way to generate an executable image, particularly
when many of the (contorted) C expressions themselves result in bloated
code.  It was therefore decided - as a completely separate exercise
initially - that an attempt would be made to generate assembly language
source text in a parameterised compiler back end.  This work is well under
way, although it does not yet incorporate every code generation feature
(all but two are done, apart from the parallel code features).

A New run-time engine
---------------------
The enormous weight of overhead involved in the Sather run-time engine
support, particularly for concurrent programs, has also led to an off-shoot
project - a complete re-design for efficiency - and eventually as a very
small engine indeed which could be run on bare hardware - thus enabling an
operating system to eventually be built in Sather.

Current Status
--------------
The current state of work here at Waikato is as follows :-

     a.   The new compiler back end is running and only a few things need
     to be tidied up before booting the compiler.  Entry to Alpha testing
     after this is expected to be some time in August.

     b.   All 3Mbytes of the Required Library source (501 classes
     altogether!) and about 1Mbyte of Required Resource files (another 313
     files) have at last started Alpha testing (there have been eight
     re-writes of all of the string and character code to improve
     efficiency!).  The move to Beta testing is at the moment uncertain,
     but some time in August looks good for this too.  This testing
     includes of course the new Required Compiler boot version.

     c.  A new run-time engine (really a very small OS-type kernel code -
     expected size much less than 20kbytes for stand-alone and about
     40kbytes when interfacing to an OS) protocode should be complete by
     the end of June.  Work on a full implementation will depend on
     availability of student-power!

To Do
-----
Our to-do list includes not only those components of the work which will
probably need doing here (until we have a new compiler completely booted!),
but also a number of sub-projects we would welcome from other contributors.
[If anyone would like a winter in NZ we are quite happy to host academic
staff who may be eligible for sabbatical leave]
Things which we would like to do here (effort permitting) in addition to
fully distributable versions of the above are :-

     a.   Port the parameterised back-end to another CPU - probably a
     (Strong)ARM to assess the effort involved.

     b.   Design and implement a tree-walking data flow analyser and code
     evaluator for the Sather compiler for inclusion in a new production
     compiler as a combination of the new back-end and new Required Library
     enhancements, using the new library.

     c.   Re-implement the boot version of the cultural compiler using the
     new Required Library.  The program takes text form descriptions of
     cultural features in the form specified in ISO/IEC 14651 & 14652 and
     produces culture independent binary configuration files for use by
     Sather (and other?) installations.

     d.   Produce a free standing code converter utility for re-coding
     resource files where natural language translation is not involved.

     e.   Produce a resource editor for use by a translator when preparing
     Sather resources for use in a culture using a different natural
     language (and optionally a different character code standard).

Thinking about other tools/enhancements which also need work has led us to
produce skeleton designs for a number of separate libraries - and ideas for
others which other contributors may care to take up (using the boot version
of the Required Library compiler and the beta test version of the Required
Library itself!) :-

     a.   Windows - not built on already overweight sub-strata!  This has
     to be an absolute priority design and implementation issue - anyone
     wishing to do this work should ask for our skeleton design (which
     integrates neatly into the Required Library architecture). We see this
     as needing to sit directly on Win32 primitive DLLs, X server
     directly(?), RISCOS/NCOS WIMP, etc.

     b.   Comms/Network library - fully portable of course but including
     facilities for serial-port/modem, IrDA, Ethernet, ASTM, etc.

     c.   CORBA - some people have already indicated possible interest in
     doing this.

     d.   Add full 3-D geometry services to the Required Library Geometric
     sub-section.

     e.   "Compiler" - most programs are involved in extracting meaning
     from keyed input or text stream - and providing precisely defined
     output.  Although these facilities are, indeed, perfectly general,
     they are most often associated only with a compiler - hence the
     place-holder name!

     f.   Devices - of any kind!

     g.   Vector Drawing

     h.   Painting (pixel map manipulation)

     i.   Maths - building on top of the few classes in the original Sather
     distribution which found themselves in this library!

     j.   Accounting.

     k.   Simulation.

Where Next!
-----------
A number of suggestions have been made about modifying the Sather language.
Since the language is one of the most type-safe languages yet devised then
we at Waikato are opposed to any development which would endanger/weaken
this.  We foresee that it should be ideally suited to operate in
safety-critical environments and applications; to this end we are looking
carefully at existing loopholes in both language and implementation.
A further area which we believe is much more pressing is that of Library
name spaces and the ability to provide a 'semi-binary' filed form of
generic classes so that new libraries may be produced for use with Sather
by commercial enterprises wishing to retain the rights to not distribute
their source text in order to protect their market!

kh

-----------------------------------------------------------------------------
From: Keith Hopper <asgard@wave.co.nz>
Newsgroups: comp.lang.sather
Subject: Sather Required Resources
Date: Thu, 27 May 1999 20:19:01 +1200

The task of producing fully internationalised software requires the
elimination of all text literals (strings and individual characters) from
program source.  All text needs must be defined in that environment in
which the program is executing.
In addition to this, there is an increasing number of intellectual concepts
(eg date, number, money, address) the textual representation of which is
being parameterised as part of international standardisation efforts.
In addition to these factors, there will inevitably be domain specific
conversions relevant to an application domain, rather than just a specific
program package.
All three of these features have had to be implemented in culturally
independent classes, with or without parameters.
As standards are amended from time to time and new domain specific
libraries are created for Sather so it must be relatively simple to amend
the basic Required Library cultural classes and the concomitant Resources. 
Three kinds of resource are provided :-

     a.   The source and binary forms of culture-dependent specifications
     as specified in the particular national cultures in accordance with
     ISO/IEC 14652 & 14651.

     b.   A group of Required Library formatting classes and the
     corresponding Required Resources. Note that domain libraries will need
     to add their special requirements to this mechanism following the
     structure of the Required Library and Required Resources.

     c.   Lexical token and message formatting data produced in the
     language, culture and form of representation (including character
     encodings which may even be machine specific!!).

The culture source specification needs compilation into binary form for
internationalised use of the localised culture specification. For this
purpose an auxiliary Cultural compiler called 'build' is provided as an
adjunct to the Required Resources.

The compilation only involves a handful of files.  The major cultural
resource requirements are in terms of two kinds of resource :-

     a.  Value domains which have a discrete domain of values which are
     representable by a simple lexical token (ie are not constructed
     dynamically as number representations are) - such concepts as 'days of
     week', 'standardised character encoding names' (and aliases too!).

     b.   Messages which contain value formatting expressions for use when
     composing responses/requests to a human user.

The data of both of these groups needs to be represented using a local
character set, encoding and appropriate natural language lexical tokens. 
Files containing such text need to be transformed for each culture and
coding.
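
A rough sketch of the two kinds of resource, written in Python purely for
illustration (the table layout, the key names and the fall-back rule are
all assumptions and do not describe the actual Required Resources file
format): the program source holds only keys, while the day names and the
message text with its value formatting expressions live in per-culture
resources.

```python
# Hypothetical per-culture resources: a value domain ("days") and messages.
RESOURCES = {
    "en_NZ": {
        "days": ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"],
        "messages": {"bad_colour": "Unrecognised colour '{name}' on {day}."},
    },
    "en_US": {
        # Only the spelling differs; the day names fall back to the default.
        "messages": {"bad_colour": "Unrecognized color '{name}' on {day}."},
    },
}
DEFAULT_CULTURE = "en_NZ"      # assumed default for this sketch


def resource(culture, kind):
    """Return the resource table of the given kind, falling back if absent."""
    table = RESOURCES.get(culture, {})
    if kind in table:
        return table[kind]
    return RESOURCES[DEFAULT_CULTURE][kind]


def day_name(culture, index):
    """Look up a value-domain token: the name of day number index."""
    return resource(culture, "days")[index]


def message(culture, key, **values):
    """Compose a message from its culture-specific formatting template."""
    return resource(culture, "messages")[key].format(**values)


if __name__ == "__main__":
    for culture in ("en_NZ", "en_US"):
        print(message(culture, "bad_colour",
                      name="teal", day=day_name(culture, 0)))
```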

The Required Library contains some seventy classes with corresponding
resource files for each culture.

When Required Library testing has been completed, files will be provided
for the international standard (see ISO/IEC 14652) default culture and the
two cultures en_NZ and mi_NZ used in New Zealand.   For the US, of course,
there will generally only need to be minor spelling changes and,
potentially, a translation of encodings.

A code conversion facility is part of the Required Library and a simple
code conversion utility will also be distributed.


kh

-----------------------------------------------------------------------------
From: Keith Hopper <asgard@wave.co.nz>
Newsgroups: comp.lang.sather
Subject: Language vis a vis Required Library
Date: Thu, 27 May 1999 20:17:05 +1200

The nomenclature in the title refers to the revised implementation of the
compiler and libraries which are being worked on here at Waikato University
- not the 23rd variant of Sather-1.2.

Our work on the compiler has, in fact, produced four interim variants :-

     a.   A version which runs under Win32 and Linux.  We have not been
     able yet to run distributed programs on these hosts.  At the time of
     writing it is not intended to pursue this any further.

     b.   A version of a above which runs under RISCOS on a StrongARM
     processor, modified for OS file path naming and to circumvent problems
     with the heap-based execution stacks!

     c.   A version of a which is having its back end replaced (and a few
     necessary fixes to its middle!) by a version producing x86 assembler
     source text for the gasm assembler.

     d.   A version of a which has been modified to include the semantics
     of AVAL{BIT} and other changes needed to handle multi-octet codes.
     This has involved the revision of CHAR to be a 32-bit entity and the
     revision of STR to include width and culture references.

Once all of these versions are fully tested they will be merged into a
more portable version using the new Required Library, a compiler which it
is hoped will be of production quality.  It is intended that this will be
done by specifying each feature in VDM-SL, then using that as the basis for
pre/post conditions and implementation, and then rigorously testing
according to a formal specification of the language (again using VDM-SL
with the IFAD VDM toolkit to help!).
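
To indicate what carrying a specification through to pre/post conditions
might look like, here is a minimal sketch in Python (illustrative only: the
function and its specification are invented for this example, no VDM-SL is
shown, and Sather has its own pre/post syntax).  The precondition states
what the caller must guarantee and the postcondition states what the result
must satisfy, independently of how the body computes it.

```python
def isqrt_checked(n):
    """Integer square root with explicit pre- and postconditions."""
    # Precondition: the argument is a natural number.
    assert isinstance(n, int) and n >= 0, "pre: n must be a natural number"

    # Deliberately simple implementation: count up until the square exceeds n.
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1

    # Postcondition: r is the largest natural number whose square is <= n.
    assert r * r <= n < (r + 1) * (r + 1), "post: r is not the integer root"
    return r


if __name__ == "__main__":
    print([isqrt_checked(n) for n in (0, 1, 15, 16, 17)])
```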

[Incidentally, that Toolkit will generate C++ - but why couldn't it
generate Sather instead?]

The library design, together with the design of the cultural Resources
mechanism, has led to a need to separate the implementation from the
language as far as possible.  We have therefore separated those classes
necessary to the language specification from those in the Required Library.
In so doing we have identified a lot of Required Library classes.  The
classes in this library are required to implement the basic O-O language
functionality in the presence of culturally dependent specifications of
features.  This has been done in accordance with the developing
international standards in that field.

Language Classes
----------------
The Sather language is therefore defined as having the following classes :-

     a.   Abstractions

          (1)  $OB - the parent of all classes.

          (2)  $LOCK - the parent of concurrent synchronisation classes.

          (3)  $PORT - the parent of all device/channel mechanism classes.

          (4)  $EXT_IDENT - the parent of all references/handles used in
          the execution environment name space.

     b.   Implementations

          (1)  BIT - the component out of a sequence of one or more of
          which a value may be created.  It has two conversion routines
          from and to a numeric value in which 'set' is equated to the
          numeric value 1, 'clear' to the numeric value 0 - irrespective of
          their actual representation!

          (2)  BOOL - this class is required to implement a stored program
          of any kind involving choices.  It has language defined values
          'true', 'false'. The string representations of these values yield
          corresponding names in the culture in which a program is
          executing.  An object or returned value of this class may be
          allocated arbitrary bit-patterns for storage purposes.  These are
          implementation dependent and need not be the same storage
          patterns as those used in other places or on other occasions
          which might arise.

          (3)  REFERENCE - this class derives from $EXT_IDENT and is an
          implementation of a handle provided by the OS environment in which
          a program runs.  Apart from this semantic association a value of
          type REFERENCE is otherwise an uninterpretable bit-pattern.

     c.   Constructors

          (1)  AVAL - constructs a value as a contiguous bit-pattern from
          the indicated number of immutable objects - eg BIT (but also,
          say, INT - but NOT BOOL!!).

          (2)  AREF - constructs an array of objects of the kind indicated
          by the element class.  The array so created is stored in an
          implementation-defined manner.

          (3)  "TUP" - we propose to retain the existing class for now, but
          are not very happy with it.  We would rather envisage a language
          specification which stated that in any tuple class the attributes
          will be in adjacent locations in the order of occurrence in the
          source text, separated only by any padding needed to satisfy
          implementation-defined storage alignment requirements.  This
          means in practice that AVAL{BIT} with 8, 16 and 32 - even 64? -
          bits could be placed contiguously.  This is very important when
          handling external device chip interfaces.
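
The relationship between BIT and AVAL{BIT} described above can be sketched
as follows, in Python and purely for illustration (the class and function
names are hypothetical and this is not the Required Library interface).
The stored representation of a bit is deliberately not 0 or 1, to show that
only the two conversion routines fix the numeric meaning of 'set' and
'clear'; a value is then an immutable, fixed-length sequence of such bits.

```python
from enum import Enum


class Bit(Enum):
    # The stored representation is arbitrary; only the conversions matter.
    CLEAR = "clear"
    SET = "set"

    def to_int(self):
        """'set' converts to 1 and 'clear' to 0."""
        return 1 if self is Bit.SET else 0

    @staticmethod
    def from_int(n):
        """Convert the numeric value 0 or 1 to a bit."""
        if n not in (0, 1):
            raise ValueError("a bit is created only from 0 or 1")
        return Bit.SET if n == 1 else Bit.CLEAR


def aval_from_int(value, width):
    """Build an immutable width-bit value, most significant bit first."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the requested width")
    return tuple(Bit.from_int((value >> i) & 1)
                 for i in reversed(range(width)))


def aval_to_int(bits):
    """Read a sequence of bits back as a cardinal number."""
    result = 0
    for b in bits:
        result = (result << 1) | b.to_int()
    return result


if __name__ == "__main__":
    octet = aval_from_int(0xA5, 8)
    print([b.to_int() for b in octet], hex(aval_to_int(octet)))
```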

Library Classes
---------------
The Required Library is divided into 18 sections as follows :-

     Basic          Binary         Codes          Concurrent
     Containers     Cultural       Date-Time      FileSys
     Geometric      IO             LowLevel       NonNumeric
     Numeric        Opsys          Represent      SatherRT
     Strings        Text

 - a grand total of 501 classes in 243 files!

An Excel spreadsheet of inheritance and another for inclusions are
available on request - but, be warned - they each occupy over six square
metres of wall space at half scale!

The following are a few brief explanatory notes about the library
sections :-

     a.   Basic contains the definition of the language classes for
     interfacing purposes plus the abstractions which were in the original
     distribution library Base section.

     b.   Binary - probably self-explanatory - BITs, OCTETs, HEXTETs along
     with BINSTR, FBINSTR, etc.

     c.   Codes - all to do with character codes and code strings and code
     mapping.

     d.   Concurrent - all of the Sather 1.2 concurrent library updated for
     cultural and other changes.

     e.   Containers - divided into sub-sections!

     f.   Cultural - value formatting descriptors, ordering dependencies,
     repertoires and other cultural resources.

     g.   Date-Time - dates, times, elapsed, time-stamps, etc.

     h.   FileSys - directories, labels, file and search paths.

     i.   Geometric - included to permit page sizes to be specified as a
     SIZE value for internationalisation purposes.

     j.   LowLevel - for trap/svc/swi, etc interfacing to an OS, etc.

     k.   NonNumeric - an enumeration facility, bit-sets, bool address,
     etc.

     l.   Numeric - including ranges, rat and money!

     m.   Opsys - A portable interface abstraction.

     n.   Represent - all value conversions to and from text strings -
     principally as partial classes.

     o.   SatherRT - SYS only at the moment.

     p.   Strings - Abstract and partial classes.

     q.   Text - what it says - but also RUNE,RUNES,FRUNES!!


kh

-----------------------------------------------------------------------------
From: nobbi@cheerful.com
Newsgroups: comp.lang.sather
Subject: Re: Learning Sather
Date: Mon, 31 May 1999 06:56:21 +0200

Michael Dingler <mdingler@mindless.com> wrote:
> Hi folks,
> 
> I just wanted to ask -- with the ongoing transition to
> a 'GNU Sather' -- whether learning Sather now seems
> sensible. 
> Having to learn yet another OO language when the next
> compiler version considerably breaks even the syntax
> isn't what I was thinking of (C++ scarred me enough ;)

The syntax will definitely not be broken. Maybe there'll be changes to it
somewhere in the future (Not in the next few versions!) But they'll only
affect details. You'll not have to unlearn anything you know about the
language.

Only drawback in learning Sather: You may have a hard time going back to C++
or any other "OO"-language because you'll miss many of the features Sather
offers. :-)

On the other hand - even your C++-Programming may improve because
you learn to *think* real OOP.

> So how much will be changed when going from Sather 1.2b
> to GNU Sather? Just the implementation and libraries or
> the whole language? How long do I have to wait for a
> stable version? (of the language, not the compiler)

ICSI 1.2b will become GNU 1.2.0 soon and the 1.2.x-line will be used only for
stabilization (even though 1.2b is very usable already). Parallel to that,
the next release is being prepared down under. It is hard to say how long we
will stay in beta with that. Maybe sometime in fall this year? But as I said
before - this should not affect anybody learning the language.

Ciao,
Nobbi

Copyright (C) 2000 Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111, USA
Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved.