Copyright © 1994 by The OPAL Group
This manual describes how to add handcoded structures to the OPAL Compilation System Version 2.
OPAL is a scheme language, by which we understand that it provides a priori no more than the essential builtin data types of booleans and textual denotations. Consequently, the OPAL compiler must support a way to add handcoded structures to the compilation system without losing efficiency compared to a language in which the data types defined by these structures are built in.
There is a second -- even more important -- reason which is independent of OPAL being a scheme language. In most cases software is not constructed from scratch, but is embedded in and does embed existing software: the operating system, the user interface, the database manager and so on. This calls for a well-defined interlanguage working interface. OPAL programs must be able to access existing software and must themselves be accessible from other software. Nowadays C is the de facto standard for system programming; hence, it is desirable that OPAL programs can cooperate with C software and vice versa.
The OPAL compilation system Version 2 uses ANSI C as its target language. The decision to use C has been driven by the demand to produce highly platform-independent code, while preserving some access to machine oriented features like pointer calculations. Although this approach is not totally satisfactory, since for example the generated C code is (redundantly) parsed and context-checked by the C compiler, we expect that it will survive for some time.
So we are in the happy situation that the target language -- to be used for handcoding -- and the foreign language -- to be used to connect to existing software -- coincide. Nevertheless, the handcoding scheme documented in this manual is well prepared for future changes of the target language. Only those features crucial for the efficiency of handcoded low-level data types will change if the target language changes; those are grouped mainly around the inlining of function definitions via the C macro-preprocessor. Handcoded structures which avoid these features will probably port to future versions of OCS.
Before you plan to handcode an OPAL structure, be aware that this is a job for the brave. You should be familiar with writing C programs as well as with writing OPAL programs. You should have some idea of how functional languages are compiled. On the level of handcoding the (weak) type discipline of C is completely broken; this is necessary to model the richer type system of OPAL. Expect to deal with a lot of tedious and tricky low-level memory management -- stuff the compiler normally handles automatically for you. Be aware that debugging handcoded structures, in particular if they are concerned with more complex data objects, is a tedious task. The usual crash you have to expect occurs in one of the memory management functions of the runtime system, because a dangling pointer has been accessed that was introduced in a completely different part of the program.
A handcoded structure consists of four source parts, each kept in a separate file. The signature and implementation parts are ordinary OPAL documents. Additionally, there are a handcoded interface and a handcoded implementation part, both written in C. The handcoded source parts are combined internally with the C interface and implementation parts derived from the OPAL sources.
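The four source parts of a handcoded structure are:
Struct.sign
    The OPAL signature part.
Struct.impl
    The OPAL implementation part. The pragma /$ handcoded $/ appears somewhere in the file. (1).
Struct.hc.h
    The handcoded interface part, written in C.
Struct.hc.c
    The handcoded implementation part, written in C.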
Let us take a look at a simple handcoded structure. Not all the information needed to understand this example in detail has been presented yet, but it might serve to give you a first taste.
As an example we use a simplified version of the structure DEBUG
from the OPAL Standard Library, which provides a function for
side-effect prints. The OPAL signature part looks as follows:
    SIGNATURE DEBUG[data]
    SORT data
    FUN PRINT : bool**denotation**data->data
The OPAL implementation uses a handcoded internal function
print
to realize the side-effect print:
    IMPLEMENTATION DEBUG
    /$ handcoded $/
    DEF PRINT(true,msg,data) == print(msg,data)
    DEF PRINT(false,msg,data) == data
    -- function to become handcoded:
    FUN print: denotation ** data -> data
The handcoded interface part is empty in this example:
/* hand-coded interface part of DEBUG */
The handcoded implementation part makes use of the runtime system function get_denotation, which converts an OPAL object of sort denotation to a C string, and a global char buffer named charbuf (see Denotations).
    /* hand-coded implementation part of DEBUG */
    #include <stdio.h>

    extern OBJ _ADEBUG_Aprint(OBJ msg,OBJ data) /* print */
    {
      fprintf(stderr,"DEBUG PRINT:\n");
      get_denotation(msg,charbuf,CHARBUFSIZE);
      fputs(charbuf,stderr);   /* fputs takes the string first, then the stream */
      fputc('\n',stderr);
      return data;
    }

    static init_const_ADEBUG() {}
The OPAL compiler generates C interface and implementation parts from the OPAL parts of a handcoded structure. These files are normally not of interest to the user, but it is useful to understand how the final C source is assembled from the generated and handcoded C parts.
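[OCS/]Struct.h
    The derived C interface part. It includes the handcoded interface part Struct.hc.h.
[OCS/]Struct.c
    The derived C implementation part. It includes the handcoded implementation part Struct.hc.c.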
To continue the example from the last section (see A Look and Feel Example),
the derived C interface for DEBUG
follows. We have added comments
to illustrate the general scheme:
    #ifndef ADEBUG_included
    #define ADEBUG_included

    /* Name aliases */
    #define __ADEBUG_APRINT __ADEBUG_1
    #define _ADEBUG_APRINT _ADEBUG_1
    #define ADEBUG_1 ADEBUG_APRINT

    /* Extern Declarations */
    extern OBJ __ADEBUG_APRINT;
    extern OBJ _ADEBUG_APRINT(OBJ,OBJ,OBJ);

    /* Inclusion of handcoded part */
    #include "DEBUG.hc.h"

    /* Default Definition of Macro Implementations */
    #ifndef ADEBUG_APRINT /* May be overwritten in hc.h part */
    #define ADEBUG_APRINT(x1,x2,x3,r) {r=_ADEBUG_APRINT(x1,x2,x3);} /* Map to function implementation */
    #endif

    #endif /* ADEBUG_included */
The derived C implementation is the following:
    /* Inclusion of runtime system definitions */
    #include "BUILTIN.h"

    /* Inclusion of imported and own definitions */
    #include "DEBUG.h"

    /* Name Aliases for internal functions */
    #define __ADEBUG_Aprint __ADEBUG_2
    #define _ADEBUG_Aprint _ADEBUG_2
    #define ADEBUG_2 ADEBUG_Aprint

    /* Function Declarations and Closure Variable Definitions */
    extern OBJ _ADEBUG_APRINT(OBJ,OBJ,OBJ); OBJ __ADEBUG_APRINT; /* PRINT */
    extern OBJ _ADEBUG_Aprint(OBJ,OBJ); OBJ __ADEBUG_Aprint; /* print */

    /* Inclusion of handcoded implementation part */
    #include "DEBUG.hc.c"

    /* Generated C code for PRINT */
    extern OBJ _ADEBUG_APRINT(OBJ x1,OBJ x2,OBJ x3) /* PRINT */
    {...}

    /* Closure Evaluation Entries */
    ....

    /* Initialization Entry */
    init_ADEBUG(){
      static int visited=0;
      if(visited) return;
      visited=1;
      ...
      init_const_ADEBUG();
    }
The OPAL compiler is capable of generating templates for the handcoded parts of a structure. This is supported by the OCS drivers ocs and ors. The intended procedure is as follows:
/lib/om/tmpls/SysDefs.subhc.tmpl
). Most of the settings to
be filled in correspond to those of ordinary OPAL top-level or
subsystem templates (See Users Guide to OPAL). Only the
variables
ocs
or ors
-- you must use the system description template.
DEF f(x) == f(x)
), since the compiler decides for which
functions templates are generated on the basis of whether a function is
implemented or not.
ocs
resp. ors
. The OPAL parts of the handcoded
structure will be compiled to the derived files (usually located in the
OCS
subdirectory) as well as to two files named
Struct.hc.h.tmpl
and Struct.hc.c.tmpl
.
ocs
will stop with a message
like `Don't know how to make target Struct.hc.c'. This is
perfectly okay, since ocs
actually doesn't know -- it's your job.
You usually now derive the handcoded parts from the templates. The
template for the header is just empty. The template for the
implementation file contains definitions for all unimplemented
functions, with the bodies filled in by a HLT
statement.
By default, the OPAL compiler does not generate interfaces to access plain structures from within handcoded structures. To instruct the compiler to do so, you must use the same system description file as for handcoded systems (see Maintaining Handcoded Structures), and fill in the variable NORMSTRUCTS with those structures you wish to access from handcoded structures. For these structures the compiler generates derived interfaces [OCS/]Struct.h, which behave similarly to those for handcoded structures -- except that the handcoded interface and implementation parts are regarded as empty, and thus not included.
OPAL functions are referred to by numeric names in the generated C code. To enable comfortable handcoding, the derived C parts of a handcoded structure declare symbolic aliases for the numeric names. However, the symbolic names cannot be one-to-one translations, since the lexical rules of OPAL and C are not compatible. Furthermore, the namespace of C programs is flat, which requires augmenting names with the originating structure. Last but not least, overloading of names originating from the same structure has to be coped with.
A transliteration maps each valid OPAL identifier to a valid C identifier. The method is as follows.
    Ident    ::= Fragment ( `_' [ Fragment ] )*
    Fragment ::= Letgit+ `?'* | Special+
(The character class Letgit+ is defined in the report The Programming Language OPAL). Note that `_' is itself a special letter. Ambiguity is resolved as follows: if the current fragment consists of specials, underlines are taken to be members of this fragment, except a trailing underline which introduces the next letgit fragment. If the identifier ends with a letgit fragment followed by a single `_', this is interpreted as the introduction of an empty trailing special fragment.
    Opal  !  @  #  $  %  ^  &  *  +  |  ~  -  =  \  `  {  }  [  ]  :  ;  <  >  .  /  ?  _
    C     1  2  3  4  5  6  7  8  p  S  t  m  e  b  q  O  C  o  c  i  a  l  g  d  s  _  u
If you have a US keyboard, note the correlation between the numeric keys and the special characters on the corresponding shifted keys.
The following table shows some normal and some pathological examples of transliterated identifiers:
    Opal        C
    DEBUG       ADEBUG
    empty?      Aempty_
    ::?         Sii_
    ::?__       Sii_uu
    ::?__foo_   Sii_u_Afoo_S
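The transliteration can be sketched in C as follows. This is only an illustration, not the compiler's actual code: the function names `transliterate`, `map_special` and `is_letgit` are hypothetical, letgits are approximated by `isalnum`, and the `A` and `S` fragment prefixes are inferred from the example table rather than taken from a stated rule.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Map one OPAL special character to its C letter, following the character
   table of this section; returns 0 if c is not an OPAL special. */
static char map_special(char c)
{
    static const char opal[] = "!@#$%^&*+|~-=\\`{}[]:;<>./?_";
    static const char cmap[] = "12345678pStmebqOCocialgds_u";
    const char *p = (c != '\0') ? strchr(opal, c) : NULL;
    return p ? cmap[p - opal] : 0;
}

/* Assumption: a letgit is an ASCII letter or digit. */
static int is_letgit(char c) { return isalnum((unsigned char)c) != 0; }

/* Transliterate the OPAL identifier id into out (capacity cap).
   Returns 0 on success, -1 on a malformed identifier or a full buffer. */
int transliterate(const char *id, char *out, size_t cap)
{
    size_t i = 0, o = 0, n = strlen(id);
#define PUT(ch) do { if (o + 1 >= cap) return -1; out[o++] = (ch); } while (0)
    if (cap == 0) return -1;
    while (i < n) {
        if (is_letgit(id[i])) {                 /* Letgit+ `?'* */
            PUT('A');
            while (i < n && is_letgit(id[i])) PUT(id[i++]);
            while (i < n && id[i] == '?') { PUT('_'); i++; }
            if (i < n && id[i] != '_') return -1;
        } else {                                /* Special+ */
            PUT('S');
            while (i < n && !is_letgit(id[i])) {
                char m;
                /* an underline directly before a letgit is a separator */
                if (id[i] == '_' && i + 1 < n && is_letgit(id[i + 1])) break;
                m = map_special(id[i]);
                if (m == 0) return -1;
                PUT(m);
                i++;
            }
        }
        if (i < n && id[i] == '_') {            /* fragment separator */
            PUT('_');
            i++;
            if (i == n) PUT('S');               /* empty trailing special fragment */
        }
    }
    out[o] = '\0';
    return 0;
#undef PUT
}
```

Called with the identifiers from the example table, this sketch yields the C names listed there; identifiers outside the grammar are rejected with -1.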
Each OPAL function has an associated basename. The basename is used to generate the names of the different C objects associated with an OPAL function, see Functions. The basename is constructed as follows:
Caution: The naming scheme for overloaded functions from the same structure is rather dangerous. You might add an overloaded function to an existing handcoded structure if it is declared textually after the current versions, or you might delete the textually last version. In all other cases you have to watch out for changed names and update your sources accordingly. If you make a mistake, there is no one who complains. We recommend not to use overloaded names at all for handcoded functions. You can always perform a simple renaming in the OPAL implementation part to avoid overloaded names in the handcoded part, for example:
    FUN f: s -> s
    FUN f: t -> t
    -- rename to avoid overloading
    FUN f_s: s -> s
    FUN f_t: t -> t
    DEF f == f_s
    DEF f == f_t
    -- f_t and f_s handcoded
One of the main differences between functional and imperative languages is the treatment of data objects: in functional languages the notions of memory and reference do not exist and data types are described by rather abstract and possibly recursive definitions. In imperative languages the notions of memory and reference are most important, since dynamic objects can only be represented by pointer structures. Moreover, deallocation of memory which is no longer used has to be performed manually.
Clearly, the difference in the handling of data objects is the factor which complicates a powerful interlanguage working interface between functional and imperative languages.
In OCS, every OPAL data object is represented by a uniform C
object of type OBJ
. This uniformity is necessary to support
parametric resp. polymorphic function and data type
definitions. Usually, type OBJ
is defined in C as void *
,
but you should treat it as an abstract entity.
An object of type OBJ either stands for a primitive, self-contained value, or is a reference to a structured heap cell. The first kind is called a primitive object for short, the second kind a structured object, which subsumes the reference as well as the heap cell.
Each value of type OBJ
is equipped with a type tag in order to
distinguish the two kinds.
OCS uses an enhanced lazy reference counting approach to memory management which results in a residual garbage collector, that is, the garbage collector is compiled online into the code. This makes handcoding more tedious, since the handcoder also has to perform garbage collection within their code, as long as they calculate with OPAL data objects. More exactly, the handcoder is responsible for correctly maintaining the reference counter.
A primitive object stands for a self-contained value. To access the value the object must be unpacked, that is, the type tag must be removed. To create a primitive object the value must be packed, i.e. the type tag must be set. Be aware that passing around unpacked values as OBJs almost certainly crashes the program, since they may be interpreted as references by the runtime system.
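To make packing and unpacking concrete, here is a minimal sketch of one possible tag encoding. It is an assumption made for illustration only: the real encoding is private to the runtime system, and all `toy_` and `TOY_` names are hypothetical stand-ins for OBJ, WORD and the runtime system's pack and unpack functions.

```c
#include <stdint.h>

typedef void *TOY_OBJ;        /* stand-in for the runtime system's OBJ */
typedef uintptr_t TOY_WORD;   /* stand-in for WORD */

/* Assumed encoding: lowest bit 1 = primitive value, 0 = reference.
   Packing therefore costs one bit of the machine word. */
#define toy_is_primitive(o)  ((((uintptr_t)(o)) & 1u) != 0)
#define toy_pack_word(w)     ((TOY_OBJ)((((uintptr_t)(w)) << 1) | 1u))
#define toy_unpack_word(o)   ((TOY_WORD)(((uintptr_t)(o)) >> 1))

/* Successor on packed values: unpack, compute, pack again. */
static TOY_OBJ toy_succ(TOY_OBJ n)
{
    return toy_pack_word(toy_unpack_word(n) + 1);
}
```

Under this encoding an unpacked even number looks exactly like a reference, which illustrates why unpacked values must never be passed around as objects.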
The following macro tests whether an object is primitive:
There are several functions to handle primitive objects which represent unsigned integers:
WORD
, usually 31.
These are the functions to handle primitive objects which represent signed integers:
In some situations it might be useful to treat a pointer, returned by a C library function or explicitly created by malloc, as a primitive object. A typical case is a pointer of type FILE* from the C standard library.
Pointers can be packed and unpacked:
Be aware that these functions rely on pointers always pointing to even byte boundaries. The following macro symbol tells you whether this is not the case, and you must explicitly align pointers by yourself (or do not use this feature at all):
malloc
are aligned on even byte boundaries.
Currently, we are not aware of any UNIX architecture which requires this flag to be set.
Please note that memory allocated from the runtime system is always correctly aligned. See Allocating Auxiliary Memory.
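The dependence on even pointers can be illustrated with a small sketch. It assumes, purely for illustration, that a pointer is marked as primitive by setting its lowest bit; the `toy_` names are hypothetical and do not claim to mirror the runtime system's actual definitions.

```c
#include <stdint.h>

typedef void *TOY_OBJ;   /* stand-in for the runtime system's OBJ */

/* Assumed encoding: set the lowest bit to mark the pointer as primitive.
   This only works if the pointer is even, i.e. that bit is 0 to begin with;
   otherwise unpacking would return a different address. */
#define toy_pointer_is_even(p)  ((((uintptr_t)(p)) & 1u) == 0)
#define toy_pack_pointer(p)     ((TOY_OBJ)(((uintptr_t)(p)) | 1u))
#define toy_unpack_pointer(o)   ((void *)(((uintptr_t)(o)) & ~(uintptr_t)1))
```

A packed pointer must be unpacked before it is handed back to the C library, for example before passing a FILE* to fclose.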
Structured objects are references pointing to heap cells. Structured objects in this sense encompass both the reference as well as the cell.
OCS distinguishes between deep, flat, and byte-flat structured objects. The cell of a deep object consists of other objects which are subject to garbage collection. A flat object consists of arbitrary data which is not subject to garbage collection. A byte-flat object may in most cases be identified with a flat object; an exception is persistent binary data storage and network interchange, where byte-flat objects have to be distinguished from normal flat objects to allow the exchange of byte streams between architectures with different endianness.
Each of the three kinds of objects is furthermore partitioned into small and big objects. Small objects are allocated and deallocated using an array of size-specific free lists; big objects share the same free list regardless of their size.
The kinds of structured objects are determined at allocation time. There is neither a possibility nor a need to test object kinds. There is, however, one macro which generally tests whether an object is structured:
The layout of a structured object is declared by a plain C struct
declaration. As the first component, however, you have to include a
header predefined in the runtime system.
There are two kinds of headers, one for definitely small objects, and one for objects which may be either small or big. The first kind is normally used for objects of fixed size (which are not bigger than 127 words), the second for objects with dynamically growing size. Small objects require one word less of memory than big objects with respect to the overhead of memory management.
The following header is used for declaring a small structured object:
A typical example is the declaration of reals:
    typedef struct sREAL {
      struct sCELL header;
      double value;
    } * REAL;
Please note that this must become a flat object, since the double value should not be subject to garbage collection. However, the flatness of a structured object is determined at allocation time, see Allocating Structured Objects.
The following header is used for declaring a big structured object:
A typical example is the declaration of dynamically sized arrays:
    typedef struct sARRAY {
      struct sBCELL header;
      /* ... data ... */
    } * ARRAY;
The region following the header will hold the array's data. Most probably this will become a deep structured object, since the array data consists of objects again.
There are two macros associated with this kind of cells which allow you to access the data as an array of objects and the size of this array.
The deepness or flatness of structured objects is determined at allocation time. This information is stored in the cell's header, such that deallocation can be performed without knowledge about deepness or flatness. This is crucial to realize parametric resp. polymorphic objects.
Six macros for allocation of structured objects are predefined. The size argument must always be calculated with one of the size macros given below.
There are several macros to determine the required size of a structured object:
sizeof_small(struct sREAL)
.
size_data(n * sizeof(OBJ))
where n is the number of
objects which shall fit in the data area.
The approach of reference counting to garbage collection is rather simple. The cell of a structured object contains a counter holding the number of all the references pointing to it, the so-called reference counter. If this counter drops to zero, the cell becomes isolated and can be collected as free memory.
To make the scheme work, every time a new reference is created, the reference counter must be incremented, and every time one is deleted, it must be decremented. Creating a reference is done by one of the allocation functions -- in which case the reference counter is correctly initialized -- or by copying an existing reference. Deleting a reference happens finally when the life-time of an automatic variable holding a reference expires.
The reference counting scheme relies on the fact that there are no cyclic dependencies between structured objects. The compiler ensures this by construction; the handcoder is responsible for avoiding cyclic dependencies in the handcoded structures.
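The counting discipline can be sketched with a toy cell type. This is a simplified model and not the runtime system's implementation: `toy_cell`, `toy_alloc`, `toy_copy`, `toy_free` and `toy_live_cells` are hypothetical names, and memory comes from the C library instead of the OCS memory pool.

```c
#include <stdlib.h>

struct toy_cell {
    unsigned rc;      /* reference counter */
    int payload;
};

static int toy_live_cells = 0;   /* cells allocated and not yet collected */

/* Allocation creates the first reference, so the counter starts at 1. */
struct toy_cell *toy_alloc(int payload)
{
    struct toy_cell *c = malloc(sizeof *c);
    if (c == NULL) return NULL;
    c->rc = 1;
    c->payload = payload;
    toy_live_cells++;
    return c;
}

/* Record n additional references to c. */
void toy_copy(struct toy_cell *c, unsigned n) { c->rc += n; }

/* Give up n references; the cell is collected when none remain. */
void toy_free(struct toy_cell *c, unsigned n)
{
    c->rc -= n;
    if (c->rc == 0) {
        free(c);
        toy_live_cells--;
    }
}
```

A cell that is part of a reference cycle would never reach a counter of zero in this model, which is why cycles must be avoided.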
The runtime system provides several macros to copy and delete references. The most general macros are used in such situations in which it is not clear whether an object is structured or primitive:
If it is known that an object is structured, the more specific macros for reference counting should be used.
Selective Updating is an important source of optimization used in the OPAL compiler. The following macros may be used to realize it on the level of handcoding:
A typical example how to use these macros for selective update is an update function on handcoded arrays:
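    OBJ _AArray_Aupd(OBJ A, OBJ i, OBJ x){
      OBJ result;
      if (excl_structured(A,1)){
        /* A is referenced exclusively: update it in place */
        result=A;
      } else {
        /* A is shared: work on a private copy and give up our reference */
        result=duplicate_array(A);
        decr_structured(A,1);
      }
      FREE(data_big(result)[unpack_word(i)],1);
      data_big(result)[unpack_word(i)] = x;
      return result;
    }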
Here, duplicate_array
is an auxiliary function which duplicates
an array.
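The same decision between in-place update and copying can be tried out in a self-contained model. The sketch below assumes a toy array of plain integers with an explicit counter; `toy_array`, `toy_array_new` and `toy_array_upd` are hypothetical names, and the elements are not objects, so no element needs to be freed.

```c
#include <stdlib.h>
#include <string.h>

struct toy_array {
    unsigned rc;     /* reference counter */
    size_t len;
    int data[];      /* C99 flexible array member */
};

/* Create a zero-filled array that is referenced exactly once. */
struct toy_array *toy_array_new(size_t len)
{
    struct toy_array *a = calloc(1, sizeof *a + len * sizeof(int));
    if (a == NULL) return NULL;
    a->rc = 1;
    a->len = len;
    return a;
}

/* Update slot i, consuming the caller's reference to a. */
struct toy_array *toy_array_upd(struct toy_array *a, size_t i, int x)
{
    struct toy_array *r;
    if (a->rc == 1) {
        r = a;                       /* exclusive: reuse the cell */
    } else {
        size_t bytes = sizeof *a + a->len * sizeof(int);
        r = malloc(bytes);           /* shared: duplicate first */
        if (r == NULL) return NULL;
        memcpy(r, a, bytes);
        r->rc = 1;
        a->rc--;                     /* give up our reference to the original */
    }
    r->data[i] = x;
    return r;
}
```

The other holders of a shared array never observe the update, which preserves the functional semantics while the exclusive case avoids the copy.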
In order to allow for maximal sources of selective updating, the number of references to some cell should be minimized. The borrowing technique may be used for this purpose. It also allows deep objects to be freed as flat objects; this is significantly faster, since the runtime system no longer has to perform subfrees on the components.
Borrowing is based on the following macros:
Borrowing is not very often performed in handcoded structures, since it is mainly of interest for recursive data types. Those data types should be implemented in OPAL; here the compiler does most probably a better job. Anyway, to illustrate borrowing (as it is also performed by the compiler), we define a recursive sequence-like type as follows:
typedef struct sSEQ { struct sCELL header; OBJ ft; OBJ rt; } * SEQ;
A function which selects both the ft
and the rt
element
from a sequence s
now typically contains the following code:
    ... {
      OBJ ft = s->ft, rt = s->rt;
      if (excl_structured((OBJ)s,1)) {
        /* borrow ft and rt from s, and treat s as flat. */
        dispose_structured_flat((OBJ)s);
      } else {
        copy_some(ft,1); copy_some(rt,1);
        decr_structured((OBJ)s,1);
      }
    }
A function which selects only the rt
element has two
possibilities for borrowing. The first one is to explicitly clear
the rt
component using the constant NIL
:
    ... {
      OBJ rt = s->rt;
      if (excl_structured((OBJ)s,1)) {
        s->rt = NIL;
        dispose_structured((OBJ)s);
      } else {
        copy_some(rt,1);
        decr_structured((OBJ)s,1);
      }
    }
This version has the disadvantage that a flat dispose cannot be performed, since the ft component of s is still bound.
The second version explicitly frees the ft component in order to perform a flat dispose. This version, however, only makes sense in this particular example, since the number of components which must be freed is rather small:
    ... {
      OBJ rt = s->rt;
      if (excl_structured((OBJ)s,1)) {
        free_some(s->ft,1);
        dispose_structured_flat((OBJ)s);
      } else {
        copy_some(rt,1);
        decr_structured((OBJ)s,1);
      }
    }
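The effect of borrowing on the counters can be observed in a small stand-alone model. It is a sketch under simplifying assumptions: the ft component is a plain integer, so only rt is counted, and `toy_seq`, `toy_cons`, `toy_rest` and `toy_disposed` are hypothetical names.

```c
#include <stdlib.h>

struct toy_seq {
    unsigned rc;            /* reference counter */
    int ft;                 /* flat component, needs no reference counting */
    struct toy_seq *rt;     /* counted reference to the rest, or NULL */
};

static int toy_disposed = 0;   /* number of cells released so far */

/* Build a cell; takes over the caller's reference to rt. */
struct toy_seq *toy_cons(int ft, struct toy_seq *rt)
{
    struct toy_seq *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->rc = 1;
    s->ft = ft;
    s->rt = rt;
    return s;
}

/* Select the rest of s, consuming the caller's reference to s. */
struct toy_seq *toy_rest(struct toy_seq *s)
{
    struct toy_seq *rt = s->rt;
    if (s->rc == 1) {
        /* exclusive: borrow s's reference to rt and release only the cell */
        free(s);
        toy_disposed++;
    } else {
        /* shared: rt gains a reference, s loses ours */
        if (rt != NULL) rt->rc++;
        s->rc--;
    }
    return rt;
}
```

In the exclusive case the counter of the rest is left untouched, so a following update on it can still be performed in place.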
In certain circumstances it is useful to fix the storage classes the compiler assigns to handcoded data types in order to give it more exact information about the data type.
Assume, for instance, the data types of natural numbers (nat
) and
arrays (array
) -- both of which are handcoded in the OPAL
standard library. Natural numbers certainly are primitive objects, no
reference handling, therefore, has to be done on natural numbers. In
order to give the compiler the information not to perform reference
counting (e.g. in the functions it generates automatically) a simulated
storage class can be given by
DATA nat == somePrimitive
Arrays always are deep structured objects. To give the compiler this information a simulated storage class can be given by
    SORT dummy
    DATA Array == someStructured(dummy1: dummy, dummy2: dummy)
Note that it is necessary to give at least two components in the
declaration, since the compiler otherwise eliminates the constructor,
and that sort dummy
should not be implemented such that the
compiler does not know its storage class. To simulate a flat structured
object, you would have for example sort bool
instead of
dummy
.
In the functions generated automatically the compiler now can avoid the test whether an array is primitive or structured (e.g. before copying a reference).
Note that this feature will be provided in a much cleaner way by specialized compiler pragmas in the near future. (See The Programming Language OPAL)
We recommend using the following conventions to declare handcoded data types which are exported across structure boundaries. They are also used for the handcoded structures of the OPAL standard library, and allow you and others to deduce the existence and semantics of functions more easily.
With each introduced primitive object type TYPE, supply a type definition and corresponding pack_type and unpack_type functions. For example, the handcoded interface part of the structure Nat from the OPAL library starts as follows:
    /* hand-coded interface part of Nat */

    /* representation */
    typedef WORD NAT;
    #define pack_nat(x) pack_word(x)
    #define unpack_nat(x) unpack_word(x)

    /* macro based implementations */
    ...
With each introduced structured object type TYPE, supply a structure definition sTYPE and a type definition TYPE which is a pointer to the structure, together with the following macros:
For example, the handcoded interface part of the structure Real
from the OPAL library starts as follows:
    /* hand-coded interface part of Real */
    #include <math.h>

    /* representation */
    typedef struct sREAL {
      struct sCELL header;
      double value;
    } * REAL;

    #define alloc_real(r) alloc_small_flat(sizeof_flat(struct sREAL),r)
    #define make_real(v,r) {alloc_real(r); ((REAL)(r))->value = v; }
    #define copy_real(o,n) copy_structured(o,n)
    #define free_real(o,n) free_structured(o,n)
    #define excl_real(o,n) excl_structured(o,n)
    #define decr_real(o,n) decr_structured(o,n)
    #define dispose_real(o,n) dispose_structured(o,n)

    /* macro based implementations */
    ...
The runtime system of OCS uses its own memory allocation methods
and free memory pool. Memory allocated by the runtime system will never
be released such that it can be reused by the C library function
malloc
and its derivatives. Hence, if you require auxiliary
memory for your handcoded implementations, you should use the following
functions for allocating and freeing it:
malloc
, but allocates from
the OCS runtime system memory pool. This function always returns
properly aligned pointers for use of pack_pointer
and
unpack_pointer
(see Pointers as Primitive Objects).
free
, but frees memory
formerly allocated by alloc_aux
.
One of the characteristics of the compilation scheme used by OCS is to use no special function calling protocol, but to imitate as closely as possible the normal C calling conventions. This should enable the C compiler to optimize function calls w.r.t. the specific target architecture. For handcoding of OPAL structures it has the pretty side-effect that handcoded or compiler generated OPAL functions are almost plain C functions.
However, one has to be aware that functions are first-class citizens in OPAL, but not in C. This problem is coped with as follows: for each global OPAL function there exists one C data object, the so-called closure object of type OBJ. This is the representation of a function as a first-class citizen, which may be passed around as a parameter and stored in other data objects. It is subject to reference counting as are other data objects. See Closure Objects.
Furthermore, for each global OPAL function exists a direct call entry. This is actually a plain C function which can only be used in full applications (that is, applications which supply all arguments of a function according to its rank, see Ranks and eta-Enrichment). The direct call entry is what you actually code in C. The closure objects are automatically created by the derived C implementation part of a structure. See Direct Call Entry.
To support the efficient realization of handcoded basic data types there is a third kind of C object associated with each OPAL function: the macro expansion entry. You can supply macro definitions of functions in the handcoded interface part of a structure. If you don't give macro entries, default ones are supplied by the derived interface part. See Macro Expansion Entry.
The different kinds of calling entries only exist for `real' functions with rank greater than zero. The values of OPAL functions of rank zero are evaluated at initialization time of a structure, and stored in a global variable. See Constants.
The rank of a function is the number of arguments supplied in its
definition. For example, the definition DEF @(o)(x,y) == o(x,y)
has
rank 3 and the definition DEF @(o) == o
has rank 1, although both
definitions (nearly) represent the same semantics (if o
represents
the same function in both cases).
Unfortunately, the rank of a function is a private property of the implementation of a structure. If inter-structure optimizations are enabled, this property is propagated and exploited for optimizations (creating recompilation dependencies between structure implementations), but there is currently no way to access it in handcoded structures.
To allow the prediction of function ranks the OPAL compiler
performs what we call eta-enrichment of handcoded
structures. This means that additional arguments are appended to the right
and left hand sides of function definitions according to the arity of
the functionality of the function. If FUN @ : (s ** s -> s) -> s ** s
-> s
, for example, the definition DEF @(o) == o
will be
actually translated to DEF @(o)(x,y) == o(x,y)
. Hence you can
predict the rank from the arity of the functionality which is a
non-private property visible in the signature part of a structure.
(2).
The direct call entry of a function is named _
basename (for
function basenames see Basenames of Functions.) At least this function has to be
supplied for each function which is to be handcoded.
The direct call entry takes as many arguments of type OBJ as the function's rank. If the result of a function is not a tuple, it returns a single OBJ. For example, the following direct entry implements the successor function on natural numbers:
    OBJ _ANat_Asucc(OBJ n) {
      return pack_nat(unpack_nat(n) + 1);
    }
If the result of a function is a tuple, the direct entry returns one of
the predefined tuple structure types TUP
n from the runtime
system:
typedef struct { OBJ c1, c2, .... , cn; } TUPn;
For example, a possible implementation of the function divmod
on
natural numbers is:
    TUP2 _ANat_Adivmod(OBJ n, OBJ m) {
      TUP2 res;
      NAT cn = unpack_nat(n);
      NAT cm = unpack_nat(m);
      res.c1 = pack_nat(cn / cm);   /* tuple components are objects and must be packed */
      res.c2 = pack_nat(cn % cm);
      return res;
    }
Note that when the direct function entry is called, it owns the
references of structured object parameters. If the function uses a
parameter n times by passing it to other functions or returning
it, it has to perform n-1
copies of the reference; if
n is zero, it has to perform one free on the reference. This
naturally generalizes to local objects created in the course of a
computation.
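The ownership rule can be made tangible with a toy model in which the counter is an explicit field. The names `toy_cell`, `TOY_TUP2`, `toy_twice`, `toy_ignore` and `toy_released` are hypothetical and only serve this sketch; a real direct entry would use the runtime system's macros on OBJ values instead.

```c
struct toy_cell { unsigned rc; };                 /* reference counter only */
typedef struct { struct toy_cell *c1, *c2; } TOY_TUP2;

static int toy_released = 0;   /* cells whose counter dropped to zero */

/* Give up one reference to c. */
static void toy_release(struct toy_cell *c)
{
    if (--c->rc == 0) toy_released++;
}

/* Uses x twice (both tuple components): n = 2, so n-1 = 1 copy. */
TOY_TUP2 toy_twice(struct toy_cell *x)
{
    TOY_TUP2 res;
    x->rc += 1;
    res.c1 = x;
    res.c2 = x;
    return res;
}

/* Uses x zero times: the owned reference must be given up. */
int toy_ignore(struct toy_cell *x)
{
    toy_release(x);
    return 0;
}
```

A function that uses its parameter exactly once, for instance by returning it, neither copies nor frees; the reference simply moves on to the caller.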
The macro expansion entry of a function is named basename (for function basenames see Basenames of Functions.) It allows for inline expansion of short function definitions. Inline expansion is crucial for basic data types like natural numbers; handcoded structures making use of this feature, however, may not port to long-term future generations of OCS.
The macro entry takes as many arguments as the function's rank, plus additional l-values used to store the result(s) of the function evaluation. The successor function on naturals may be coded as a macro entry as follows:
#define ANat_Asucc(x,r) {r=pack_nat(unpack_nat(x) + 1);}
For functions with tuple results, consecutive result parameters are supplied:
    #define ANat_Adivmod(n,m,r1,r2){ \
      NAT cn = unpack_nat(n), cm = unpack_nat(m); \
      r1 = pack_nat(cn / cm); r2 = pack_nat(cn % cm);\
    }
You usually place the macro entry of a function in the handcoded
interface part of a structure. Be aware of including headers of other
handcoded structures, if you refer to definitions supplied by them. For
example, if the function divmod
is implemented in a separate
structure DivMod
based on the structure Nat
, the header
has to include Nat.h
:
    /* handcoded interface part of DivMod */
    #include "Nat.h"

    #define ANat_Adivmod(n,m,r1,r2){ \
      NAT cn = unpack_nat(n), cm = unpack_nat(m); \
      r1 = pack_nat(cn / cm); r2 = pack_nat(cn % cm);\
    }
If you have implemented handcoded functions using macros, you still have to supply the direct call entry. This is necessary, since the direct entry is used for constructing the closure object of the function. The usual method in this case is to implement the direct entry using the macro entry:
    OBJ _ANat_Asucc(OBJ n) {
      OBJ res;
      ANat_Asucc(n,res);
      return res;
    }
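The pairing of a macro entry with a direct entry also works for tuple results. The following self-contained sketch assumes a toy tag encoding (lowest bit set for primitives) and uses hypothetical names throughout: `TOY_OBJ`, `TOY_TUP2`, `toy_pack_nat`, `toy_unpack_nat`, `TOY_DIVMOD` and `toy_divmod_entry`. A zero divisor is not handled.

```c
#include <stdint.h>

typedef void *TOY_OBJ;                              /* stand-in for OBJ */
typedef struct { TOY_OBJ c1, c2; } TOY_TUP2;        /* stand-in for TUP2 */

/* Assumed toy encoding of naturals as tagged words. */
#define toy_pack_nat(w)    ((TOY_OBJ)((((uintptr_t)(w)) << 1) | 1u))
#define toy_unpack_nat(o)  (((uintptr_t)(o)) >> 1)

/* Macro entry: arguments first, then one l-value per result. */
#define TOY_DIVMOD(n,m,r1,r2) { \
    uintptr_t cn_ = toy_unpack_nat(n), cm_ = toy_unpack_nat(m); \
    r1 = toy_pack_nat(cn_ / cm_); r2 = toy_pack_nat(cn_ % cm_); \
}

/* Direct entry built from the macro entry; the tuple components
   serve as the result l-values. */
TOY_TUP2 toy_divmod_entry(TOY_OBJ n, TOY_OBJ m)
{
    TOY_TUP2 res;
    TOY_DIVMOD(n, m, res.c1, res.c2);
    return res;
}
```

Keeping the computation in the macro and deriving the function from it avoids maintaining two versions of the same logic.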
Please note, that the compiler uses the macro entries of imported
handcoded structures only if inter-structure optimization is enabled.
This is due to the fact that the pragma /$ handcoded $/
is a
property private to an implementation.
The closure object of a function is named __
basename (for
function basenames see Basenames of Functions.) You never construct closure
objects by yourself; they are constructed by the runtime system for you.
To evaluate a closure, the literal way is to use the macro METHOD
from the runtime system to select an evaluation method and then to call
this method:
OBJ
function (OBJ clos, OBJ arg1, ..., OBJ argn)
;
but you must establish this type by a cast.
Evaluating a closure object with 2 arguments looks as follows:
(* (OBJ (*)(OBJ,OBJ,OBJ)) METHOD(clos,2)) (clos,arg1,arg2)
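A minimal sketch of why the cast is needed. The closure layout (struct demo_closure), the generic code pointer type DEMO_CODE and the METHOD definition below are assumptions made up for this illustration; they are not the runtime system's actual definitions. The point is only that a method table stores entries of one generic pointer type, so the concrete signature has to be re-established before the call:

```c
#include <stdint.h>

typedef uintptr_t OBJ;                 /* stand-in for the runtime's OBJ */
typedef void (*DEMO_CODE)(void);       /* generic code pointer (assumed) */

/* Hypothetical closure layout: a method table and one captured value. */
struct demo_closure {
    DEMO_CODE *methods;                /* indexed by number of arguments */
    OBJ captured;
};

/* Stand-in for the runtime's METHOD macro (assumed definition). */
#define METHOD(clos,n) (((struct demo_closure *)(clos))->methods[(n)])

/* A method for evaluation with 2 arguments: closure first, then args. */
static OBJ demo_add3(OBJ clos, OBJ a, OBJ b) {
    return ((struct demo_closure *)clos)->captured + a + b;
}

/* Only the entry for 2 arguments is filled in for this sketch. */
static DEMO_CODE demo_methods[3] = { 0, 0, (DEMO_CODE)demo_add3 };

static OBJ demo_eval2(OBJ captured, OBJ arg1, OBJ arg2) {
    struct demo_closure c = { demo_methods, captured };
    OBJ clos = (OBJ)&c;
    /* Same shape as the expression in the text. */
    return (* (OBJ (*)(OBJ,OBJ,OBJ)) METHOD(clos,2)) (clos,arg1,arg2);
}
```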
To free you from writing down such boring type casts, the runtime system supplies macros for evaluating closures with up to 8 arguments.
This section is still to be completed. [ Missing: Evaluating closures with tuple results. ]
Generally, a closure consists of a pointer to the direct call entry of a function, a pointer to a table of methods used to evaluate the closure, and any arguments which have been used to construct the function the closure represents. You do not have to understand the details but a short explanation will be useful later on.
If FUN o : (b -> c) ** (a -> b) -> a -> c is the usual function composition, then the application f o g denotes a new function constructed from applying o to f and g, represented by a closure which holds a pointer to the direct call entry of o and the arguments f and g.
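The idea of a closure for f o g can be sketched in plain C. This is a simplified model under stated assumptions: struct demo_compose and all demo_ names are hypothetical, f and g are ordinary C function pointers rather than closures, and reference counting and the method table are left out:

```c
#include <stdint.h>

typedef uintptr_t OBJ;                 /* stand-in for the runtime's OBJ */
typedef OBJ (*DEMO_FUN1)(OBJ);         /* unary function (assumed type)  */

/* Hypothetical closure: the code to run plus the arguments captured so far. */
struct demo_compose {
    OBJ (*entry)(DEMO_FUN1 f, DEMO_FUN1 g, OBJ x);  /* direct entry of o */
    DEMO_FUN1 f;
    DEMO_FUN1 g;
};

/* Direct entry of composition at full rank: (f o g)(x) = f(g(x)). */
static OBJ demo_o(DEMO_FUN1 f, DEMO_FUN1 g, OBJ x) { return f(g(x)); }

/* Applying o to f and g only: nothing is computed yet, the arguments
   are merely stored next to the entry pointer. */
static struct demo_compose demo_make_compose(DEMO_FUN1 f, DEMO_FUN1 g) {
    struct demo_compose c = { demo_o, f, g };
    return c;
}

/* Supplying the missing argument evaluates the closure. */
static OBJ demo_apply(const struct demo_compose *c, OBJ x) {
    return c->entry(c->f, c->g, x);
}

static OBJ demo_inc(OBJ x) { return x + 1; }
static OBJ demo_dbl(OBJ x) { return x * 2; }

static OBJ demo_inc_after_dbl(OBJ x) {
    struct demo_compose c = demo_make_compose(demo_inc, demo_dbl);
    return demo_apply(&c, x);
}
```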
To obtain the particular evaluation method necessary to evaluate a given closure, the table of evaluation methods carried by the closure is indexed by the number of arguments the closure shall be evaluated with. There are basically three kinds of methods for evaluating a closure:
Closures are in principle structured objects and thus need reference counting. Two functions are used for reference counting on closures:
The values of OPAL functions of rank zero are calculated at structure initialization time and stored in global variables. Such a variable is named __basename (for function basenames see Basenames of Functions).
Please note that this name coincides with the closure object name of a function. Actually, a function of rank zero might very well represent a higher-order object, and the variable then holds a closure object. Since eta-enrichment is performed on handcoded structures, the rank is always identical with the arity (see Ranks and eta-Enrichment), so this practically does not occur in the world of handcoding.
A handcoded structure must provide a special static function named init_const_CIde, where CIde is the transliterated identifier of the structure (see Transliteration of OPAL Identifiers). This function initializes all constants of the structure. For example, the structure Nat may contain initialization code as follows:
static void init_const_ANat() {
    __ANat_A0 = pack_nat(0);
    __ANat_Amax = pack_nat(max_word);
}
Mutually dependent constants have to be initialized in a proper order. Since the OPAL compiler has no idea of dependencies of handcoded constants on OPAL coded constants of the same structure, it assumes that there are no such dependencies. Hence, the handcoded constants are always initialized before the other constants and thus cannot refer to them.
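The initialization order can be sketched as follows. OBJ, NAT, pack_nat, unpack_nat and DEMO_MAX_WORD are stand-ins assumed for illustration, and demo_init_structure and demo_derived are hypothetical; in a real structure the surrounding initialization code is generated by the compiler:

```c
#include <stdint.h>

/* Stand-ins, assumed for illustration only; not the runtime's definitions. */
typedef uintptr_t OBJ;
typedef uintptr_t NAT;
#define pack_nat(v)   ((OBJ)((((uintptr_t)(v)) << 1) | 1u))
#define unpack_nat(o) ((NAT)(((uintptr_t)(o)) >> 1))
#define DEMO_MAX_WORD ((NAT)(UINTPTR_MAX >> 1))   /* plays the role of max_word */

/* Global variables holding the values of rank-zero functions. */
OBJ __ANat_A0;
OBJ __ANat_Amax;

/* Handcoded constants of the structure. */
static void init_const_ANat(void) {
    __ANat_A0 = pack_nat(0);
    __ANat_Amax = pack_nat(DEMO_MAX_WORD);
}

/* Hypothetical constant that depends on a handcoded one. */
static OBJ demo_derived;

/* Hypothetical structure initialization: handcoded constants come first,
   so later constants may refer to them, but not the other way round. */
static void demo_init_structure(void) {
    init_const_ANat();
    demo_derived = pack_nat(unpack_nat(__ANat_A0) + 1);
}
```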
The data types of boolean values and textual denotations are built into the runtime system.
The operations on builtin types are declared in BOOL.sign and DENOTATION.sign. However, these are only pseudo structures, and the associated declarations on the C level are incorporated into the runtime system headers, which are automatically included in the handcoded parts.
In this chapter the operations predefined on the two builtin types are briefly described.
The following functions operating on boolean values are available:
The functions true, false, ~, and, or etc., as declared in BOOL.sign, are available as canonical implementations; that is, the (closure) variables, direct entries and macro entries are supplied as usual. But note that the origin is not BOOL but BUILTIN; hence the basename of true is, for example, ABUILTIN_Atrue (see Basenames of Functions).
Since denotations are structured objects, memory management has to be addressed in this case as described in Conventions for Declaring Data Types and Related Functions.
The representation of denotations is based on struct sBCELL (see Declaring Big Objects):
typedef struct sDENOTATION {
    struct sBCELL big;
    OBJ leng; /* packed length */
    /* ... data ... */ /* data */
} * DENOTATION;
The size of this buffer is CHARBUFSIZE, which in the current release equals 1024. You should not call any other function of the runtime system when using this buffer, since it might clobber the buffer.
As for boolean values, canonical C declarations exist for the functions declared in DENOTATION.sign; however, the origin to be used for the basename is BUILTIN rather than DENOTATION.
The OPAL standard library provides a rich set of structures. Some of them are handcoded, and for all structures a handcoding interface exists such that they may be accessed from other handcoded structures.
In the following sections the handcoding interfaces for basic types (Nat, Int, Real), an aggregate type (String), and structures used to access system services (Com, Process) are presented.
Nat.hc.h only defines packing routines but no memory management routines for natural numbers, since naturals are primitive. The handcoding interface part Int.hc.h is very similar -- except of course that all operations work on signed values.
Nat.sign defines natural numbers as a free type with constructors 0 and succ and several arithmetic and boolean operations on them. These operations can be accessed in handcoded structures by their transliterated names, e.g. _ANat_Asucc, as usual (see Transliteration of OPAL Identifiers).
The functions declared in Real.sign can be accessed via their transliterated names as usual.
Real numbers are defined as
typedef struct sREAL {
    struct sCELL header;
    double value;
} * REAL;
Real.hc.h
:
The handcoding interface of the structure String (string.h, string.hc.h) is described here in some more detail.
In contrast to the primitive types described in Basic Types, memory management is essential when using aggregate types in handcoded structures.
While the functions operating on strings which are declared in the signature file String.sign can be accessed by their transliterated names (see Transliteration of OPAL Identifiers), the handcoding interface String.hc.h offers the following routines to perform memory management:
Copies the string o into the buffer s of size n and frees it. This function returns `1' if the string completely fits in the buffer, `0' if not (in which case the string is truncated to the maximal length). The string in the buffer is zero-terminated.
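The return convention described above (1 if everything fits, 0 if truncated, always zero-terminated) can be sketched on plain C strings. demo_copy_to_buffer is a hypothetical helper, not the runtime routine: it takes an ordinary C string instead of an OPAL string object and therefore frees nothing:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper illustrating the fit/truncate convention.
   Copies src into buf (capacity n bytes, terminator included).
   Returns 1 if src fits completely, 0 if it had to be truncated. */
static int demo_copy_to_buffer(const char *src, char *buf, size_t n) {
    size_t len = strlen(src);
    if (n == 0) {
        return 0;                      /* no room even for the terminator */
    }
    if (len < n) {
        memcpy(buf, src, len + 1);     /* fits, terminator included */
        return 1;
    }
    memcpy(buf, src, n - 1);           /* keep the first n-1 characters */
    buf[n - 1] = '\0';                 /* result stays zero-terminated */
    return 0;
}
```

A caller would typically check the return value and fall back to a larger buffer when 0 is returned.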
Array.hc.h declares the following routines (see Conventions for Declaring Data Types and Related Functions):
OPAL uses a monadic approach to input and output. The IO monad is called command and is of sort com'Com. Technically, commands are data types which are passed to the runtime system, which interprets them, thus performing side-effects.
This section describes commands in more detail, shows how commands can be constructed on the handcoding level, and exemplifies their use by presenting how commands are used in the OPAL library to access the services of the process abstraction of UNIX.
Monads are a new and powerful approach to incorporate i/o-handling in functional programming languages.
In OPAL monads are implemented by commands. Commands are terms of a particular free type (defined in structure Com). All functions exhibiting side-effects (possibly by calling other functions with side-effects) are thus characterized by a return type of com[data].
The free type is defined in Com.sign as
TYPE com ==
    yield (ans: ans)
    exit (value: nat)
    call (proc: void -> ans)                 -- embedding side-effect call
    followedBy (com: com, cont: ans -> com)  -- composing commands
    sync (proc: void -> ans)                 -- embedding thread sync call
    resume (cont: ans -> com)                -- resume after sync
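How a runtime can interpret such a term may be sketched in C. This is a deliberately reduced model and not the OPAL runtime: only yield, call and followedBy are covered, answers are a small struct instead of ans, and every demo_ name is hypothetical. It shows that building a command performs no side-effect; the effect happens only when the interpreter reaches a call node:

```c
#include <stddef.h>

/* Hypothetical answer type: success flag plus an integer payload. */
typedef struct demo_ans { int okay; int value; } demo_ans;

typedef enum { DEMO_YIELD, DEMO_CALL, DEMO_FOLLOWED_BY } demo_tag;

/* Hypothetical command term; only the fields of the active variant are used. */
typedef struct demo_com demo_com;
struct demo_com {
    demo_tag tag;
    demo_ans ans;                          /* yield:      the answer    */
    demo_ans (*proc)(void);                /* call:       side-effect   */
    const demo_com *first;                 /* followedBy: first command */
    const demo_com *(*cont)(demo_ans);     /* followedBy: continuation  */
};

/* Interpreter: walks the term and performs the embedded side-effects. */
static demo_ans demo_run(const demo_com *c) {
    for (;;) {
        switch (c->tag) {
        case DEMO_YIELD:
            return c->ans;
        case DEMO_CALL:
            return c->proc();              /* the side-effect happens here */
        case DEMO_FOLLOWED_BY: {
            demo_ans a = demo_run(c->first);
            c = c->cont(a);                /* continue with the next command */
            break;
        }
        default: {
            demo_ans fail = { 0, 0 };
            return fail;
        }
        }
    }
}

/* Example side-effect: bump a global counter and report its new value. */
static int demo_counter = 0;
static demo_ans demo_bump(void) {
    demo_ans a = { 1, 0 };
    a.value = ++demo_counter;
    return a;
}

static const demo_com demo_call_bump = { DEMO_CALL, { 0, 0 }, demo_bump, NULL, NULL };

/* Continuation: yields ten times the value of the previous answer. */
static demo_com demo_result;
static const demo_com *demo_times_ten(demo_ans a) {
    demo_result.tag = DEMO_YIELD;
    demo_result.ans.okay = a.okay;
    demo_result.ans.value = a.value * 10;
    return &demo_result;
}

/* Builds followedBy(call(bump), times_ten) and runs it. */
static int demo_program(void) {
    demo_com prog = { DEMO_FOLLOWED_BY, { 0, 0 }, NULL, &demo_call_bump, demo_times_ten };
    return demo_run(&prog).value;
}
```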
The handcoding of functions with side-effects thus proceeds by the following two-phase scheme:
1. Declare the function on the OPAL level with result type com[data] and implement it by constructing a term of type com using a handcoded function performing the side-effect.
2. Implement the handcoded function itself. In order to adjust the evaluation of functional expressions, the handcoded function has to be implemented as a particular higher-order function (depending on the function on the OPAL level). If the handcoded function should take n parameters p1, ..., pn, and return a value of type data, its functionality has to be FUN handcodedFunction : p1 ... pn -> void -> ans[data]
As an example, consider the implementation of the function close, which closes a UNIX file.
File.sign
FUN close : file -> com[void]
The OPAL signature declares the function "close".
File.impl
DEF close(f) == call(xclose(f))
FUN xclose : file -> void -> ans[void]
The OPAL implementation defines the function by constructing a term of type com which -- in this case -- only calls the handcoded function xclose. Furthermore, it declares the function xclose. Note the extra void parameter.
File.hc.c
Section I/O-Handling Using Commands described the concept of commands and showed how to manage interfacing between OPAL and handcoded functions. In the handcoded interface, however, additional functions and macros are used to construct commands which are defined in the handcoding interface Com.hc.c. These are listed in the following table.
ans_okay_nil
return_okay_nil
return_okay(data)
return_fail(failure)
In addition to the failures declared in UnixFailures.h, users can define additional failure messages using the following function.
OBJ declare_failure_answer(char * message)
return_fail
.
For an example using these functions/macros see I/O-Handling Using Commands.
In order to give a more complete example of interfacing between OPAL and handcoded structures using commands, this section describes the process abstraction implementing that of UNIX in the OPAL standard library.
Process.sign
self, self?, fork, execve, kill, wait, popen, and pclose
Process forking, for instance, has the functionality FUN fork : com[process], since it creates a new process. The process id returned is that of the child in the parent process. The child process can identify itself using the function self?.
Process.impl
wait:
DEF wait == call(\\ _ . LET a == xwait(nil)
                       IN IF okay?(a) THEN okay(1st(data(a)) & 2nd(data(a)))
                                      ELSE fail(error(a)) FI )
FUN xwait: void -> ans[pair'Process]
Here, a term which causes the call of the handcoded function xwait is constructed, which evaluates the answer of xwait using an OPAL expression.
Process.hc.h
defines the packing routines for process ids (pack_process, unpack_process). Since process ids are basically natural numbers, no particular memory management routines are defined.
Process.hc.c
xwait:
extern OBJ _AProcess_Axwait(OBJ x1) /* xwait */
{
    int pid,status;
    pid = wait(&status);
    if (pid == -1) {
        return_unix_failure(errno);
    } else {
        return_okay(_AProcess_Apair(pack_process(pid), pack_nat(status)));
    }
}
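The system call wrapped by xwait can be tried out without the OPAL runtime. The following sketch uses only POSIX calls; demo_wait_child is a hypothetical helper that forks a child which exits with a given code and then collects it with wait, checking for the same -1 error result that xwait turns into a failure answer:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that exits with `code`,
   waits for it, and returns its exit code (or -1 on any failure). */
static int demo_wait_child(int code) {
    int status;
    pid_t child = fork();
    if (child == -1) {
        return -1;                     /* fork failed */
    }
    if (child == 0) {
        _exit(code);                   /* child: terminate immediately */
    }
    pid_t pid = wait(&status);         /* parent: same call as in xwait */
    if (pid == -1 || pid != child) {
        return -1;                     /* corresponds to the failure branch */
    }
    /* The raw status word is an encoding; decode it with the POSIX macros. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that xwait passes the raw status word on to the OPAL level via pack_nat(status); the raw word is not the exit code itself and has to be decoded with macros such as WEXITSTATUS.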
In order to call OPAL programs from C programs, the C routine implementing the OPAL program has to be called as exemplified by the function main defined in _ostart.c.
int main(int argc, char** argv, char** environ){
    OBJ ans;
    start_argc = argc; start_argv = argv; start_env = environ;
    init_ABUILTIN();
    init();
    COPY(command,1);
    return MAIN(command);
}
For this function to work three global variables must be defined:
command
init
command
include
command
Prior versions of the OPAL-compiler -- up to version 2.0h -- used a slightly different handcoding scheme. This section describes the changes and how to upgrade to the handcoding scheme defined in this document.
This chapter is still incomplete.
Keeping the handcoded part independent from the OPAL implementations was a major design decision taken in the OPAL compiler up to version 2.0h. Due to this design decision handcoding had to be done on three (instead of two) levels. In between the proper OPAL and handcoded structures the handcoder had to insert the so-called "RTS"-level.
The definitions of that level have been shifted to the OPAL implementation part of a handcoded structure with compiler version 2.1a which also entailed a simplification of the argument and return value passing scheme.
Consider interfacing to the startup routine of a graphical user interface.
Besides an OPAL function initXmInterface
FUN initXmInterface : com[xmDialog]
DEF initXmInterface == abs(call(mapAns(abs: RTSXmDialog->xmDialog, initXmInterface'RTSXmDialog)))
you had to declare an additional function on the RTS-level and handle the proper translation of argument and return value types.
FUN initXmInterface : RTSunit[RTSXmDialog] -> RTSans[RTSXmDialog]
With the handcoding scheme introduced in version 2.1a, this declaration is completely done on the OPAL level by
FUN initXmInterface : com[xmDialog]
DEF initXmInterface == call(xinitXmInterface)
FUN xinitXmInterface: void -> ans[xmDialog]
In the compiler version 2.1a minor changes to the lexical rules of identifiers (see The Programming Language OPAL) have taken place. The most important one is that underscores (`_') are allowed as part of an identifier in version 2.1a.
Due to this fact the transliteration scheme had to be changed. The old transliteration scheme was as follows:
_
').
`v_' is appended if the generated name denotes a closure variable.
The most important difference thus is that transliterated groups now have to be preceded by an `A' (alphanumeric) or `S' (special).
This section is still to be written.
This document was generated 5 June 2001 (14:11:37) using the texi2html translator version 1.51-kd-pl15.