Before jumping in, note that typemaps are an advanced customization feature that provides access to some of SWIG's internals and its low-level code generator. Typemaps are generally not required to build a simple interface when first starting out, so the material in this chapter will appeal more to users who have already built a few simple interfaces and would like to do more.
Consider a C function like this:

    int foo(int argc, char *argv[]);

If you do nothing, SWIG produces a wrapper that expects to receive a pointer of type char ** as the second argument. For example, if you try to use the function you might get an error like this:

    >>> foo(3,["ale","lager","stout"])
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: Type error. Expected _p_p_char
    >>>

One way to fix this problem is to write a few assist functions to manufacture an object of the appropriate type. For example:

    %inline %{
    #include <stdlib.h>
    #include <string.h>

    /* Allocate an array with room for maxarg string pointers */
    char **new_args(int maxarg) {
        return (char **) malloc(maxarg*sizeof(char *));
    }
    /* Free all narg strings (including element 0), then the array itself */
    void del_args(char **args, int narg) {
        while (--narg >= 0) {
            free(args[narg]);
        }
        free(args);
    }
    /* Store a private copy of value in slot n */
    void set_arg(char **args, int n, char *value) {
        args[n] = (char *) malloc(strlen(value)+1);
        strcpy(args[n],value);
    }
    %}

Now in the scripting language:

    >>> args = new_args(3)
    >>> args
    _000f4248_p_p_char
    >>> set_arg(args,0,"ale")
    >>> set_arg(args,1,"lager")
    >>> set_arg(args,2,"stout")
    >>> foo(3,args)
    >>> del_args(args,3)

Needless to say, even though this works, it isn't the most user-friendly interface. It would be much nicer if you could simply make a list of strings work like a char **. For example:

    >>> foo(3, ["ale","lager","stout"])

An even better approach might implicitly set the argc parameter and allow the following:

    >>> foo(["ale","lager","stout"])

Similar sorts of problems also arise when creating wrappers for small arrays, output values, and certain kinds of data structures.
One of the reasons why SWIG does not provide automatic support for mapping scripting language objects such as lists and associative arrays into C is that C declarations often do not provide enough semantic information for SWIG to know exactly how this should be done. For example, if you have a function like this,

    void foo(double *x, double *y, double *r);

it's not obvious what the arguments are supposed to represent. Are they single values? Are they arrays? Is a result stored in one of the arguments?
Even in our earlier example, there are many possible interpretations of the char *argv[] argument. For example, is the array supposed to be NULL-terminated? Are the elements of the array modified by the underlying C function? Does the first element have any special meaning such as the name of a program or function?
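The ambiguity can be made concrete with a small C sketch. The two functions below are hypothetical (they are not part of SWIG or of the example above): both take a `char *argv[]` argument, yet one relies on an explicit count while the other relies on a NULL terminator. Nothing in the declaration alone tells them apart.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical interpretation 1: argv holds exactly argc valid strings. */
size_t total_length_counted(int argc, char *argv[]) {
    size_t total = 0;
    for (int i = 0; i < argc; i++) {
        total += strlen(argv[i]);
    }
    return total;
}

/* Hypothetical interpretation 2: argv ends with a NULL pointer and
   no count is passed at all. */
size_t total_length_terminated(char *argv[]) {
    size_t total = 0;
    for (size_t i = 0; argv[i] != NULL; i++) {
        total += strlen(argv[i]);
    }
    return total;
}
```

A wrapper that builds the array from a list of strings has to know which convention the C function follows, for instance whether to append a trailing NULL.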
The only thing that SWIG really knows about both of these cases is that the argument is some kind of pointer (and in fact, SWIG is perfectly happy to generate code that simply passes pointers around). Any further interpretation of the pointer's meaning requires a little more information.
    void add(double a, double b, double *result) {
        *result = a + b;
    }

From reading the source code, it is clear that the function is storing a value in the double *result parameter. However, since SWIG does not examine source code, you need to give it additional information for the wrapper code to mimic this behavior. To do this, you can use the typemaps.i library file and write interface code like this:

    // Simple example using typemaps
    %module example
    %include "typemaps.i"

    %apply double *OUTPUT { double *result };
    extern void add(double a, double b, double *result);

The %apply directive tells SWIG that you are going to apply a typemap rule to a type. The "double *OUTPUT" specification is the name of a rule that defines how to return an output value from an argument of type double *. This rule gets applied to all of the datatypes listed in curly braces, in this case "double *result".

When the resulting module is created, you can now use the function like this (shown for Python):

    >>> a = add(3,4)
    >>> print a
    7
    >>>

In this case, you can see how the output value normally returned in the third argument has magically been transformed into a function return value. Clearly this makes the function much easier to use since it is no longer necessary to manufacture a special double * object and pass it to the function somehow.
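As a rough mental model, the effect of the OUTPUT rule resembles the hand-written C function below. This is only a sketch: `add_output` is a hypothetical name, and the code SWIG actually generates also converts values to and from target-language objects.

```c
/* The C function from the text. */
void add(double a, double b, double *result) {
    *result = a + b;
}

/* Hypothetical hand-written equivalent of applying double *OUTPUT:
   the caller no longer supplies the pointer argument. */
double add_output(double a, double b) {
    double temp;         /* storage provided on the caller's behalf */
    add(a, b, &temp);    /* its address is passed as the third argument */
    return temp;         /* the stored value is handed back as a result */
}
```

Calling `add_output(3, 4)` corresponds to `add(3,4)` in the Python session above.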
Once a typemap has been applied to a type, it stays in effect for all future occurrences of the type and name. For example, you could write the following:

    %module example
    %include "typemaps.i"

    %apply double *OUTPUT { double *result };
    extern void add(double a, double b, double *result);
    extern void sub(double a, double b, double *result);
    extern void mul(double a, double b, double *result);
    extern void div(double a, double b, double *result);
    ...

In this case, the double *OUTPUT rule is applied to all of the functions that follow.
Typemap transformations can even be extended to multiple return values. For example, consider this code:

    %include "typemaps.i"
    %apply int *OUTPUT { int *width, int *height };

    // Returns a pair (width,height)
    void getwinsize(int winid, int *width, int *height);

In this case, the function returns multiple values, allowing it to be used like this:

    >>> w,h = getwinsize(wid)
    >>> print w
    400
    >>> print h
    300
    >>>
It should also be noted that although the %apply directive is used to associate typemap rules with datatypes, you can also use the rule names directly in arguments. For example, you could write this:

    // Simple example using typemaps
    %module example
    %include "typemaps.i"

    extern void add(double a, double b, double *OUTPUT);

Typemaps stay in effect until they are explicitly deleted or redefined to something else. To clear a typemap, the %clear directive should be used. For example:

    %clear double *result;      // Remove all typemaps for double *result
The following typemaps instruct SWIG that a pointer really only holds a single input value:

    int *INPUT
    short *INPUT
    long *INPUT
    unsigned int *INPUT
    unsigned short *INPUT
    unsigned long *INPUT
    double *INPUT
    float *INPUT

When used, it allows values to be passed instead of pointers. For example, consider this function:

    double add(double *a, double *b) {
        return *a+*b;
    }

Now, consider this SWIG interface:

    %module example
    %include "typemaps.i"
    ...
    extern double add(double *INPUT, double *INPUT);

When the function is used in the scripting language interpreter, it will work like this:

    result = add(3,4)
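Conceptually, the INPUT rule behaves like the hand-written C function below: each value is copied into a temporary and the address of that temporary is passed on. This is a sketch under that assumption; `add_input` and the temporary names are hypothetical and the real generated wrapper also performs the target-language conversions.

```c
/* The C function from the text: both arguments are read-only inputs. */
double add(double *a, double *b) {
    return *a + *b;
}

/* Hypothetical hand-written equivalent of applying double *INPUT to
   both arguments: plain values go in, pointers are created internally. */
double add_input(double a, double b) {
    double temp1 = a;              /* one temporary per argument */
    double temp2 = b;
    return add(&temp1, &temp2);    /* pass the addresses of the copies */
}
```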
The following rules mark a pointer as an output value:

    int *OUTPUT
    short *OUTPUT
    long *OUTPUT
    unsigned int *OUTPUT
    unsigned short *OUTPUT
    unsigned long *OUTPUT
    double *OUTPUT
    float *OUTPUT

These methods can be used as shown in an earlier example. For example, if you have this C function:

    void add(double a, double b, double *c) {
        *c = a+b;
    }

A SWIG interface file might look like this:

    %module example
    %include "typemaps.i"
    ...
    extern void add(double a, double b, double *OUTPUT);

In this case, only a single output value is returned, but this is not a restriction. An arbitrary number of output values can be returned by applying the output rules to more than one argument (as shown previously).
If the function also returns a value, it is returned along with the argument. For example, if you had this:

    extern int foo(double a, double b, double *OUTPUT);

The function will return two values like this:

    iresult, dresult = foo(3.5, 2)
The following rules mark a pointer as both an input and an output value:

    int *INOUT
    short *INOUT
    long *INOUT
    unsigned int *INOUT
    unsigned short *INOUT
    unsigned long *INOUT
    double *INOUT
    float *INOUT

A C function that uses this might be something like this:

    void negate(double *x) {
        *x = -(*x);
    }

To make x function as both an input and output value, declare the function like this in an interface file:

    %module example
    %include "typemaps.i"
    ...
    extern void negate(double *INOUT);

Now within a script, you can simply call the function normally:

    a = negate(3);         # a = -3 after calling this

One subtle point of the INOUT rule is that many scripting languages enforce mutability constraints on primitive objects (meaning that simple objects like integers and strings aren't supposed to change). Because of this, you can't just modify the object's value in place as the underlying C function does in this example. Therefore, the INOUT rule returns the modified value as a new object rather than directly overwriting the value of the original input object.
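The "new object instead of in-place modification" behavior can be pictured with a hand-written C analogue. This is only a sketch: `negate_inout` is a hypothetical name, and a real wrapper works with target-language objects rather than plain doubles.

```c
/* The C function from the text: modifies its argument in place. */
void negate(double *x) {
    *x = -(*x);
}

/* Hypothetical hand-written equivalent of applying double *INOUT:
   the input is copied, the copy is modified, and the copy is returned.
   The caller's original value is never touched. */
double negate_inout(double value) {
    double temp = value;   /* private copy of the input */
    negate(&temp);         /* C function modifies the copy in place */
    return temp;           /* modified value comes back as a new result */
}
```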
Compatibility note : The INOUT rule used to be known as BOTH in earlier versions of SWIG. Backwards compatibility is preserved, but deprecated.
Rules are applied to specific datatypes and names with the %apply directive:

    // Make double *result an output value
    %apply double *OUTPUT { double *result };

    // Make Int32 *in an input value
    %apply int *INPUT { Int32 *in };

    // Make long *x inout
    %apply long *INOUT {long *x};

To clear a rule, the %clear directive is used:

    %clear double *result;
    %clear Int32 *in, long *x;

Typemap declarations are lexically scoped, so a typemap takes effect from the point of definition to the end of the file or a matching %clear declaration.
An interface file can attach constraints from the constraints.i library file to arguments:

    // Interface file with constraints
    %module example
    %include "constraints.i"

    double exp(double x);
    double log(double POSITIVE);         // Allow only positive values
    double sqrt(double NONNEGATIVE);     // Non-negative values only
    double inv(double NONZERO);          // Non-zero values
    void   free(void *NONNULL);          // Non-NULL pointers only

The behavior of this file is exactly as you would expect. If any of the arguments violate the constraint condition, a scripting language exception will be raised. As a result, it is possible to catch bad values, prevent mysterious program crashes and so on.

The following constraints are available:

    POSITIVE         Any number > 0 (not zero)
    NEGATIVE         Any number < 0 (not zero)
    NONNEGATIVE      Any number >= 0
    NONPOSITIVE      Any number <= 0
    NONZERO          Nonzero number
    NONNULL          Non-NULL pointer (pointers only)

Constraints can also be applied to other datatypes with %apply:

    // Apply a constraint to a Real variable
    %apply Number POSITIVE { Real in };

    // Apply a constraint to a pointer type
    %apply Pointer NONNULL { Vector * };

The special types of "Number" and "Pointer" can be applied to any numeric and pointer variable type respectively. To later remove a constraint, the %clear directive can be used:

    %clear Real in;
    %clear Vector *;
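The idea behind a constraint is simply a check that runs before the underlying C function is called. The sketch below illustrates this in plain C with hypothetical names (`checked_inv`, `check_nonnull`, and the status codes are illustrative only); the real constraints.i rules raise a scripting language exception instead of returning a status code.

```c
#include <stddef.h>

/* Hypothetical status codes standing in for "exception raised". */
enum { CHECK_OK = 0, CHECK_FAILED = -1 };

/* Plays the role of a function declared as: double inv(double NONZERO); */
static double inv(double x) {
    return 1.0 / x;
}

/* NONZERO-style check: reject the value before the call is made. */
int checked_inv(double x, double *out) {
    if (x == 0.0) {
        return CHECK_FAILED;   /* the C function is never reached */
    }
    *out = inv(x);
    return CHECK_OK;
}

/* NONNULL-style check for pointer arguments. */
int check_nonnull(const void *p) {
    return (p != NULL) ? CHECK_OK : CHECK_FAILED;
}
```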
Before diving in, it needs to be stressed that under normal conditions, SWIG does NOT require users to write new typemaps (and even when they are used, it is probably better to use them sparingly). A common confusion among new SWIG users is that they somehow need to write typemaps to handle new types when in fact they really only need to use a typedef declaration. For example, if you have a declaration like this,

    void blah(size_t len);

you really only need to supply an appropriate typedef to make it work. For example:

    typedef unsigned long size_t;
    void blah(size_t len);

Typemaps are only used if you want to change the way that SWIG actually generates its wrapper code, for example, if you needed to express size_t as a string of roman numerals to preserve backwards compatibility with some piece of legacy software. A more practical application is the conversion of common scripting language objects such as lists and associative arrays into C datatypes, for example, converting a list of strings into a char *[] as shown in the first part of this chapter.
Before proceeding, you should first ask yourself if it is really necessary to change SWIG's default behavior. Next, you need to be aware that writing a typemap from scratch usually requires a detailed knowledge of the internal C API of the target language. Finally, it should also be stressed that by writing typemaps, it is easy to break all of the output code generated by SWIG. With these risks in mind, this section describes the basics of the SWIG type system and typemap construction. Language specific information is contained in later chapters.
For a function declared like this,

    void func(..., type, ...);

the corresponding wrapper code will look approximately like this:

    wrap_func(args) {
        ...
        ltype argn;                      // Local type for argument
        ...
        argn = ConvertValue(args[n]);
        ...
        func(..., (type) argn, ...);     // Cast back to type
        ...
    }

The relationship between the real C++ datatype and its ltype value is determined by the following rules:

    type                      ltype
    ----------------------    --------------------
    object                    object
    object *                  object *
    const object *            object *
    const object * const      object *
    object &                  object *
    object [10]               object *
    object [10][20]           object (*)[20]

In certain cases, names defined with typedef are also expanded. For example, if you have a type defined by a typedef as follows:

    typedef double Matrix[4][4];

the ltype of Matrix is set to double (*)[4] since there is no way for SWIG to create an assignable variable using variations of the Matrix typename.
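The reason for the `double (*)[4]` ltype can be seen in plain C: an array variable cannot be assigned to, while a pointer to its rows can. The sketch below assumes a hypothetical function `trace` and a hypothetical wrapper body `wrap_trace`; it is not code generated by SWIG.

```c
typedef double Matrix[4][4];

/* Hypothetical C function taking a Matrix. In a parameter list the
   array type is adjusted to double (*)[4] by the C language itself. */
double trace(Matrix m) {
    double sum = 0.0;
    for (int i = 0; i < 4; i++) {
        sum += m[i][i];
    }
    return sum;
}

/* Hypothetical wrapper body. Declaring "Matrix arg1;" would create an
   array that cannot appear on the left side of an assignment, so the
   local uses the assignable ltype double (*)[4] instead. */
double wrap_trace(void *raw) {
    double (*arg1)[4];             /* local variable of the ltype */
    arg1 = (double (*)[4]) raw;    /* legal: pointers are assignable */
    return trace(arg1);
}
```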
It should be stressed that these rules also define the behavior of the SWIG run-time type checker. Specifically, all of the type checking described in Chapter 3 is actually performed using ltype values and not the actual C datatype. This explains why, for instance, there is no difference between pointers, references, and one-dimensional arrays when they are used in the corresponding scripting language module. It also explains why qualifiers never appear in the mangled type-names used for type checking.
    %module example

    %typemap(in) int {
        $1 = (int) PyInt_AsLong($input);
        if (PyErr_Occurred()) return NULL;
        printf("received %d\n", $1);
    }

    int add(int a, int b);

In this case, the typemap defines a rule for handling input arguments in Python. When used in a Python script, you would get the following output:

    >>> a = add(7,13)
    received 7
    received 13

In the typemap specification, the symbols $1 and $input are placeholders for C variable names in the generated wrapper code. In this case, $input is a variable containing the raw Python object supplied as input and $1 is the variable used to hold the converted value (in this case, $1 will be a C local variable of type int). The $1 variable always has the ltype of the type supplied to the %typemap directive.
To support different code-generation tasks, a variety of different typemaps can be defined. For example, the following typemap specifies how to return integers back to Python:

    %typemap(out) int {
        $result = PyInt_FromLong($1);
    }

In this case, $result refers to the Python object being returned to the interpreter and $1 refers to the variable holding the raw integer value being returned.
At first, the use of $1 to refer to the local variable in a typemap may be counterintuitive. However, this notation is used to support an advanced feature of typemaps that allows rules to be written for groups of consecutive types. For example, a typemap can be written as follows:

    %typemap(in) (char *buffer, int len) {
        $1 = PyString_AsString($input);
        $2 = PyString_Size($input);
    }
    ...
    // Some function
    int count(char *buffer, int len, char *pattern);

In this case, the pattern (char *buffer, int len) is handled as a single object. Within the typemap code, $1 and $2 are used to refer to each part of the pattern. For instance, $1 is a variable of type char * and $2 is a variable of type int (for the curious, this naming scheme is roughly taken from yacc). The use of multi-argument maps is an advanced topic and is described a little later.
Compatibility note: Prior to SWIG1.3.10, typemaps used the special variables $source and $target to refer to local variables used during conversion. Unfortunately, the roles of these variables were somewhat inconsistent (and in some places their meaning switched depending on the type of the typemap). In addition, this naming scheme is awkward when extended to multiple arguments. Although $source and $target are still supported for backwards compatibility, their use is deprecated and support may be removed in a future release. In new versions, the local variables corresponding to the types in the typemap specification are referenced using $1, $2, and so forth. Objects in the target language passed as input are referenced by $input. Objects returned to the target language are referenced by $result.
    %typemap(method) type { ... Conversion code ... }
    %typemap(method) type "Conversion code";
    %typemap(method) type %{ ... Conversion code ... %}

method defines a particular conversion method and type is the actual C++ datatype as it appears in the interface file (it is not the ltype value described in the section on the SWIG type system). The code attached to the typemap is supplied in braces, a quoted string, or in a %{ ... %} block after the typemap declaration. If braces are used, they are included in the output, meaning that new local variables can be declared inside the typemap code (the typemap defines a new C block scope). Otherwise, the code is inlined into the generated wrappers with no surrounding braces.
Since typemap conversion code is almost always dependent on the target language, it is fairly common to surround typemap specifications with conditional compilation--especially if a module is being designed for use with multiple target languages. For example:
    #ifdef SWIGPYTHON
    %typemap(in) int {
        $1 = (int) PyInt_AsLong($input);
    }
    ...
    #endif
    #ifdef SWIGPERL5
    %typemap(in) int {
        $1 = (int) SvIV($input);
    }
    #endif
A single typemap rule can be applied to a list of matching datatypes by using a comma-separated list. For example:

    %typemap(in) int, short, long, signed char {
        $1 = ($1_ltype) PyInt_AsLong($input);
        if (PyErr_Occurred()) return NULL;
        printf("received %d\n", $1);
    }

Here, $1_ltype is expanded into the local datatype used during code generation (this is the assignable version of the type described in the SWIG type system section). This form of specifying a typemap is a useful way to reduce the amount of typing required when the same typemap code might apply to a whole set of similar datatypes. Also, note that this syntax is not the same as a multi-argument typemap.
Typemaps may also be attached to specific declarator names as in:

    %typemap(in) char **argv {
        ... Turn an array into a char ** ...
    }

A "named" typemap only applies to declarations that exactly match both the C datatype and the name. Thus the char **argv typemap will only be applied to function arguments that exactly match "char **argv". Although the name is usually the name of a parameter in a function/method declaration, certain typemaps are applied to return values (in which case, the name of the corresponding function or method is used).
Typemaps can also be specified for a sequence of consecutive types by enclosing the types in parentheses. For example:

    %typemap(in) (char *buffer, int len) {
        ...
    }

In this case, the typemap only applies to function/method arguments where the argument pair char *buffer, int len appears.
    %typemap(method) type;             // Deletes this typemap

    %typemap(method) type = srctype;   // Copies a typemap

The second form specifies that the typemap for type should be the same as the srctype typemap. Here is an example:

    %typemap(in) long = int;

    // Copy a typemap with names
    %typemap(in) int_64 *output = long *output;

    // Copy a multi-argument typemap
    %typemap(in) (char *data, int size) = (char *buffer, int len);
    typedef int Integer;

    %typemap(in) int *x       { ... typemap 1 }
    %typemap(in) int *        { ... typemap 2 }
    %typemap(in) Integer *x   { ... typemap 3 }

    void A(int *x);        // int *x rule                  (typemap 1)
    void B(int *y);        // int * rule                   (typemap 2)
    void C(Integer *);     // int * rule (via typedef)     (typemap 2)
    void D(Integer *x);    // Integer *x rule              (typemap 3)
    void E(const int *);   // int * rule (const stripped)  (typemap 2)
When multi-argument typemaps are specified, they take precedence over any typemaps specified for a single type. For example:
    %typemap(in) (char *buffer, int len) {
        // typemap 1
    }

    %typemap(in) char *buffer {
        // typemap 2
    }

    void foo(char *buffer, int len, int count);   // (char *buffer, int len)
    void bar(char *buffer, int blah);             // char *buffer
Compatibility note: SWIG1.1 applied a complex set of type-matching rules in which a typemap for int * would also match many different variations including int &, int [], and qualified variations. This feature is revoked in SWIG1.3. Typemaps must now exactly match the types and names used in the interface file.
Compatibility note: Starting in SWIG1.3, typemap matching follows typedef declarations if possible (as shown in the above example). This type of matching is only performed in one direction. For example, if you had typedef int Integer and then defined a typemap for Integer, that typemap would not be applied to the int datatype. Earlier versions of SWIG did not follow typedef declarations when matching typemaps. This feature has primarily been added to assist language modules that rely heavily on typemaps (e.g., a typemap for "int" now defines the default for integers regardless of what kind of typedef name is being used to actually refer to an integer in the source program).
Each typemap is applied in a very specific location within the wrapper functions generated by SWIG. Specifically, the general form of a wrapper function is as follows:

    _wrap_foo() {
        /* Marshal input values to C */
        [ arginit typemaps ]
        [ ignore typemaps ]
        [ default typemaps ]
        [ in typemaps ]
        [ check typemaps ]

        /* Call the actual C function */
        call foo

        /* Return result to target language */
        [ out typemap ]
        [ argout typemaps ]
        [ freearg typemaps ]
        [ ret typemap ]
    }

It is often useful to write a simple interface file with some typemaps and to take a look at the generated wrapper code. To illustrate, consider the following interface file (with typemaps):

    %module example

    %typemap(in) int {
        // "in" typemap. $1, $input
    }
    %typemap(out) int {
        // "out" typemap. $1, $result
    }
    %typemap(ignore) int ignored {
        // "ignore" typemap. $1
    }
    %typemap(check) int {
        // "check" typemap. $1
    }
    %typemap(arginit) int *out {
        // "arginit" typemap. $1
    }
    %typemap(argout) int *out {
        // "argout" typemap. $1, $result
    }
    %typemap(ret) int {
        // "ret" typemap. $1
    }
    %typemap(freearg) int *out {
        // "freearg" typemap. $1
    }

    int foo(int, int ignored, int *out);

Now, let's take a look at the resulting wrapper function (generated for Python, but it looks similar for other languages):

    static PyObject *_wrap_foo(PyObject *self, PyObject *args) {
        PyObject *resultobj;
        int arg0 ;
        int arg1 ;
        int *arg2 ;
        PyObject * obj0  = 0 ;
        PyObject * argo2 =0 ;
        int result ;

        {
            // "arginit" typemap. arg2
        }
        {
            // "ignore" typemap. arg1
        }
        if(!PyArg_ParseTuple(args,(char *)"OO:foo",&obj0,&argo2)) return NULL;
        {
            // "in" typemap. arg0, obj0
        }
        if ((SWIG_ConvertPtr(argo2,(void **) &arg2,SWIGTYPE_p_int,1)) == -1) return NULL;
        {
            // "check" typemap. arg0
        }
        {
            // "check" typemap. arg1
        }
        result = (int )foo(arg0,arg1,arg2);
        {
            // "out" typemap. result, resultobj
        }
        {
            // "argout" typemap. arg2, resultobj
        }
        {
            // "freearg" typemap. arg2
        }
        {
            // "ret" typemap. result
        }
        return resultobj;
    }

In this example, you can see how different typemaps are used for different purposes. For example, the "arginit" and "ignore" typemaps appear first, and are used to initialize variables before anything else occurs. The "in" typemap is used to help convert arguments from the target language to C. The "check" typemap appears just prior to the actual function call and is used to validate arguments. After the function has been called, the "out" and "argout" typemaps are used to create output values. Typically, "argout" appends its result to any result already set by the "out" typemap. The last two typemaps, "freearg" and "ret", are used to perform cleanup actions.
Within each typemap, the $1 is always replaced by a C local variable corresponding to that type. For example, you can see how $1 is replaced by arg0, arg1, and arg2 depending on the argument in question. For typemaps related to returning a result, $1 is set to the local variable holding the raw result of the function call (in this case, the variable result).
    %typemap(in) int {
        int temp;
        temp = (int) SvIV($input);
        $1 = temp;
    }

In this case, the local variable temp only exists inside the typemap code itself. It does not affect other variables in the wrapper function and it does not matter whether or not other typemaps happen to use the same variable name. Note that if you specify typemap code using a string or a %{ ... %} block, the typemap code is not enclosed in braces like this.

    %typemap(in) int *INPUT(int temp) {
        temp = (int) PyInt_AsLong($input);
        $1 = &temp;
    }

In this case, temp becomes a local variable in the scope of the entire wrapper function. For example:

    wrap_foo() {
        int temp;         <--- Declaration of temp goes here
        ...

        /* Typemap code */
        {
            temp = (int) PyInt_AsLong(...);
            ...
        }
        ...
    }

When you set temp to a value, it persists for the duration of the wrapper function and gets cleaned up automatically on exit. This is particularly useful when a typemap needs to create a temporary value, but doesn't want to rely on heap allocation.
It is perfectly safe to use more than one typemap involving local variables in the same declaration. For example, you could declare a function as:

    void foo(int *INPUT, int *INPUT, int *INPUT);

This is safely handled because SWIG actually renames all local variable references by appending an argument number suffix. Therefore, the generated code would actually look like this:

    wrap_foo() {
        int *arg1;    /* Actual arguments */
        int *arg2;
        int *arg3;
        int temp1;    /* Locals declared in the INPUT typemap */
        int temp2;
        int temp3;
        ...
        {
            temp1 = (int) PyInt_AsLong(...);
            arg1 = &temp1;
        }
        {
            temp2 = (int) PyInt_AsLong(...);
            arg2 = &temp2;
        }
        {
            temp3 = (int) PyInt_AsLong(...);
            arg3 = &temp3;
        }
        ...
    }
Some typemaps do not recognize local variables (or they may simply not apply). At this time, only the "in", "argout", "default", and "ignore" typemaps support local variables (typemaps that apply to conversion of arguments).
It is also important to note that the primary use of local variables is to create stack-allocated objects for temporary use inside a wrapper function (this is faster and less prone to error than allocating data on the heap). In general, the variables are not intended to pass information between different types of typemaps. However, this can be done if you realize that local names have the argument number appended to them. For example, you could do this:

    %typemap(in) int *(int temp) {
        temp = (int) PyInt_AsLong($input);
        $1 = &temp;
    }

    %typemap(argout) int * {
        PyObject *o = PyInt_FromLong(temp$argnum);
        ...
    }

In this case, the $argnum variable is expanded into the argument number. Therefore, the code will reference the appropriate local such as temp1 and temp2. It should be noted that there are plenty of opportunities to break the universe here and that accessing locals in this manner should probably be avoided. At the very least, you should make sure that the typemaps sharing information have exactly the same types and names.
Variable | Meaning |
---|---|
$n | The C local variable corresponding to type n in the typemap declaration. |
$input | The input object for an argument in the target language. This is only available in typemaps related to argument conversion. |
$result | The output object being returned by a wrapper function. |
$argnum | Argument number. Only available in typemaps related to argument conversion |
$n_name | Argument name |
$n_type | Real C datatype of type n. |
$n_ltype | ltype of type n |
$n_mangle | Mangled form of type n. For example _p_Foo |
$n_descriptor | Type descriptor structure for type n. For example SWIGTYPE_p_Foo. This is primarily used when interacting with the run-time type checker (described later). |
$*n_type | Real C datatype of type n with one pointer removed. |
$*n_ltype | ltype of type n with one pointer removed. |
$*n_mangle | Mangled form of type n with one pointer removed. |
$*n_descriptor | Type descriptor structure for type n with one pointer removed. |
$&n_type | Real C datatype of type n with one pointer added. |
$&n_ltype | ltype of type n with one pointer added. |
$&n_mangle | Mangled form of type n with one pointer added. |
$&n_descriptor | Type descriptor structure for type n with one pointer added. |
$n_basetype | Base typename with all pointers and qualifiers stripped. |
Within the table, $n refers to a specific type within the typemap specification. For example, if you write this

    %typemap(in) int *INPUT {

    }

then $1 refers to int *INPUT. If you have a typemap like this,

    %typemap(in) (int argc, char *argv[]) {
        ...
    }

then $1 refers to int argc and $2 refers to char *argv[].
Substitutions related to types and names always fill in values from the actual code that was matched. This is useful when a typemap might match multiple C datatypes. For example:

    %typemap(in) int, short, long {
        $1 = ($1_ltype) PyInt_AsLong($input);
    }

In this case, $1_ltype is replaced with the datatype that is actually matched.
Variables such as $&1_type and $*1_type are used to safely modify the type by removing or adding pointers. Although not needed in most typemaps, these substitutions are sometimes needed to properly work with typemaps that convert values between pointers and values.
If necessary, type related substitutions can also be used when declaring locals. For example:
    %typemap(in) int * ($*1_type temp) {
        temp = PyInt_AsLong($input);
        $1 = &temp;
    }
There is one word of caution about declaring local variables in this manner. If you declare a local variable using a type substitution such as $1_ltype temp, it won't work like you expect for arrays and certain kinds of pointers. For example, if you wrote this,

    %typemap(in) int [10][20] {
        $1_ltype temp;
    }

then the declaration of temp will be expanded as

    int (*)[20] temp;

This is illegal C syntax and won't compile. There is currently no straightforward way to work around this problem in SWIG due to the way that typemap code is expanded and processed. However, one possible workaround is to simply pick an alternative type such as void * and use casts to get the correct type when needed. For example:

    %typemap(in) int [10][20] {
        void *temp;
        ...
        (($1_ltype) temp)[i][j] = x;    /* set a value */
        ...
    }

Another approach, which only works for arrays, is to use the $1_basetype substitution. For example:

    %typemap(in) int [10][20] {
        $1_basetype temp[10][20];
        ...
        temp[i][j] = x;    /* set a value */
        ...
    }
    %typemap(in,name="value",name="value",...) ...

A common usage of this feature is to alter the behavior of documentation strings or usage information. For example:

    %typemap(in,doc="4-tuple") int [4] {
        ...
    }

The exact set of recognized parameters depends on the target language. Further details are covered in the documentation for each language module.
For example, suppose you had a function like this:

    void set_vector(int type, float value[4]);

If you wanted to handle float value[4] as a list of floats, you might write a typemap similar to this:

    %typemap(in) float value[4] (float temp[4]) {
        int i;
        if (!PySequence_Check($input)) {
            PyErr_SetString(PyExc_ValueError,"Expected a sequence");
            return NULL;
        }
        if (PySequence_Length($input) != 4) {
            PyErr_SetString(PyExc_ValueError,"Size mismatch. Expected 4 elements");
            return NULL;
        }
        for (i = 0; i < 4; i++) {
            /* PySequence_GetItem returns a new reference that must be released */
            PyObject *o = PySequence_GetItem($input,i);
            if (PyNumber_Check(o)) {
                temp[i] = (float) PyFloat_AsDouble(o);
                Py_DECREF(o);
            } else {
                Py_XDECREF(o);
                PyErr_SetString(PyExc_ValueError,"Sequence elements must be numbers");
                return NULL;
            }
        }
        $1 = temp;
    }

In this example, the variable temp allocates a small array on the C stack. The typemap then populates this array and passes it to the underlying C function.

When used from Python, the typemap allows the following type of function call:

    >>> set_vector(type, [ 1, 2.5, 5, 20 ])
If you wanted to generalize the typemap to apply to arrays of any size, you might write this:

    %typemap(in) float value[ANY] (float temp[$1_dim0]) {
        int i;
        if (!PySequence_Check($input)) {
            PyErr_SetString(PyExc_ValueError,"Expected a sequence");
            return NULL;
        }
        if (PySequence_Length($input) != $1_dim0) {
            PyErr_SetString(PyExc_ValueError,"Size mismatch. Expected $1_dim0 elements");
            return NULL;
        }
        for (i = 0; i < $1_dim0; i++) {
            PyObject *o = PySequence_GetItem($input,i);   /* new reference */
            if (PyNumber_Check(o)) {
                temp[i] = (float) PyFloat_AsDouble(o);
                Py_DECREF(o);
            } else {
                Py_XDECREF(o);
                PyErr_SetString(PyExc_ValueError,"Sequence elements must be numbers");
                return NULL;
            }
        }
        $1 = temp;
    }

In this example, the special variable $1_dim0 is expanded with the actual array dimensions. Multidimensional arrays can be matched in a similar manner. For example:

    %typemap(python,in) float matrix[ANY][ANY] (float temp[$1_dim0][$1_dim1]) {
        ... convert a 2d array ...
    }

For large arrays, it may be impractical to allocate storage on the stack using a temporary variable as shown. To work with heap-allocated data, the following technique can be used.

    %typemap(in) float value[ANY] {
        int i;
        if (!PySequence_Check($input)) {
            PyErr_SetString(PyExc_ValueError,"Expected a sequence");
            return NULL;
        }
        if (PySequence_Length($input) != $1_dim0) {
            PyErr_SetString(PyExc_ValueError,"Size mismatch. Expected $1_dim0 elements");
            return NULL;
        }
        $1 = (float *) malloc($1_dim0*sizeof(float));
        for (i = 0; i < $1_dim0; i++) {
            PyObject *o = PySequence_GetItem($input,i);   /* new reference */
            if (PyNumber_Check(o)) {
                $1[i] = (float) PyFloat_AsDouble(o);
                Py_DECREF(o);
            } else {
                Py_XDECREF(o);
                free($1);    /* release the array before bailing out */
                PyErr_SetString(PyExc_ValueError,"Sequence elements must be numbers");
                return NULL;
            }
        }
    }
    %typemap(freearg) float value[ANY] {
        if ($1) free($1);
    }

In this case, an array is allocated using malloc. The freearg typemap is then used to release the argument after the function has been called.
Another common use of array typemaps is to provide support for array structure members. Due to subtle differences between pointers and arrays in C, you can't just "assign" to an array structure member. Instead, you have to explicitly copy elements into the array. For example, suppose you had a structure like this:

    struct SomeObject {
      float value[4];
      ...
    };

When SWIG runs, it won't produce any code to set the value member. You may even get a warning message like this:

    swig -python example.i
    Generating wrappers for Python
    example.i:10. Warning. Array member value will be read-only.

These warning messages indicate that SWIG does not know how you want to set the value field.
To fix this, you can supply a special "memberin" typemap like this:
    %typemap(memberin) float [ANY] {
      int i;
      for (i = 0; i < $1_dim0; i++) {
        $1[i] = $input[i];
      }
    }
The memberin typemap is used to set a structure member from data that has already been converted from the target language to C. In this case, $input is the local variable in which converted input data is stored. This typemap then copies this data into the structure.
Together, the earlier "in" typemap for arrays and this "memberin" typemap allow the following usage:

    >>> s = SomeObject()
    >>> s.value = [1, 2.5, 5, 10]
Related to structure member input, it may be desirable to return structure members as a new kind of object. In the example so far, the structure member can be set nicely, but reading it back simply returns a pointer:

    >>> s = SomeObject()
    >>> s.value = [1, 2.5, 5, 10]
    >>> print s.value
    _1008fea8_p_float
    >>>

To fix this, you can write an "out" typemap. For example:

    %typemap(out) float [ANY] {
      int i;
      $result = PyList_New($1_dim0);
      for (i = 0; i < $1_dim0; i++) {
        PyObject *o = PyFloat_FromDouble((double) $1[i]);
        PyList_SetItem($result,i,o);
      }
    }

Now, you will find that member access is quite nice:

    >>> s = SomeObject()
    >>> s.value = [1, 2.5, 5, 10]
    >>> print s.value
    [1.0, 2.5, 5.0, 10.0]

Compatibility Note: SWIG1.1 used to provide a special "memberout" typemap. However, it was mostly useless and has since been eliminated. To return structure members, simply use the "out" typemap.
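The element-by-element copy done by "memberin" and the list construction done by "out" can be imitated with ctypes. This is a sketch for illustration only: the helper names member_in and member_out are hypothetical, and the structure is reduced to the single array member from the example.

```python
import ctypes

class SomeObject(ctypes.Structure):
    # Mirrors "struct SomeObject { float value[4]; ... }" with other members omitted.
    _fields_ = [("value", ctypes.c_float * 4)]

def member_in(struct, data):
    """Model of "memberin": copy each element into the array member."""
    for i in range(len(struct.value)):
        struct.value[i] = data[i]

def member_out(struct):
    """Model of "out": build a new Python list from the C array."""
    return [struct.value[i] for i in range(len(struct.value))]

s = SomeObject()
member_in(s, [1, 2.5, 5, 10])
values = member_out(s)
```

The array member is never assigned as a whole. It is filled one element at a time and read back into a fresh list, which is the same division of labor as in the two typemaps.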
    %module math

    %typemap(check) double posdouble {
      if ($1 < 0) {
        croak("Expecting a positive number");
      }
    }
    ...
    double sqrt(double posdouble);

This provides a sanity check to your wrapper function. If a negative number is passed to this function, a Perl exception will be raised and your program terminated with an error message.
This kind of checking can be particularly useful when working with pointers. For example:

    %typemap(check) Vector * {
      if ($1 == 0) {
        PyErr_SetString(PyExc_TypeError,"NULL Pointer not allowed");
        return NULL;
      }
    }

This typemap prevents any function involving a Vector * from accepting a NULL pointer. As a result, SWIG can often prevent potential segmentation faults or other run-time problems by raising an exception rather than blindly passing values to the underlying C/C++ program.
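The effect of a "check" typemap is that validation runs after the argument has been converted and before the underlying function is called. A minimal Python analogy, in which the names checked and require_nonnegative are hypothetical and a decorator plays the role of the typemap, might look like this:

```python
import math

def checked(check):
    """Wrap a one-argument function so that `check` runs before the call."""
    def decorate(func):
        def wrapper(value):
            check(value)              # plays the role of the "check" typemap
            return func(value)        # the underlying function is reached only if the check passes
        return wrapper
    return decorate

def require_nonnegative(value):
    if value < 0:
        raise ValueError("Expecting a positive number")

@checked(require_nonnegative)
def sqrt(posdouble):
    return math.sqrt(posdouble)
```

Here a negative argument raises the checker's own exception, so the wrapped function never sees the invalid value.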
Note: A more advanced constraint checking system is in development. Stay tuned.
Suppose that you wanted to wrap this function:

    int foo(int argc, char *argv[]);

so that it accepted a single list of strings like this:

    >>> foo(["ale","lager","stout"])

To do this, you not only need to map a list of strings to char *argv[], but the value of int argc is implicitly determined by the length of the list. Using only simple typemaps, this type of conversion is possible, but extremely painful. Therefore, SWIG1.3 introduces the notion of multi-argument typemaps.
A multi-argument typemap is a conversion rule that specifies how to convert a single object in the target language to a set of consecutive function arguments in C/C++. For example, the following multi-argument maps perform the conversion described for the above example:

    %typemap(in) (int argc, char *argv[]) {
      int i;
      if (!PyList_Check($input)) {
        PyErr_SetString(PyExc_ValueError, "Expecting a list");
        return NULL;
      }
      $1 = PyList_Size($input);
      $2 = (char **) malloc(($1+1)*sizeof(char *));
      for (i = 0; i < $1; i++) {
        PyObject *s = PyList_GetItem($input,i);
        if (!PyString_Check(s)) {
          free($2);
          PyErr_SetString(PyExc_ValueError, "List items must be strings");
          return NULL;
        }
        $2[i] = PyString_AsString(s);
      }
      $2[i] = 0;
    }

    %typemap(freearg) (int argc, char *argv[]) {
      if ($2) free($2);
    }

A multi-argument map is always specified by surrounding the arguments with parentheses as shown. For example:

    %typemap(in) (int argc, char *argv[]) { ... }

Within the typemap code, the variables $1, $2, and so forth refer to each type in the map. All of the usual substitutions apply: just use the appropriate $1 or $2 prefix on the variable name (e.g., $2_type, $1_ltype, etc.).
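To make the data flow concrete, the same list-to-(argc, argv) conversion can be sketched with ctypes. This is an illustration rather than generated wrapper code: the function name to_argc_argv is hypothetical, and UTF-8 encoding of the strings is an assumption of the sketch.

```python
import ctypes

def to_argc_argv(items):
    """Model of the (int argc, char *argv[]) typemap using ctypes."""
    if not isinstance(items, list):
        raise ValueError("Expecting a list")
    if not all(isinstance(s, str) for s in items):
        raise ValueError("List items must be strings")
    argc = len(items)                              # corresponds to $1
    # One extra slot holds the terminating NULL, like malloc(($1+1)*sizeof(char *)).
    argv = (ctypes.c_char_p * (argc + 1))()        # corresponds to $2
    for i, s in enumerate(items):
        argv[i] = s.encode("utf-8")
    argv[argc] = None
    return argc, argv

argc, argv = to_argc_argv(["ale", "lager", "stout"])
```

A single Python object yields both C arguments, which is the essential property of a multi-argument typemap.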
Multi-argument typemaps always have precedence over simple typemaps and SWIG always performs longest-match searching. Therefore, you will get the following behavior:

    %typemap(in) int argc                              { ... typemap 1 ... }
    %typemap(in) (int argc, char *argv[])              { ... typemap 2 ... }
    %typemap(in) (int argc, char *argv[], char *env[]) { ... typemap 3 ... }

    int foo(int argc, char *argv[]);                   // Uses typemap 2
    int bar(int argc, int x);                          // Uses typemap 1
    int spam(int argc, char *argv[], char *env[]);     // Uses typemap 3

It should be stressed that multi-argument typemaps can appear anywhere in a function declaration and can appear more than once. For example, you could write this:

    %typemap(in) (int scount, char *swords[]) { ... }
    %typemap(in) (int wcount, char *words[]) { ... }

    void search_words(int scount, char *swords[], int wcount, char *words[], int maxcount);

Other directives such as %apply and %clear also work with multi-argument maps. For example:

    %apply (int argc, char *argv[]) {
        (int scount, char *swords[]),
        (int wcount, char *words[])
    };
    ...
    %clear (int scount, char *swords[]), (int wcount, char *words[]);
    ...

Although multi-argument typemaps may seem like an exotic, little-used feature, there are several situations where they make sense. First, suppose you wanted to wrap functions similar to the low-level read() and write() system calls. For example:

    typedef unsigned int size_t;

    int read(int fd, void *rbuffer, size_t len);
    int write(int fd, void *wbuffer, size_t len);

As is, the only way to use the functions would be to allocate memory and pass some kind of pointer as the second argument, a process that might require the use of a helper function. However, using multi-argument maps, the functions can be transformed into something more natural. For example, you might write typemaps like this:

    // typemap for an outgoing buffer
    %typemap(in) (void *wbuffer, size_t len) {
      if (!PyString_Check($input)) {
        PyErr_SetString(PyExc_ValueError, "Expecting a string");
        return NULL;
      }
      $1 = (void *) PyString_AsString($input);
      $2 = PyString_Size($input);
    }

    // typemap for an incoming buffer
    %typemap(in) (void *rbuffer, size_t len) {
      long n;
      if (!PyInt_Check($input)) {
        PyErr_SetString(PyExc_ValueError, "Expecting an integer");
        return NULL;
      }
      n = PyInt_AsLong($input);
      if (n < 0) {          /* test the signed value; size_t is unsigned */
        PyErr_SetString(PyExc_ValueError, "Positive integer expected");
        return NULL;
      }
      $2 = (size_t) n;
      $1 = (void *) malloc($2);
    }

    // Return the buffer, discarding any previous return result
    %typemap(argout) (void *rbuffer, size_t len) {
      Py_XDECREF($result);   /* Blow away any previous result */
      if (result < 0) {      /* Check for I/O error */
        free($1);
        PyErr_SetFromErrno(PyExc_IOError);
        return NULL;
      }
      $result = PyString_FromStringAndSize($1,result);
      free($1);
    }

Note: In the above example, $result and result are two different variables. result is the real C value that was returned by the function. $result is the scripting language object being returned to the interpreter.
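The way the incoming-buffer "in" and "argout" typemaps cooperate can be modeled in Python. In this sketch, c_read is a hypothetical stand-in for the C read() function, and wrapped_read imitates what the generated wrapper does around it. Neither name comes from SWIG.

```python
import io

def c_read(stream, buf, length):
    """Stand-in for the C read(): fills buf and returns the number of bytes read."""
    data = stream.read(length)
    buf[:len(data)] = data
    return len(data)

def wrapped_read(stream, length):
    # "in" typemap: validate the requested length and allocate the buffer.
    if not isinstance(length, int):
        raise ValueError("Expecting an integer")
    if length < 0:
        raise ValueError("Positive integer expected")
    buf = bytearray(length)
    result = c_read(stream, buf, length)      # the wrapped C call
    # "argout" typemap: discard the integer result and return the bytes read.
    if result < 0:
        raise IOError("read failed")
    return bytes(buf[:result])

stream = io.BytesIO(b"Hello world\n")
first = wrapped_read(stream, 5)
rest = wrapped_read(stream, 40)
```

The caller supplies only a length and receives the data. The buffer pointer and the integer return value of the C function never appear on the scripting side.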
Now, in a script, you can write code that simply passes buffers as strings like this:

    >>> f = example.open("Makefile")
    >>> example.read(f,40)
    'TOP = ../..\nSWIG = $(TOP)/.'
    >>> example.read(f,40)
    './swig\nSRCS = example.c\nTARGET '
    >>> example.close(f)
    0
    >>> g = example.open("foo", example.O_WRONLY | example.O_CREAT, 0644)
    >>> example.write(g,"Hello world\n")
    12
    >>> example.write(g,"This is a test\n")
    15
    >>> example.close(g)
    0
    >>>

A number of multi-argument typemap problems also arise in libraries that perform matrix calculations, especially if they are mapped onto low-level Fortran or C code. For example, you might have a function like this:

    int is_symmetric(double *mat, int rows, int columns);

In this case, you might want to pass some kind of higher-level object as a matrix. To do this, you could write a multi-argument typemap like this:

    %typemap(in) (double *mat, int rows, int columns) {
      MatrixObject *a;
      a = GetMatrixFromObject($input);     /* Get matrix somehow */

      /* Get matrix properties */
      $1 = GetPointer(a);
      $2 = GetRows(a);
      $3 = GetColumns(a);
    }

This kind of technique can be used to hook into scripting-language matrix packages such as Numeric Python. However, it should also be stressed that some care is in order. For example, when crossing languages you may need to worry about issues such as row-major vs. column-major ordering (and perform conversions if needed).
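The ordering issue is easy to see on a flat buffer. The sketch below assumes that the matrix arrives as a flat, row-major list. The helper to_column_major is hypothetical, and is_symmetric is a Python model of the C function, not its actual implementation.

```python
def to_column_major(flat, rows, columns):
    """Reorder a flat row-major buffer into column-major order."""
    return [flat[r * columns + c] for c in range(columns) for r in range(rows)]

def is_symmetric(mat, rows, columns):
    """Model of the C function, reading mat as a flat row-major buffer."""
    if rows != columns:
        return False
    return all(mat[r * columns + c] == mat[c * columns + r]
               for r in range(rows) for c in range(r + 1, columns))

m = [1.0, 2.0, 3.0,
     4.0, 5.0, 6.0]                      # a 2 x 3 matrix, row-major
reordered = to_column_major(m, 2, 3)     # same matrix as column-major storage expects it
```

The same six numbers describe a different matrix if the receiving library assumes the other ordering, which is why a typemap may need to reorder data on the way in.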
    %typemap(in) int   "convert an int";
    %typemap(in) short "convert a short";
    %typemap(in) float "convert a float";
    ...

Since typemap matching follows all typedef declarations, any sort of type that is mapped to a primitive type through typedef will be picked up by one of these primitive typemaps.
The default behavior for pointers, arrays, references, and other kinds of types is handled by specifying rules for variations of the reserved SWIGTYPE type. For example:

    %typemap(in) SWIGTYPE *            { ... default pointer handling ... }
    %typemap(in) SWIGTYPE &            { ... default reference handling ... }
    %typemap(in) SWIGTYPE []           { ... default array handling ... }
    %typemap(in) enum SWIGTYPE         { ... default handling for enum values ... }
    %typemap(in) SWIGTYPE (CLASS::*)   { ... default pointer member handling ... }

These rules match any kind of pointer, reference, or array, even when multiple levels of indirection or multiple array dimensions are used. Therefore, if you wanted to change SWIG's default handling for all types of pointers, you would simply redefine the rule for SWIGTYPE *.
Finally, the following typemap rule is used to match against simple types that don't match any other rules:

    %typemap(in) SWIGTYPE              { ... handle an unknown type ... }

This typemap is important because it is the rule that gets triggered when call or return by value is used. For instance, if you have a declaration like this:

    double dot_product(Vector a, Vector b);

The Vector type will usually just get matched against SWIGTYPE. The default implementation of SWIGTYPE is to convert the value into pointers (as described in chapter 3).
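The fallback order described above (follow typedefs to a primitive rule, otherwise fall back to a SWIGTYPE variation) can be summarized in a few lines of Python. This is a greatly simplified model written for this explanation. SWIG's real matching rules are more elaborate, and the typedef names Integer and Real are made up for the example.

```python
TYPEDEFS = {"Integer": "int", "Real": "float"}        # hypothetical typedef declarations
TYPEMAPS = {"int", "short", "float",
            "SWIGTYPE *", "SWIGTYPE &", "SWIGTYPE []", "SWIGTYPE"}

def find_default_typemap(ctype):
    """Return the name of the rule that would handle `ctype` in this model."""
    ctype = ctype.strip()
    # Follow typedef declarations until a non-typedef name is reached.
    while ctype in TYPEDEFS:
        ctype = TYPEDEFS[ctype]
    if ctype in TYPEMAPS:
        return ctype                  # a primitive rule matched
    if ctype.endswith("*"):
        return "SWIGTYPE *"           # any level of indirection
    if ctype.endswith("&"):
        return "SWIGTYPE &"
    if ctype.endswith("]"):
        return "SWIGTYPE []"          # any number of dimensions
    return "SWIGTYPE"                 # unknown type passed by value
```

In this model, Vector from the dot_product example falls through every specific rule and ends up at the plain SWIGTYPE rule.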
By redefining SWIGTYPE it may be possible to implement other behavior. For example, if you cleared all typemaps for SWIGTYPE, SWIG simply won't wrap any unknown datatype (which might be useful for debugging). Alternatively, you might modify SWIGTYPE to marshal objects into strings instead of converting them to pointers.
The best way to explore the default typemaps is to look at the ones already defined for a particular language module. Typemap definitions are usually found in the SWIG library in a file such as python.swg, tcl8.swg, etc.
    _108e688_p_Foo

At a basic level, the type checker simply restores some type-safety to extension modules. However, the type checker is also responsible for making sure that wrapped C++ classes are handled correctly, especially when inheritance is used. This is especially important when an extension module makes use of multiple inheritance. For example:

    class Foo {
       int x;
    };

    class Bar {
       int y;
    };

    class FooBar : public Foo, public Bar {
       int z;
    };

When the class FooBar is organized in memory, it contains the contents of the classes Foo and Bar as well as its own data members. For example:

    FooBar --> | -----------|  <-- Foo
               |   int x    |
               |------------|  <-- Bar
               |   int y    |
               |------------|
               |   int z    |
               |------------|

Because of the way that base class data is stacked together, the casting of a FooBar * to either of the base classes may change the actual value of the pointer. This means that it is generally not safe to represent pointers using a simple integer or a bare void *: type tags are needed to implement correct handling of pointer values (and to make adjustments when needed).
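The pointer shift can be observed with ctypes by laying out the three parts in the order shown in the diagram. This is only an approximation: ctypes has no notion of C++ inheritance, so the base classes are modeled as nested structures, and a real compiler is free to choose a different layout.

```python
import ctypes

class Foo(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int)]

class Bar(ctypes.Structure):
    _fields_ = [("y", ctypes.c_int)]

class FooBar(ctypes.Structure):
    # Base-class data first, then the class's own member, as in the diagram.
    _fields_ = [("foo", Foo), ("bar", Bar), ("z", ctypes.c_int)]

obj = FooBar()
base = ctypes.addressof(obj)
foo_shift = ctypes.addressof(obj.foo) - base   # the Foo part starts at the object itself
bar_shift = ctypes.addressof(obj.bar) - base   # the Bar part starts after the Foo data
```

In this layout the Foo part shares the object's address while the Bar part sits one int further along, so a FooBar pointer reinterpreted as a Bar pointer without adjustment would refer to the wrong data.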
In the wrapper code generated for each language, pointers are handled through the use of special type descriptors and conversion functions. For example, if you look at the wrapper code for Python, you will see code like this:

    if ((SWIG_ConvertPtr(obj0,(void **) &arg1, SWIGTYPE_p_Foo,1)) == -1) return NULL;

In this code, SWIGTYPE_p_Foo is the type descriptor that describes Foo *. The type descriptor is actually a pointer to a structure that contains information about the type name to use in the target language, a list of equivalent typenames (via typedef or inheritance), and pointer value handling information (if applicable). The SWIG_ConvertPtr() function is simply a utility function that takes a pointer object in the target language and a type-descriptor object and uses this information to generate a C++ pointer. However, the exact name and calling conventions of the conversion function depend on the target language (see language specific chapters for details).
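A toy version of this lookup helps show what the descriptor is for. The class and function below are hypothetical and do not reflect SWIG's actual data structures. They only model the three ingredients named above: a type name, a list of equivalent types, and a pointer adjustment. The offset of 4 is an assumed value for the example.

```python
class TypeDescriptor:
    """Toy stand-in for a SWIG type descriptor (not the real structure)."""
    def __init__(self, name):
        self.name = name            # mangled type name, e.g. "_p_Foo"
        self.casts = {}             # accepted source name -> pointer adjustment

    def allow(self, source_name, adjust=0):
        self.casts[source_name] = adjust

def convert_ptr(obj, descriptor):
    """Decode a string such as "_108e688_p_Foo" into an address, or raise TypeError."""
    if not obj.startswith("_"):
        raise TypeError("not a pointer object")
    hex_part, sep, rest = obj[1:].partition("_")
    type_name = sep + rest
    address = int(hex_part, 16)
    if type_name == descriptor.name:
        return address
    if type_name in descriptor.casts:
        return address + descriptor.casts[type_name]   # adjust for the base-class offset
    raise TypeError("Expected " + descriptor.name)

p_Bar = TypeDescriptor("_p_Bar")
p_Bar.allow("_p_FooBar", adjust=4)   # assumed offset of the Bar part inside FooBar
```

A pointer tagged as FooBar is accepted where a Bar is expected and its value is shifted, while a pointer tagged with an unrelated type is rejected.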
When pointers are converted in a typemap, the typemap code often looks similar to this:

    %typemap(in) Foo * {
      if ((SWIG_ConvertPtr($input, (void **) &$1, $1_descriptor)) == -1) return NULL;
    }

The most critical part of the typemap is the use of the $1_descriptor special variable. When placed in a typemap, this is expanded into the SWIGTYPE_* type descriptor object above. As a general rule, you should always use $1_descriptor instead of trying to hard-code the type descriptor name directly.
There is another reason why you should always use the $1_descriptor variable. When this special variable is expanded, SWIG marks the corresponding type as "in use." When type tables and type information are emitted in the wrapper file, descriptor information is only generated for those datatypes that were actually used in the interface. This greatly reduces the size of the type tables and improves efficiency.
In certain cases, SWIG may not generate type-descriptors like you expect. For example, if you are converting pointers in some non-standard way or working with an unusual combination of interface files and modules, you may find that SWIG omits information for a specific type descriptor. To fix this, you may need to use the %types directive. For example:

    %types(int *, short *, long *, float *, double *);

When %types is used, SWIG generates type-descriptor information even if those datatypes never appear elsewhere in the interface file.
A final problem related to the type-checker is the conversion of types in code that is external to the SWIG wrapper file. This situation is somewhat rare in practice, but occasionally a programmer may want to convert a typed pointer object into a C++ pointer somewhere else in their program. The only problem is that the SWIG type descriptor objects are only defined in the wrapper code and not normally accessible.
To correctly deal with this situation, the following technique can be used:

    /* Some non-SWIG file */
    #include <Python.h>
    #include <assert.h>
    #include <stdlib.h>

    /* External declarations */
    extern void *SWIG_TypeQuery(const char *);
    extern int   SWIG_ConvertPtr(PyObject *, void **ptr, void *descr);

    void foo(PyObject *o) {
      Foo *f;
      static void *descr = 0;
      if (!descr) {
        descr = SWIG_TypeQuery("Foo *");    /* Get the type descriptor structure for Foo * */
        assert(descr);
      }
      if ((SWIG_ConvertPtr(o,(void **) &f, descr) == -1)) {
        abort();
      }
      ...
    }

Further details about the run-time type checking can be found in the documentation for individual language modules. Reading the source code may also help. The file common.swg in the SWIG library contains all of the source code for type-checking. This code is also included in every generated wrapper file, so you can probably just look at the output of SWIG to get a better sense of how types are managed.
    %typemap(ignore) int *OUTPUT (int temp) {
      $1 = &temp;
    }
    %typemap(argout) int *OUTPUT {
      // return value somehow
    }

To make it easier to apply the typemap to different argument types and names, the %apply directive performs a copy of all typemaps from one type to another. For example, if you specify this,

    %apply int *OUTPUT { int *retvalue, int32 *output };

then all of the int *OUTPUT typemaps are copied to int *retvalue and int32 *output.
However, there is a subtle aspect of %apply that needs more description. Namely, %apply does not overwrite a typemap rule if it is already defined for the target datatype. Among other things, this behavior lets you combine several different typemaps on the same datatype through repeated %apply directives. For example:

    %typemap(in) int *INPUT (int temp) {
      temp = ... get value from $input ...;
      $1 = &temp;
    }

    %typemap(check) int *POSITIVE {
      if (*$1 <= 0) {
        SWIG_exception(SWIG_ValueError,"Expected a positive number!\n");
        return NULL;
      }
    }

    ...
    %apply int *INPUT    { int *invalue };
    %apply int *POSITIVE { int *invalue };

Since %apply does not overwrite or replace any existing rules, the only way to reset behavior is to use the %clear directive. %clear removes all typemap rules defined for a specific datatype. For example:

    %clear int *invalue;
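The copy-without-overwrite rule and the effect of %clear can be modeled with a dictionary of rules. The function names below are hypothetical and the "code" stored for each rule is just a label. The sketch only illustrates the bookkeeping described in the text.

```python
typemaps = {}   # datatype -> {method: code}

def define_typemap(method, target, code):
    typemaps.setdefault(target, {})[method] = code

def apply_typemaps(source, *targets):
    """Copy rules from source to each target without overwriting existing ones."""
    for target in targets:
        rules = typemaps.setdefault(target, {})
        for method, code in typemaps.get(source, {}).items():
            rules.setdefault(method, code)     # an existing rule is left untouched

def clear_typemaps(target):
    """Remove every rule defined for target."""
    typemaps.pop(target, None)

define_typemap("in", "int *INPUT", "input conversion")
define_typemap("check", "int *POSITIVE", "positive check")
apply_typemaps("int *INPUT", "int *invalue")
apply_typemaps("int *POSITIVE", "int *invalue")
combined = dict(typemaps["int *invalue"])
```

After the two calls, int *invalue carries both the "in" rule and the "check" rule. Applying another source that also defines "in" would leave the first one in place until the rules are cleared.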
    %typemap(in) float [ANY] {
      int i;
      if (!PySequence_Check($input)) {
        PyErr_SetString(PyExc_ValueError,"Expected a sequence");
        return NULL;
      }
      if (PySequence_Length($input) != $1_dim0) {
        PyErr_SetString(PyExc_ValueError,"Size mismatch. Expected $1_dim0 elements");
        return NULL;
      }
      $1 = (float *) malloc($1_dim0*sizeof(float));
      for (i = 0; i < $1_dim0; i++) {
        PyObject *o = PySequence_GetItem($input,i);
        if (PyNumber_Check(o)) {
          $1[i] = (float) PyFloat_AsDouble(o);
          Py_DECREF(o);
        } else {
          Py_DECREF(o);
          free($1);    /* release the buffer before reporting the error */
          PyErr_SetString(PyExc_ValueError,"Sequence elements must be numbers");
          return NULL;
        }
      }
    }

If you had a large interface with hundreds of functions all accepting array parameters, this typemap would be replicated repeatedly, generating a huge amount of code. A better approach might be to consolidate some of the typemap into a function. For example:

    %{
    /* Define a helper function */
    static float *
    convert_float_array(PyObject *input, int size) {
      int i;
      float *result;
      if (!PySequence_Check(input)) {
        PyErr_SetString(PyExc_ValueError,"Expected a sequence");
        return NULL;
      }
      if (PySequence_Length(input) != size) {
        PyErr_SetString(PyExc_ValueError,"Size mismatch. ");
        return NULL;
      }
      result = (float *) malloc(size*sizeof(float));
      for (i = 0; i < size; i++) {
        PyObject *o = PySequence_GetItem(input,i);
        if (PyNumber_Check(o)) {
          result[i] = (float) PyFloat_AsDouble(o);
          Py_DECREF(o);
        } else {
          Py_DECREF(o);
          PyErr_SetString(PyExc_ValueError,"Sequence elements must be numbers");
          free(result);
          return NULL;
        }
      }
      return result;
    }
    %}

    %typemap(in) float [ANY] {
      $1 = convert_float_array($input, $1_dim0);
      if (!$1) return NULL;
    }