    swig [ options ] filename

    -tcl            Generate Tcl wrappers
    -perl           Generate Perl5 wrappers
    -python         Generate Python wrappers
    -guile          Generate Guile wrappers
    -ruby           Generate Ruby wrappers
    -java           Generate Java wrappers
    -mzscheme       Generate mzscheme wrappers
    -php            Generate PHP wrappers
    -c++            Enable C++ parsing
    -Idir           Add a directory to the file include path
    -lfile          Include a SWIG library file.
    -c              Generate raw wrapper code (omit supporting code)
    -o outfile      Name of output file
    -module name    Set the name of the SWIG module
    -Dsymbol        Define a preprocessor symbol
    -version        Show SWIG version number
    -swiglib        Show location of SWIG library
    -help           Display all options

Additional options are often defined for each target language. A full list can be obtained by typing swig -help or swig -lang -help.
The most common format of a SWIG interface is as follows:

    %module mymodule
    %{
    #include "myheader.h"
    %}
    // Now list ANSI C/C++ declarations
    int foo;
    int bar(int x);
    ...

The name of the module is supplied using the special %module directive (or the -module command line option). This directive must appear at the beginning of the file and is used to name the resulting extension module (in addition, this name often defines a namespace in the target language). If the module name is supplied on the command line, it overrides the name specified with the %module directive.
Everything in the %{ ... %} block is simply copied to the resulting output file. The enclosed text is not parsed or interpreted by SWIG. Although the use of a %{,%} block is optional, most interface files have one to include header files and other supporting C declarations. The %{...%} syntax and semantics in SWIG are analogous to those of the declarations section used in input files to parser generation tools such as yacc or bison.
$ swig -c++ -python -o example_wrap.cpp example.i
The output file created by SWIG normally contains everything that is needed to construct an extension module for the target scripting language. SWIG is not a stub compiler, nor is it usually necessary to edit the output file (and if you look at the output, you probably won't want to). To build the final extension module, the SWIG output file is compiled and linked with the rest of your C/C++ program to create a shared library.
It should also be noted that the SWIG preprocessor skips all text enclosed inside a %{...%} block. In addition, the preprocessor includes a number of macro handling enhancements that make it more powerful than the normal C preprocessor. These extensions are described in the "Preprocessor" section near the end of this chapter.
Since SWIG directives are not legal C syntax, it is generally not possible to include them in header files. However, SWIG directives can be included in C header files using conditional compilation like this:

    /* header.h  --- Some header file */

    /* SWIG directives -- only seen if SWIG is running */
    #ifdef SWIG
    %module foo
    #endif

SWIG is a special preprocessing symbol defined by SWIG when it is parsing an input file.
    /* Non-conventional placement of storage specifier (extern) */
    const int extern Number;

    /* Function declaration with unnecessary grouping */
    int (foo)(int,int);
    ...

In practice, few (if any) C programmers actually write code like this since this style is never featured in programming books. However, if you're feeling particularly obfuscated, you can certainly break SWIG.
    /* Not supported by SWIG */
    int foo::bar(int) {
        ... whatever ...
    }
In the event of a parsing error, conditional compilation can be used to skip offending code. For example:

    #ifndef SWIG
    ... some bad declarations ...
    #endif

Alternatively, you can just delete the offending code from the interface file.
One of the reasons why SWIG does not provide a full C++ parser implementation is that it has been designed to work with incomplete specifications and to be very permissive in its handling of C/C++ datatypes (e.g., SWIG can generate interfaces even when there are missing class declarations or opaque datatypes). Unfortunately, this approach makes it extremely difficult to implement certain parts of a C/C++ parser as most compilers use type information to assist in the parsing of more complex declarations (for the truly curious, the primary complication in the implementation is that the SWIG parser does not utilize a separate typedef-name terminal symbol as described on p. 234 of K&R).
It should also be noted that the SWIG parser was never really developed with the intent that it would be blindly used on raw C/C++ source code. Although parsing has become a lot more powerful in recent versions, the underlying assumption was that one would usually start with a header and enhance it by adding additional support code, cutting certain features out, supplying special SWIG directives, and so forth.
    %module example

    extern double sin(double x);
    extern int strcmp(const char *, const char *);
    extern int Foo;
    #define STATUS 50
    #define VERSION "1.1"

In this file, there are two functions sin() and strcmp(), a global variable Foo, and two constants STATUS and VERSION. When SWIG creates an extension module, these declarations are accessible as scripting language functions, variables, and constants respectively. For example, in Tcl:

    % sin 3
    0.1411200080598672
    % strcmp Dave Mike
    -1
    % puts $Foo
    42
    % puts $STATUS
    50
    % puts $VERSION
    1.1

Or in Python:

    >>> example.sin(3)
    0.1411200080598672
    >>> example.strcmp('Dave','Mike')
    -1
    >>> print example.cvar.Foo
    42
    >>> print example.STATUS
    50
    >>> print example.VERSION
    1.1

Whenever possible, SWIG creates an interface that closely matches the underlying C/C++ code. However, due to subtle differences between languages, run-time environments, and semantics, it is not always possible to do so. The next few sections describe various aspects of this mapping.
Most scripting languages provide a single integer type that is implemented using the int or long datatype in C. The following list shows all of the C datatypes that SWIG will convert to and from integers in the target language:
    int
    short
    long
    unsigned
    signed
    unsigned short
    unsigned long
    unsigned char
    signed char
    bool
When an integral value is converted from C, a cast is used to convert it to the representation in the target language. Thus, a 16 bit short in C may be promoted to a 32 bit integer. When integers are converted in the other direction, the value is cast back into the original C type. If the value is too large to fit, it is silently truncated.
unsigned char and signed char are special cases that are handled as small 8-bit integers. Normally, the char datatype is mapped as a one-character ASCII string.
The bool datatype is cast to and from an integer value of 0 and 1 unless the target language provides a special boolean type.
Some care is required when working with large integer values. Most scripting languages use 32-bit integers so mapping a 64-bit long integer may lead to truncation errors. Similar problems may arise with 32 bit unsigned integers (which may appear as large negative numbers). As a rule of thumb, the int datatype and all variations of char and short datatypes are safe to use. For unsigned int and long datatypes, you will need to carefully check the correct operation of your program after it has been wrapped with SWIG.
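The truncation and sign effects described above can be reproduced without SWIG. The following is a minimal sketch that uses Python's standard ctypes module to model a C cast; the helper name to_c is hypothetical and is not part of SWIG. It assumes fixed-width types (c_int16, c_int32) so that the results do not depend on the platform.

```python
import ctypes

def to_c(value, ctype):
    """Model a C cast: keep only the low-order bits that fit in ctype."""
    # ctypes integer types perform no overflow checking, like a C cast.
    return ctype(value).value

# A value too large for a 16-bit short wraps around silently.
short_result = to_c(70000, ctypes.c_int16)

# A large 32-bit unsigned value viewed as a signed 32-bit integer
# appears as a negative number.
signed_view = to_c(4000000000, ctypes.c_int32)

# A 64-bit value squeezed into 32 bits loses its high-order bits.
truncated = to_c(2**40 + 7, ctypes.c_int32)

print(short_result, signed_view, truncated)
```

None of these conversions raises an error, which is why the text recommends checking a wrapped program carefully whenever unsigned int or long values are involved.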
Although the SWIG parser supports the long long datatype, very few language modules currently support it. This is because long long usually exceeds the precision available in the target language. This limitation may be eliminated in future SWIG releases.
SWIG recognizes the following floating point types:

    float
    double
Floating point numbers are mapped to and from the natural representation of floats in the target language. This is almost always a C double. The rarely used datatype of long double is not supported by SWIG.
The char datatype is mapped into a NULL terminated ASCII string with a single character. When used in a scripting language it shows up as a tiny string containing the character value. When converting the value back into C, SWIG takes a character string from the scripting language and strips off the first character as the char value. Thus if the value "foo" is assigned to a char datatype, it gets the value `f'.
The char * datatype is handled as a NULL-terminated ASCII string. SWIG maps this into a 8-bit character string in the target scripting language. SWIG converts character strings in the target language to NULL terminated strings before passing them into C/C++. It is illegal for these strings to have embedded NULL bytes. Therefore, the char * datatype is not generally suitable for passing binary data (although typemaps can be used to handle this).
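As a rough illustration of the two string rules above, the sketch below models them in plain Python. The function names to_c_char and to_c_string are hypothetical; the second one relies on ctypes.create_string_buffer, whose value attribute stops at the first NUL byte in the same way a C string does.

```python
import ctypes

def to_c_char(text):
    """Model the char conversion: only the first character is kept."""
    return text[0]

def to_c_string(data):
    """Model what C sees for a char *: the bytes up to the first NUL."""
    buf = ctypes.create_string_buffer(data)   # copies data, appends a NUL
    return buf.value                          # reading stops at the first NUL

first = to_c_char("foo")
visible = to_c_string(b"abc\x00def")
print(first, visible)
```

Everything after the embedded NUL byte is invisible to C code that treats the buffer as a string, which is why char * is a poor fit for binary data.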
At this time, SWIG does not provide any special support for Unicode or wide-character strings (the C wchar_t type). This is a delicate topic that is poorly understood by many programmers and not implemented in a consistent manner across languages. For those scripting languages that provide Unicode support, Unicode strings are often available in an 8-bit representation such as UTF-8 that can be mapped to the char * type (in which case the SWIG interface will probably work). If the program you are wrapping uses Unicode, there is no guarantee that Unicode characters in the target language will use the same internal representation (e.g., UCS-2 vs. UCS-4). You may need to write some special conversion functions.
    %module example
    double foo;

results in a scripting language variable like this:

    # Tcl
    set foo 3.5                   ;# Set foo to 3.5
    puts $foo                     ;# Print the value of foo

    # Python
    cvar.foo = 3.5                # Set foo to 3.5
    print cvar.foo                # Print value of foo

    # Perl
    $foo = 3.5;                   # Set foo to 3.5
    print $foo,"\n";              # Print value of foo

    # Ruby
    Module.foo = 3.5              # Set foo to 3.5
    print Module.foo, "\n"        # Print value of foo

Whenever the scripting language variable is used, the underlying C global variable is accessed. Although SWIG makes every attempt to make global variables work like scripting language variables, it is not always possible to do so. For instance, in Python, all global variables must be accessed through a special variable object known as cvar (shown above). In Ruby, variables are accessed as attributes of the module. Other languages may convert variables to a pair of accessor functions. For example, the Java module generates a pair of functions double get_foo() and set_foo(double val) that are used to manipulate the value.
Finally, if a global variable has been declared as const, it only supports read-only access. Note: this behavior is new to SWIG-1.3. Earlier versions of SWIG incorrectly handled const and created constants instead.
    #define I_CONST       5               // An integer constant
    #define PI            3.14159         // A floating-point constant
    #define S_CONST       "hello world"   // A string constant
    #define NEWLINE       '\n'            // Character constant

    enum boolean {NO=0, YES=1};
    enum months {JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG,
                 SEP, OCT, NOV, DEC};
    %constant double BLAH = 42.37;
    #define F_CONST (double) 5            // A floating-point constant with cast
    #define PI_4 PI/4
    #define FLAGS 0x04 | 0x08 | 0x40

In #define declarations, the type of a constant is inferred by syntax. For example, a number with a decimal point is assumed to be floating point. In addition, SWIG must be able to fully resolve all of the symbols used in a #define in order for a constant to actually be created. This restriction is necessary because #define is also used to define preprocessor macros that are definitely not meant to be part of the scripting language interface. For example:

    #define EXTERN extern

    EXTERN void foo();

In this case, you probably don't want to create a constant called EXTERN (what would the value be?). In general, SWIG will not create constants for macros unless the value can be completely determined by the preprocessor. For instance, in the above example, the declaration

    #define PI_4  PI/4

defines a constant because PI was already defined as a constant and the value is known.
The use of constant expressions is allowed, but SWIG does not evaluate them. Rather, it passes them through to the output file and lets the C compiler perform the final evaluation (SWIG does perform a limited form of type-checking however).
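To make the idea of inferring a constant's type "by syntax" concrete, here is a small sketch. It is not SWIG's implementation: the function infer_constant_type and its regular expressions are assumptions made for illustration, and it only handles bare literals (it does not resolve previously defined symbols such as PI in PI/4).

```python
import re

def infer_constant_type(value):
    """Guess the C type of a #define value from its syntax alone."""
    value = value.strip()
    if re.fullmatch(r'"(\\.|[^"\\])*"', value):
        return "string"                      # "hello world"
    if re.fullmatch(r"'(\\.|[^'\\])'", value):
        return "char"                        # '\n'
    if re.fullmatch(r"(0[xX][0-9a-fA-F]+|\d+)", value):
        return "int"                         # 5, 0x04
    if re.fullmatch(r"(\d+\.\d*|\.\d+)([eE][+-]?\d+)?|\d+[eE][+-]?\d+", value):
        return "double"                      # 3.14159
    return None   # not a literal: this sketch creates no constant

for text in ["5", "3.14159", '"hello world"', r"'\n'", "extern"]:
    print(text, "->", infer_constant_type(text))
```

A value such as extern falls through every pattern, which mirrors the rule that a macro whose value cannot be determined does not become a constant.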
For enumerations, it is critical that the original enum definition be included somewhere in the interface file (either in a header file or in the %{,%} block). SWIG only translates the enumeration into code needed to add the constants to a scripting language. It needs the original enumeration declaration in order to get the correct enum values as assigned by the C compiler.
The %constant directive is used to more precisely create constants corresponding to different C datatypes. Although it is usually not needed for simple values, it is more useful when working with pointers and other more complex datatypes. Typically, %constant is only used when you want to add constants to the scripting language interface that are not defined in the original header file.
Starting with SWIG-1.3, all variable declarations, regardless of any use of const, are wrapped as global variables. If a declaration happens to be declared as const, it is wrapped as a read-only variable. To tell if a variable is const or not, you need to look at the right-most occurrence of the const qualifier (that appears before the variable name). If the right-most const occurs after all other type modifiers (such as pointers), then the variable is const. Otherwise, it is not.
Here are some examples of const declarations.

    const char a;           // A constant character
    char const b;           // A constant character (the same)
    char *const c;          // A constant pointer to a character
    const char *const d;    // A constant pointer to a constant character

Here is an example of a declaration that is not const:

    const char *e;          // A pointer to a constant character.  The pointer
                            // may be modified.

In this case, the pointer e can change---it's only the value being pointed to that is read-only.
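The right-most-const rule can be expressed in a few lines. The sketch below is only an illustration of the rule as stated in the text, not SWIG's parser: the function is_read_only is hypothetical and assumes simple declarations of the form "type-words name" with no arrays, initializers, or function pointers.

```python
import re

def is_read_only(declaration):
    """Apply the right-most-const rule to a simple C variable declaration."""
    tokens = re.findall(r"[A-Za-z_]\w*|\*", declaration)
    modifiers = tokens[:-1]                  # drop the variable name
    if "const" not in modifiers:
        return False
    # Position of the right-most const and of the right-most pointer star.
    last_const = len(modifiers) - 1 - modifiers[::-1].index("const")
    last_star = max((i for i, t in enumerate(modifiers) if t == "*"), default=-1)
    return last_const > last_star

for decl in ["const char a", "char const b", "char *const c",
             "const char *const d", "const char *e"]:
    print(decl, "->", is_read_only(decl))
```

Only the last declaration is reported as writable, because its only const appears before the pointer star.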
Compatibility Note: One reason for changing SWIG to handle const declarations as read-only variables is that there are many situations where the value of a const variable might change. For example, a library might export a symbol as const in its public API to discourage modification, but still allow the value to change through some other kind of internal mechanism. Furthermore, programmers often overlook the fact that with a constant declaration like char *const, the underlying data being pointed to can be modified--it's only the pointer itself that is constant. In an embedded system, a const declaration might refer to a read-only memory address such as the location of a memory-mapped I/O device port (where the value changes, but writing to the port is not supported by the hardware). Rather than trying to build a bunch of special cases into the const qualifier, the new interpretation of const as "read-only" is simple and exactly matches the actual semantics of const in C/C++. If you really want to create a constant as in older versions of SWIG, use the %constant directive instead. For example:
    %constant double PI = 3.14159;

or

    #ifdef SWIG
    #define const %constant
    #endif
    const double foo = 3.4;
    const double bar = 23.4;
    const int    spam = 42;
    #ifdef SWIG
    #undef const
    #endif
    ...
    int *
    double ***
    char **

are fully supported by SWIG. SWIG encodes pointers into a representation that contains the actual value of the pointer and a type-tag. Thus, the SWIG representation of the above pointers (in Tcl) might look like this:

    _10081012_p_int
    _1008e124_ppp_double
    _f8ac_pp_char
A NULL pointer is represented by the string "NULL" or the value 0 encoded with type information.
All pointers are treated as opaque objects by SWIG. Thus, a pointer may be returned by a function and passed around to other C functions as needed. For all practical purposes, the scripting language interface works in exactly the same way as you would manipulate the pointer in a C program. The only difference is that there is no mechanism for dereferencing the pointer since this would require the target language to understand the memory layout of the underlying object.
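A type-tagged pointer string of this kind can be modeled as follows. This is a sketch under assumptions, not SWIG's actual encoder: the functions mangle, encode_pointer, and decode_pointer are hypothetical, the address is written as plain hexadecimal, and the tag format is inferred only from the three examples shown earlier.

```python
def mangle(ctype):
    """Turn a C pointer type such as 'double ***' into a tag such as 'ppp_double'."""
    stars = ctype.count("*")
    base = "_".join(ctype.replace("*", " ").split())
    return "p" * stars + "_" + base

def encode_pointer(address, ctype):
    """Build the opaque string handed to the scripting language."""
    if address == 0:
        return "NULL"
    return "_%x_%s" % (address, mangle(ctype))

def decode_pointer(text, expected):
    """Recover the address, rejecting strings that carry the wrong type tag."""
    if text == "NULL":
        return 0
    value, tag = text[1:].split("_", 1)
    # void * is assumed to accept any pointer type.
    if expected.replace(" ", "") != "void*" and tag != mangle(expected):
        raise TypeError("expected a _" + mangle(expected))
    return int(value, 16)

handle = encode_pointer(0x10081012, "int *")
print(handle, hex(decode_pointer(handle, "int *")))
```

The scripting language only stores and passes the string around; the type tag is checked when the string comes back into C, which is what makes the pointer opaque yet type-safe.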
The scripting language representation of a pointer value should never be manipulated directly. Even though the values shown above look like hexadecimal addresses, the numbers used may differ from the actual machine address (e.g., on little-endian machines, the digits may appear in reverse order). Furthermore, SWIG does not normally map pointers into high-level objects such as associative arrays or lists (for example, converting an int * into a list of integers). There are several reasons why SWIG does not do this:
Like C, void * matches any kind of pointer. Furthermore, NULL pointers can be passed to any function that expects to receive a pointer. Although this has the potential to cause a crash, NULL pointers are also sometimes used as sentinel values or to denote a missing/empty value. Therefore, SWIG leaves NULL pointer checking up to the application.
In other words, SWIG manipulates everything else by reference. This model makes sense because most C/C++ programs make heavy use of pointers and SWIG can use the type-checked pointer mechanism already present for handling pointers to basic datatypes.
Although this probably sounds complicated, it's really quite simple. Suppose you have an interface file like this:

    %module fileio
    FILE *fopen(char *, char *);
    int fclose(FILE *);
    unsigned fread(void *ptr, unsigned size, unsigned nobj, FILE *);
    unsigned fwrite(void *ptr, unsigned size, unsigned nobj, FILE *);
    void *malloc(int nbytes);
    void free(void *);

In this file, SWIG doesn't know what a FILE is, but since it's used as a pointer, it doesn't really matter what it is. If you wrapped this module into Python, you can use the functions just like you expect:

    # Copy a file
    def filecopy(source,target):
        f1 = fopen(source,"r")
        f2 = fopen(target,"w")
        buffer = malloc(8192)
        # Read in units of 1 byte so that fread returns a byte count
        nbytes = fread(buffer,1,8192,f1)
        while (nbytes > 0):
            fwrite(buffer,1,nbytes,f2)
            nbytes = fread(buffer,1,8192,f1)
        free(buffer)
        fclose(f1)
        fclose(f2)

In this case f1, f2, and buffer are all opaque objects containing C pointers. It doesn't matter what value they contain--our program works just fine without this knowledge.
void matrix_multiply(Matrix *a, Matrix *b, Matrix *c);
SWIG has no idea what a "Matrix" is. However, it is obviously a pointer to something so SWIG generates a wrapper using its generic pointer handling code.
Unlike C or C++, SWIG does not actually care whether Matrix has been previously defined in the interface file or not. This allows SWIG to generate interfaces from only partial or limited information. In some cases, you may not care what a Matrix really is as long as you can pass an opaque reference to one around in the scripting language interface.
An important detail to mention is that SWIG will gladly generate wrappers for an interface when there are unspecified type names. However, all unspecified types are internally handled as pointers to structures or classes! For example, consider the following declaration:

    void foo(size_t num);

If size_t is undeclared, SWIG generates wrappers that expect to receive a type of size_t * (this mapping is described shortly). As a result, the scripting interface might behave strangely. For example:

    foo(40);
    TypeError: expected a _p_size_t.

The only way to fix this problem is to make sure you properly declare type names using typedef.

    typedef unsigned int size_t;

typedef definitions appearing in a SWIG interface are not propagated to the generated wrapper code. Therefore, they either need to be defined in an included header file or placed in the declarations section like this:

    %{
    /* Include in the generated wrapper file */
    typedef unsigned int size_t;
    %}
    /* Tell SWIG about it */
    typedef unsigned int size_t;

or

    %inline %{
    typedef unsigned int size_t;
    %}

In certain cases, you might be able to include other header files to collect type information. For example:

    %module example
    %import "sys/types.h"

In this case, you might run SWIG as follows:

    $ swig -I/usr/include -includeall example.i

It should be noted that your mileage will vary greatly here. System headers are notoriously complicated and may rely upon a variety of non-standard C coding extensions (e.g., special directives to GCC). Unless you exactly specify the right include directories and preprocessor symbols, this may not work correctly (you will have to experiment).
SWIG tracks typedef declarations and uses this information for run-time type checking. For instance, if you use the above typedef and had the following function declaration:

    void foo(unsigned int *ptr);

The corresponding wrapper function will accept arguments of type unsigned int * or size_t *.
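One way to picture typedef-aware type checking is to reduce every type name to its underlying type before comparing. The sketch below assumes a hypothetical TYPEDEFS table and helper functions resolve and same_type; it illustrates the idea only and says nothing about how SWIG stores this information internally.

```python
# Hypothetical table: typedef name -> underlying type
TYPEDEFS = {"size_t": "unsigned int"}

def resolve(ctype, typedefs=TYPEDEFS):
    """Replace typedef names in a type string until none remain."""
    words = ctype.replace("*", " * ").split()
    changed = True
    while changed:
        changed = False
        for i, word in enumerate(words):
            if word in typedefs:
                words[i:i + 1] = typedefs[word].split()
                changed = True
                break
    return " ".join(words)

def same_type(a, b):
    """Two type names match if they resolve to the same underlying type."""
    return resolve(a) == resolve(b)

print(resolve("size_t *"), same_type("size_t *", "unsigned int *"))
```

With the typedef recorded, size_t * and unsigned int * compare equal, while an unrelated type such as int * still fails the check.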
double dot_product(Vector a, Vector b);
To deal with this, SWIG transforms the function to use pointers by creating a wrapper equivalent to the following:
    double wrap_dot_product(Vector *a, Vector *b) {
        return dot_product(*a,*b);
    }
In the target language, the dot_product() function now accepts pointers to Vectors instead of Vectors. For the most part, this transformation is transparent so you might not notice.
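The same transformation can be modeled with ctypes, which has real pointer and structure types. This is only a sketch: the Vector layout (three doubles named x, y, z) and the Python stand-in for dot_product are assumptions, and wrap_dot_product simply mirrors the shape of the wrapper described above.

```python
import ctypes

class Vector(ctypes.Structure):
    # Assumed layout for illustration: three double-precision members.
    _fields_ = [("x", ctypes.c_double),
                ("y", ctypes.c_double),
                ("z", ctypes.c_double)]

def dot_product(a, b):
    """Stand-in for the C function that takes two Vectors by value."""
    return a.x * b.x + a.y * b.y + a.z * b.z

def wrap_dot_product(a_ptr, b_ptr):
    """Mirror of the wrapper: dereference the pointers, then call."""
    return dot_product(a_ptr.contents, b_ptr.contents)

a = ctypes.pointer(Vector(1.0, 2.0, 3.0))
b = ctypes.pointer(Vector(4.0, 5.0, 6.0))
result = wrap_dot_product(a, b)
print(result)
```

The caller only ever holds pointers; the dereference happens inside the wrapper, which is why the change is mostly invisible from the scripting language.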
    Vector cross_product(Vector v1, Vector v2);

This function wants to return Vector, but SWIG only really supports pointers. As a result, SWIG creates a wrapper like this:

    Vector *wrap_cross_product(Vector *v1, Vector *v2) {
        Vector *result;
        result = (Vector *) malloc(sizeof(Vector));
        *(result) = cross_product(*v1,*v2);
        return result;
    }

or if SWIG was run with the -c++ option:

    Vector *wrap_cross_product(Vector *v1, Vector *v2) {
        Vector *result = new Vector(cross_product(*v1,*v2)); // Uses default copy constructor
        return result;
    }
In both cases, SWIG allocates a new object and returns a reference to it. It is up to the user to delete the returned object when it is no longer in use. Clearly, this will leak memory if you are unaware of the implicit memory allocation and don't take steps to free the result. That said, it should be noted that some language modules can now automatically track newly created objects and reclaim memory for you. Consult the documentation for each language module for more details.
Vector unit_i;
gets mapped to an underlying pair of set/get functions like this:

    Vector *unit_i_get() {
        return &unit_i;
    }
    void unit_i_set(Vector *value) {
        unit_i = *value;
    }
Again some caution is in order. A global variable created in this manner will show up as a pointer in the target scripting language. It would be an extremely bad idea to free or destroy such a pointer. Also, C++ classes must supply a properly defined copy constructor in order for assignment to work correctly.
    char *foo;

SWIG generates the following code:

    /* C mode */
    void foo_set(char *value) {
        if (foo) free(foo);
        foo = (char *) malloc(strlen(value)+1);
        strcpy(foo,value);
    }

    /* C++ mode.  When -c++ option is used */
    void foo_set(char *value) {
        if (foo) delete [] foo;
        foo = new char[strlen(value)+1];
        strcpy(foo,value);
    }

If this is not the behavior that you want, consider making the variable read-only using the %readonly directive. Alternatively, you might write a short assist-function to set the value exactly like you want. For example:

    %inline %{
    void set_foo(char *value) {
        strncpy(foo,value, 50);
    }
    %}

Note: If you write an assist function like this, you will have to call it as a function from the target scripting language (it does not work like a variable). For example, in Python you will have to write:

    >>> set_foo("Hello World")

A common mistake with char * variables is to link to a variable declared like this:

    char *VERSION = "1.0";

In this case, the variable will be readable, but any attempt to change the value results in a segmentation or general protection fault. This is due to the fact that SWIG is trying to release the old value using free or delete when the string literal value currently assigned to the variable wasn't allocated using malloc() or new. To fix this behavior, you can either mark the variable as read-only, write a typemap (as described in Chapter 6), or write a special set function as shown. Another alternative is to declare the variable as an array:

    char VERSION[64] = "1.0";
    int foobar(int a[40]);
    void grok(char *argv[]);
    void transpose(double a[20][20]);

are processed as if they were really declared like this:

    int foobar(int *a);
    void grok(char **argv);
    void transpose(double (*a)[20]);

Like C, SWIG does not perform array bounds checking. It is up to the user to make sure the pointer points to a suitably allocated region of memory.
Multi-dimensional arrays are transformed into a pointer to an array of one less dimension. For example:

    int [10];         // Maps to int *
    int [10][20];     // Maps to int (*)[20]
    int [10][20][30]; // Maps to int (*)[20][30]

It is important to note that in the C type system, a multidimensional array a[][] is NOT equivalent to a single pointer *a or a double pointer such as **a. Instead, a pointer to an array is used (as shown above) where the actual value of the pointer is the starting memory location of the array. The reader is strongly advised to dust off their C book and re-read the section on arrays before using them with SWIG.
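The "one less dimension" rule is mechanical enough to write down. The following sketch applies it to type strings; the function name decay and its regular expression are hypothetical, and it assumes types written as a base name followed by bracketed dimensions, as in the examples above.

```python
import re

def decay(array_type):
    """Map 'int [10][20]' to 'int (*)[20]' by dropping the first dimension."""
    match = re.fullmatch(r"\s*(.+?)\s*((?:\[\d*\])+)\s*", array_type)
    if not match:
        return array_type                     # not an array type
    base = match.group(1)
    dims = re.findall(r"\[\d*\]", match.group(2))
    rest = "".join(dims[1:])                  # dimensions that survive
    return base + " *" if not rest else base + " (*)" + rest

for t in ["int [10]", "int [10][20]", "int [10][20][30]"]:
    print(t, "->", decay(t))
```

Only the first dimension is dropped; the remaining dimensions stay part of the pointed-to type, which is why the result is never a plain double pointer.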
Array variables are supported, but are read-only by default. For example:

    int a[100][200];

In this case, reading the variable 'a' returns a pointer of type int (*)[200] that points to the first element of the array &a[0][0]. Trying to modify 'a' results in an error. This is because SWIG does not know how to copy data from the target language into the array. To work around this limitation, you may want to write a few simple assist functions like this:

    %inline %{
    void a_set(int i, int j, int val) {
        a[i][j] = val;
    }
    int a_get(int i, int j) {
        return a[i][j];
    }
    %}

To dynamically create arrays of various sizes and shapes, it may be useful to write some helper functions in your interface. For example:

    // Some array helpers
    %inline %{
    /* Create any sort of [size] array */
    int *int_array(int size) {
        return (int *) malloc(size*sizeof(int));
    }
    /* Create a two-dimension array [size][10] */
    int (*int_array_10(int size))[10] {
        return (int (*)[10]) malloc(size*10*sizeof(int));
    }
    %}

Arrays of char are handled as a special case by SWIG. In this case, strings in the target language can be stored in the array. For example, if you have a declaration like this,

    char pathname[256];

SWIG generates functions for both getting and setting the value that are equivalent to the following code:

    char *pathname_get() {
        return pathname;
    }
    void pathname_set(char *value) {
        strncpy(pathname,value,256);
    }

In the target language, the value can be set like a normal variable.
    // File : interface.i

    int a;             // Can read/write
    %readonly
    int b,c,d;         // Read only variables
    %readwrite
    double x,y;        // read/write
The %readonly directive enables read-only mode until it is explicitly disabled using the %readwrite directive.
Read-only variables are also created when declarations are declared as const. For example:
    const int foo;                  /* Read only variable */
    char * const version="1.0";     /* Read only variable */
    // interface.i

    %rename(my_print) print;
    extern void print(char *);

    %rename(foo) a_really_long_and_annoying_name;
    extern int a_really_long_and_annoying_name;

SWIG still calls the correct C function, but in this case the function print() will really be called "my_print()" in the target language.
The placement of the %rename directive is arbitrary as long as it appears before the declarations to be renamed. A common technique is to write code for wrapping a header file like this:
    // interface.i

    %rename(my_print) print;
    %rename(foo) a_really_long_and_annoying_name;

    %include "header.h"
%rename applies a renaming operation to all future occurrences of a name. The renaming applies to functions, variables, class and structure names, member functions, and member data. For example, if you had two dozen C++ classes, all with a member function named `print' (which is a keyword in Python), you could rename them all to `output' by specifying:
%rename(output) print; // Rename all `print' functions to `output'
SWIG does not normally perform any checks to see if the functions it wraps are already defined in the target scripting language. However, if you are careful about namespaces and your use of modules, you can usually avoid these problems.
Closely related to %rename is the %ignore directive. %ignore instructs SWIG to ignore declarations that match a given identifier. For example:

    %ignore print;         // Ignore all declarations named print
    %ignore _HAVE_FOO_H;   // Ignore an include guard constant
    ...
    %include "foo.h"       // Grab a header file
    ...

One use of %ignore is to selectively remove certain declarations from a header file without having to add conditional compilation to the header. However, it should be stressed that this only works for simple declarations. If you need to remove a whole section of problematic code, the SWIG preprocessor should be used instead.
More powerful variants of %rename and %ignore directives can be used to help wrap C++ overloaded functions and methods. This is described a little later in the C++ section.
Compatibility note: Older versions of SWIG provided a special %name directive for renaming declarations. For example:

    %name(output) extern void print(char *);

This directive is still supported, but it is deprecated and should probably be avoided. The %rename directive is more powerful and better supports wrapping of raw header file information.
    int plot(double x, double y, int color=WHITE);

In this case, SWIG generates wrapper code where the default arguments are optional in the target language. For example, this function could be used in Tcl as follows:

    % plot -3.4 7.5                 ;# Use default value
    % plot -3.4 7.5 10              ;# Set color to 10 instead

Although the ANSI C standard does not allow default arguments, default arguments specified in a SWIG interface work with both C and C++.
int binary_op(int a, int b, int (*op)(int,int));
When you first wrap something like this into an extension module, you may find the function to be impossible to use. For instance, in Python:

    >>> def add(x,y):
    ...     return x+y
    ...
    >>> binary_op(3,4,add)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: Type error. Expected _p_f_int_int__int
    >>>

The reason for this error is that SWIG doesn't know how to map a scripting language function into a C callback. However, existing C functions can be used as arguments provided you install them as constants. One way to do this is to use the %constant directive like this:

    /* Function with a callback */
    int binary_op(int a, int b, int (*op)(int,int));

    /* Some callback functions */
    %constant int add(int,int);
    %constant int sub(int,int);
    %constant int mul(int,int);

In this case, add, sub, and mul become function pointer constants in the target scripting language. This allows you to use them as follows:

    >>> binary_op(3,4,add)
    7
    >>> binary_op(3,4,mul)
    12
    >>>

Unfortunately, by declaring the callback functions as constants, they are no longer accessible as functions. For example:

    >>> add(3,4)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: object is not callable: '_ff020efc_p_f_int_int__int'
    >>>

If you want to make a function available as both a callback function and a function, you can use the %callback and %nocallback directives like this:

    /* Function with a callback */
    int binary_op(int a, int b, int (*op)(int,int));

    /* Some callback functions */
    %callback("%s_cb")
    int add(int,int);
    int sub(int,int);
    int mul(int,int);
    %nocallback

The argument to %callback is a printf-style format string that specifies the naming convention for the callback constants (%s gets replaced by the function name). The callback mode remains in effect until it is explicitly disabled using %nocallback. When you do this, the interface now works as follows:

    >>> binary_op(3,4,add_cb)
    7
    >>> binary_op(3,4,mul_cb)
    12
    >>> add(3,4)
    7
    >>> mul(3,4)
    12

Notice that when the function is used as a callback, a special name such as add_cb is used instead. To call the function normally, just use the original function name such as add().
SWIG provides a number of extensions to standard C printf formatting that may be useful in this context. For instance, the following variation installs the callbacks as all upper-case constants such as ADD, SUB, and MUL:
    /* Some callback functions */
    %callback("%(upper)s")
    int add(int,int);
    int sub(int,int);
    int mul(int,int);
    %nocallback

A format string of "%(lower)s" converts all characters to lower-case. A string of "%(title)s" capitalizes the first character and converts the rest to lower case.
And now, a final note about function pointer support. Although SWIG does not normally allow callback functions to be written in the target language, this can be accomplished with the use of typemaps and other advanced SWIG features. This is described in a later chapter.
If SWIG encounters the definition of a structure or union, it creates a set of accessor functions. Although SWIG does not need structure definitions to build an interface, providing definitions makes it possible to access structure members. The accessor functions generated by SWIG simply take a pointer to an object and allow access to an individual member. For example, the declaration:
    struct Vector {
        double x,y,z;
    };

gets transformed into the following set of accessor functions:
    double Vector_x_get(struct Vector *obj) {
        return obj->x;
    }
    double Vector_y_get(struct Vector *obj) {
        return obj->y;
    }
    double Vector_z_get(struct Vector *obj) {
        return obj->z;
    }
    void Vector_x_set(struct Vector *obj, double value) {
        obj->x = value;
    }
    void Vector_y_set(struct Vector *obj, double value) {
        obj->y = value;
    }
    void Vector_z_set(struct Vector *obj, double value) {
        obj->z = value;
    }

In addition, SWIG creates default constructor and destructor functions if none are defined in the interface. For example:
    struct Vector *new_Vector() {
        return (struct Vector *) calloc(1,sizeof(struct Vector));
    }
    void delete_Vector(struct Vector *obj) {
        free(obj);
    }

Using these low-level accessor functions, an object can be minimally manipulated from the target language using code like this:
    v = new_Vector()
    Vector_x_set(v,2)
    Vector_y_set(v,10)
    Vector_z_set(v,-5)
    ...
    delete_Vector(v)

However, most of SWIG's language modules also provide a high-level interface that is more convenient. Keep reading.
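To make the low-level accessor scheme concrete, here is a small self-contained sketch of the same functions written out by hand and exercised from C++. It only mirrors the naming convention described above; the check_vector_accessors() helper is hypothetical, and the real wrappers SWIG emits are laid out differently.

```cpp
#include <cassert>
#include <cstdlib>

struct Vector {
    double x, y, z;
};

// Default constructor and destructor functions.
Vector *new_Vector() {
    return static_cast<Vector *>(std::calloc(1, sizeof(Vector)));
}
void delete_Vector(Vector *obj) { std::free(obj); }

// Accessor functions: each takes a pointer to the object and
// reads or writes exactly one member.
double Vector_x_get(Vector *obj) { return obj->x; }
double Vector_y_get(Vector *obj) { return obj->y; }
double Vector_z_get(Vector *obj) { return obj->z; }
void Vector_x_set(Vector *obj, double value) { obj->x = value; }
void Vector_y_set(Vector *obj, double value) { obj->y = value; }
void Vector_z_set(Vector *obj, double value) { obj->z = value; }

// Hypothetical helper: the same sequence of calls a script would make.
void check_vector_accessors() {
    Vector *v = new_Vector();
    Vector_x_set(v, 2);
    Vector_y_set(v, 10);
    Vector_z_set(v, -5);
    assert(Vector_x_get(v) == 2.0);
    assert(Vector_y_get(v) == 10.0);
    assert(Vector_z_get(v) == -5.0);
    delete_Vector(v);
}
```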
    typedef struct {
        double x,y,z;
    } Vector;

When encountered, SWIG assumes that the name of the object is `Vector' and creates accessor functions like before. The only difference is that the use of typedef allows SWIG to drop the struct keyword on its generated code. For example:
    double Vector_x_get(Vector *obj) {
        return obj->x;
    }

If two different names are used like this:
    typedef struct vector_struct {
        double x,y,z;
    } Vector;

the name Vector is used instead of vector_struct since this is more typical C programming style. If declarations defined later in the interface use the type struct vector_struct, SWIG knows that this is the same as Vector and it generates the appropriate type-checking code.
    %module mymodule
    ...
    struct Foo {
        char *name;
        ...
    }

This results in the following accessor functions:
    char *Foo_name_get(Foo *obj) {
        return obj->name;
    }

    char *Foo_name_set(Foo *obj, char *c) {
        if (obj->name) free(obj->name);
        obj->name = (char *) malloc(strlen(c)+1);
        strcpy(obj->name,c);
        return obj->name;
    }
If this behavior differs from what you need in your applications, the SWIG "memberin" typemap can be used to change it. See the typemaps chapter for further details.
Note: If the -c++ option is used, new and delete are used to perform memory allocation.
    interface.i:116. Warning. Array member will be read-only

To eliminate the warning message, typemaps can be used, but this is discussed in a later chapter. In many cases, the warning message is harmless.
If you don't want SWIG to generate constructors and destructors, you can use the %nodefault directive or the -no_default command line option. For example:
    swig -no_default example.i

or
    %module foo
    ...
    %nodefault        // Don't create default constructors/destructors
    ... declarations ...
    %makedefault      // Reenable default constructors/destructors

Compatibility note: Prior to SWIG-1.3.7, SWIG did not generate default constructors or destructors unless you explicitly turned them on using -make_default. However, it appears that most users want to have constructor and destructor functions so it has now been enabled as the default behavior.
    /* file : vector.h */
    ...
    typedef struct {
        double x,y,z;
    } Vector;

You can make a Vector look a lot like a class by writing a SWIG interface like this:
    // file : vector.i
    %module mymodule
    %{
    #include "vector.h"
    %}

    %include vector.h          // Just grab original C header file
    %addmethods Vector {       // Attach these functions to struct Vector
        Vector(double x, double y, double z) {
            Vector *v;
            v = (Vector *) malloc(sizeof(Vector));
            v->x = x;
            v->y = y;
            v->z = z;
            return v;
        }
        ~Vector() {
            free(self);
        }
        double magnitude() {
            return sqrt(self->x*self->x+self->y*self->y+self->z*self->z);
        }
        void print() {
            printf("Vector [%g, %g, %g]\n", self->x,self->y,self->z);
        }
    };

Now, when used with shadow classes in Python, you can do things like this:

    >>> v = Vector(3,4,0)                 # Create a new vector
    >>> print v.magnitude()               # Print magnitude
    5.0
    >>> v.print()                         # Print it out
    Vector [3, 4, 0]
    >>> del v                             # Destroy it
The %addmethods directive can also be used inside the definition of the Vector structure. For example:
    // file : vector.i
    %module mymodule
    %{
    #include "vector.h"
    %}

    typedef struct {
        double x,y,z;
        %addmethods {
            Vector(double x, double y, double z) { ... }
            ~Vector() { ... }
            ...
        }
    } Vector;
Finally, %addmethods can be used to access externally written functions provided they follow the naming convention used in this example :
    /* File : vector.c */
    /* Vector methods */
    #include <math.h>
    #include <stdlib.h>
    #include "vector.h"

    Vector *new_Vector(double x, double y, double z) {
        Vector *v;
        v = (Vector *) malloc(sizeof(Vector));
        v->x = x;
        v->y = y;
        v->z = z;
        return v;
    }
    void delete_Vector(Vector *v) {
        free(v);
    }
    double Vector_magnitude(Vector *v) {
        return sqrt(v->x*v->x+v->y*v->y+v->z*v->z);
    }

    // File : vector.i
    // Interface file
    %module mymodule
    %{
    #include "vector.h"
    %}

    typedef struct {
        double x,y,z;
        %addmethods {
            Vector(double,double,double); // This calls new_Vector()
            ~Vector();                    // This calls delete_Vector()
            double magnitude();           // This will call Vector_magnitude()
            ...
        }
    } Vector;

A little known feature of the %addmethods directive is that it can also be used to add synthesized attributes or to modify the behavior of existing data attributes. For example, suppose you wanted to make magnitude a read-only attribute of Vector instead of a method. To do this, you might write some code like this:
    // Add a new attribute to Vector
    %addmethods Vector {
        const double magnitude;
    }

    // Now supply the implementation of the Vector_magnitude_get function
    %{
    const double Vector_magnitude_get(Vector *v) {
        return (const double) sqrt(v->x*v->x+v->y*v->y+v->z*v->z);
    }
    %}

Now, for all practical purposes, magnitude will appear like an attribute of the object.
A similar technique can also be used to work with problematic data members. For example, consider this interface:
    struct Person {
        char name[50];
        ...
    }

By default, the name attribute is read-only because SWIG does not normally know how to modify arrays. However, you can rewrite the interface as follows to change this:
    struct Person {
        %addmethods {
            char *name;
        }
        ...
    }

    // Specific implementation of set/get functions
    %{
    char *Person_name_get(Person *p) {
        return p->name;
    }
    void Person_name_set(Person *p, char *val) {
        strncpy(p->name,val,49);
        p->name[49] = '\0';     /* strncpy does not terminate on truncation */
    }
    %}

Finally, it should be stressed that even though %addmethods can be used to add new data members, these new members cannot require the allocation of additional storage in the object (e.g., their values must be entirely synthesized from existing attributes of the structure).
    typedef struct Object {
        int objType;
        union {
            int    ivalue;
            double dvalue;
            char  *strvalue;
            void  *ptrvalue;
        } intRep;
    } Object;

When SWIG encounters this, it performs a structure splitting operation that transforms the declaration into the equivalent of the following:
    typedef union {
        int    ivalue;
        double dvalue;
        char  *strvalue;
        void  *ptrvalue;
    } Object_intRep;

    typedef struct Object {
        int objType;
        Object_intRep intRep;
    } Object;

SWIG will then create an Object_intRep structure for use inside the interface file. Accessor functions will be created for both structures. In this case, functions like this would be created:
    Object_intRep *Object_intRep_get(Object *o) {
        return (Object_intRep *) &o->intRep;
    }
    int Object_intRep_ivalue_get(Object_intRep *o) {
        return o->ivalue;
    }
    int Object_intRep_ivalue_set(Object_intRep *o, int value) {
        return (o->ivalue = value);
    }
    double Object_intRep_dvalue_get(Object_intRep *o) {
        return o->dvalue;
    }
    ... etc ...

Although this process is a little hairy, it works like you would expect in the target scripting language--especially when shadow classes are used. For instance, in Perl:
    # Perl5 script for accessing nested member
    $o = CreateObject();                    # Create an object somehow
    $o->{intRep}->{ivalue} = 7;             # Change value of o.intRep.ivalue
If you have a lot of nested structure declarations, it is advisable to double-check them after running SWIG. Although there is a good chance that they will work, you may have to modify the interface file in certain cases.
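The effect of structure splitting can be checked with a small hand-written sketch. The declarations below are the split form shown earlier; the check_nested_access() helper is hypothetical. The key property is that Object_intRep_get() returns a pointer into the enclosing object, so a change made through it is visible in the original Object.

```cpp
#include <cassert>

// Split form of the nested declaration.
union Object_intRep {
    int    ivalue;
    double dvalue;
    char  *strvalue;
    void  *ptrvalue;
};

struct Object {
    int           objType;
    Object_intRep intRep;
};

// Accessors following the naming scheme described above.
Object_intRep *Object_intRep_get(Object *o) { return &o->intRep; }
int Object_intRep_ivalue_get(Object_intRep *o) { return o->ivalue; }
int Object_intRep_ivalue_set(Object_intRep *o, int value) {
    return (o->ivalue = value);
}

// Hypothetical helper: the equivalent of  $o->{intRep}->{ivalue} = 7
void check_nested_access() {
    Object o;
    o.objType = 0;
    Object_intRep *rep = Object_intRep_get(&o);
    Object_intRep_ivalue_set(rep, 7);
    assert(o.intRep.ivalue == 7);                 // the original object changed
    assert(Object_intRep_ivalue_get(rep) == 7);
}
```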
Starting with SWIG1.3, a number of improvements have been made to SWIG's code generator. Specifically, even though structure access has been described in terms of high-level accessor functions such as this,
    double Vector_x_get(Vector *v) {
        return v->x;
    }

most of the generated code is actually inlined directly into wrapper functions. Therefore, no function Vector_x_get() actually exists in the generated wrapper file. For example, when creating a Tcl module, the following function is generated instead:
    static int
    _wrap_Vector_x_get(ClientData clientData, Tcl_Interp *interp,
                       int objc, Tcl_Obj *CONST objv[]) {
        struct Vector *arg1 ;
        double result ;

        if (SWIG_GetArgs(interp, objc, objv,"p:Vector_x_get self ",&arg1,
                         SWIGTYPE_p_Vector) == TCL_ERROR)
            return TCL_ERROR;
        result = (double ) (arg1->x);
        Tcl_SetObjResult(interp,Tcl_NewDoubleObj((double) result));
        return TCL_OK;
    }

The only exception to this rule is methods defined with %addmethods. In this case, the added code is contained in a separate function.
Finally, it is important to note that most language modules may choose to build a more advanced interface. Although you may never use the low-level interface described here, most of SWIG's language modules use it in some way or another.
This section describes SWIG's low-level access to C++ declarations. In many instances, this low-level interface may be hidden by shadow classes or an alternative calling mechanism (this is usually language dependent and is described in detail in later chapters).
The following C++ features are not currently supported :
SWIG's C++ support has gradually been improved over the years so some of these limitations may be lifted in a future release. However, we make no promises.
    %module list
    %{
    #include "list.h"
    %}

    // Very simple C++ example for linked list

    class List {
    public:
        List();
        ~List();
        int search(char *value);
        void insert(char *);
        void remove(char *);
        char *get(int n);
        int length;
        static void print(List *l);
    };
When compiling C++ code, it is critical that SWIG be called with the `-c++' option. This changes the way a number of critical features such as memory management are handled. It also enables the recognition of C++ keywords. Without the -c++ flag, SWIG will either issue a warning or report a large number of syntax errors if it encounters C++ code in an interface file.
    List * new_List(void) {
        return new List;
    }
    void delete_List(List *l) {
        delete l;
    }

If a C++ class does not define any public constructors or destructors, SWIG will automatically create a default constructor or destructor. However, there are a few rules that define this behavior:
    %nodefault;   // Disable creation of constructor/destructor
    class Foo {
    ...
    };
    %makedefault;

Compatibility Note: The generation of default constructors/destructors was made the default behavior in SWIG 1.3.7. This may break certain older modules, but the old behavior can be easily restored using %nodefault or the -no_default command line option. Furthermore, in order for SWIG to properly generate (or not generate) default constructors, it must be able to gather information from both the private and protected sections (specifically, it needs to know if a private or protected constructor/destructor is defined). In older versions of SWIG, it was fairly common to simply remove or comment out the private and protected sections of a class due to parsing limitations. However, this removal may now cause SWIG to erroneously generate constructors for classes that define a constructor in those sections. Consider restoring those sections in the interface or using %nodefault to fix the problem.
    int List_search(List *obj, char *value) {
        return obj->search(value);
    }

This translation is the same even if the member function has been declared as virtual.
It should be noted that SWIG does not actually create a C accessor function in the code it generates. Instead, member access such as obj->search(value) is directly inlined into the generated wrapper functions. However, the name and calling convention of the wrappers match the accessor function prototype described above.
Usually, static members are accessed as functions whose names are formed by prepending the class name and an underscore to the member name. For example, List_print.
    int List_length_get(List *obj) {
        return obj->length;
    }
    int List_length_set(List *obj, int value) {
        obj->length = value;
        return value;
    }

A read-only member can be created using the %readonly and %readwrite directives. For example, we probably wouldn't want the user to change the length of a list so we could do the following to make the value available, but read-only.
    class List {
    public:
    ...
    %readonly
        int length;
    %readwrite
    ...
    };

Similarly, all data attributes declared as const are wrapped as read-only members.
By default, members of a class definition are assumed to be private until you explicitly give a `public:' declaration (This is the same convention used by C++).
    class Swig {
    public:
        enum {ALE, LAGER, PORTER, STOUT};
    };

Generates the following set of constants in the target scripting language:
    Swig_ALE = Swig::ALE
    Swig_LAGER = Swig::LAGER
    Swig_PORTER = Swig::PORTER
    Swig_STOUT = Swig::STOUT

Members declared as const are wrapped as read-only members and do not create constants.
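As a reminder of what values the enumeration constants above carry, the following sketch spells out the same mapping in C++. The Swig_* variables and the check_enum_constants() helper are written by hand here purely for illustration; in a real module the constants are created in the target scripting language rather than in C++.

```cpp
#include <cassert>

class Swig {
public:
    enum {ALE, LAGER, PORTER, STOUT};
};

// Illustrative stand-ins for the constants named Swig_ALE, Swig_LAGER, ...
const int Swig_ALE    = Swig::ALE;
const int Swig_LAGER  = Swig::LAGER;
const int Swig_PORTER = Swig::PORTER;
const int Swig_STOUT  = Swig::STOUT;

// Hypothetical helper: unnamed enumerators count up from zero.
void check_enum_constants() {
    assert(Swig_ALE == 0);
    assert(Swig_LAGER == 1);
    assert(Swig_PORTER == 2);
    assert(Swig_STOUT == 3);
}
```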
    class Foo {
    public:
        double bar(double &a);
    };
is accessed using a function similar to this:
    double Foo_bar(Foo *obj, double *a) {
        return obj->bar(*a);
    }

Functions that return a reference are remapped to return a pointer instead. For example:
    class Bar {
    public:
        double &spam();
    };

Generates code like this:
    double *Bar_spam(Bar *obj) {
        double &result = obj->spam();
        return &result;
    }

Don't return references to objects allocated as local variables on the stack. SWIG doesn't make a copy of the objects so this will probably cause your program to crash.
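The reference-to-pointer mapping can be tried out with a hand-written sketch. The method bodies, the value_ member, and the check_reference_wrappers() helper below are assumptions added so the example is complete; only the wrapper signatures follow the text above. Note that spam() returns a reference to a data member, which stays valid as long as the object does.

```cpp
#include <cassert>

class Foo {
public:
    // Body assumed for illustration: doubles the argument in place.
    double bar(double &a) { a *= 2.0; return a; }
};

class Bar {
    double value_ = 1.5;                   // hypothetical member
public:
    double &spam() { return value_; }      // refers to a member, not a local
};

// A reference argument becomes a pointer argument.
double Foo_bar(Foo *obj, double *a) {
    return obj->bar(*a);
}

// A reference return value becomes a pointer return value.
double *Bar_spam(Bar *obj) {
    double &result = obj->spam();
    return &result;
}

// Hypothetical helper exercising both wrappers.
void check_reference_wrappers() {
    Foo f;
    double a = 3.0;
    assert(Foo_bar(&f, &a) == 6.0);
    assert(a == 6.0);                      // caller's variable was modified

    Bar b;
    *Bar_spam(&b) = 4.0;                   // writes through to the member
    assert(b.spam() == 4.0);
}
```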
SWIG does not support private or protected inheritance (it is parsed, but it has no effect on the generated code). Note: private and protected inheritance do not define an "isa" relationship between classes so it would have no effect on type-checking anyway.
The following example shows how SWIG handles inheritance. For clarity, the full C++ code has been omitted.
    // shapes.i
    %module shapes
    %{
    #include "shapes.h"
    %}

    class Shape {
    public:
        double x,y;
        virtual double area() = 0;
        virtual double perimeter() = 0;
        void    set_location(double x, double y);
    };
    class Circle : public Shape {
    public:
        Circle(double radius);
        ~Circle();
        double area();
        double perimeter();
    };
    class Square : public Shape {
    public:
        Square(double size);
        ~Square();
        double area();
        double perimeter();
    };
When wrapped into Python, we can now perform the following operations :
    $ python
    >>> import shapes
    >>> circle = shapes.new_Circle(7)
    >>> square = shapes.new_Square(10)
    >>> print shapes.Circle_area(circle)
    153.93804004599999757
    >>> print shapes.Shape_area(circle)
    153.93804004599999757
    >>> print shapes.Shape_area(square)
    100.00000000000000000
    >>> shapes.Shape_set_location(square,2,-3)
    >>> print shapes.Shape_perimeter(square)
    40.00000000000000000
    >>>

In this example, Circle and Square objects have been created. Member functions can be invoked on each object by making calls to Circle_area, Square_area, and so on. However, the same results can be accomplished by simply using the Shape_area function on either object.
One important point concerning inheritance is that the low-level accessor functions are only generated for classes in which they are actually declared. For instance, in the above example, the method set_location() is only accessible as Shape_set_location() and not as Circle_set_location() or Square_set_location(). Of course, the Shape_set_location() function will accept any kind of object derived from Shape. Similarly, accessor functions for the attributes x and y are generated as Shape_x_get(), Shape_x_set(), Shape_y_get(), and Shape_y_set(). Functions such as Circle_x_get() are not available--instead you should use Shape_x_get().
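The reason a single Shape_area() function is enough can be seen in a small sketch of the low-level functions written by hand. The class bodies and the check_shape_accessors() helper are assumptions for illustration; what matters is that the base-class accessors take a Shape * and therefore accept any derived object, with virtual dispatch selecting the right implementation.

```cpp
#include <cassert>

class Shape {
public:
    double x = 0, y = 0;
    virtual ~Shape() {}
    virtual double area() = 0;
    virtual double perimeter() = 0;
    void set_location(double nx, double ny) { x = nx; y = ny; }
};

class Square : public Shape {
    double size_;                                  // hypothetical member
public:
    explicit Square(double size) : size_(size) {}
    double area() override { return size_ * size_; }
    double perimeter() override { return 4 * size_; }
};

// Low-level functions exist only for the class that declares the member.
double Shape_area(Shape *obj) { return obj->area(); }
double Shape_perimeter(Shape *obj) { return obj->perimeter(); }
void Shape_set_location(Shape *obj, double x, double y) { obj->set_location(x, y); }
double Shape_x_get(Shape *obj) { return obj->x; }
double Shape_y_get(Shape *obj) { return obj->y; }

// Hypothetical helper: a Square is passed wherever a Shape * is expected.
void check_shape_accessors() {
    Square square(10);
    assert(Shape_area(&square) == 100.0);
    assert(Shape_perimeter(&square) == 40.0);
    Shape_set_location(&square, 2, -3);
    assert(Shape_x_get(&square) == 2.0);
    assert(Shape_y_get(&square) == -3.0);
}
```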
Although the low-level C-like interface is functional, most language modules also produce a higher level OO interface using a technique known as shadow classing. This approach is described shortly and can be used to provide a more natural C++ interface.
Compatibility Note: Starting in version 1.3.7, SWIG only generates low-level accessor wrappers for the declarations that are actually defined in each class. This differs from SWIG1.1 which used to inherit all of the declarations defined in base classes and regenerate specialized accessor functions such as Circle_x_get(), Square_x_get(), Circle_set_location(), and Square_set_location(). This old behavior results in huge amounts of replicated code for large class hierarchies and makes it awkward to build applications spread across multiple modules (since accessor functions are duplicated in every single module). It is also unnecessary to have such wrappers when advanced features like shadow-classing are used. Future versions of SWIG may apply further optimizations such as not regenerating wrapper functions for virtual members that are already defined in a base class.
    class List {
    public:
        List();
    %name(ListSize) List(int maxsize);
        ~List();
        int search(char *value);
    %name(find)    void insert(char *);
    %name(delete)  void remove(char *);
        char *get(int n);
        int length;
        static void print(List *l);
    };

This will create the functions List_find, List_delete, and a function named new_ListSize for the overloaded constructor.
The %name directive can be applied to all members including constructors, destructors, static functions, data members, and enumeration values.
The class name prefix can also be changed by specifying
    %name(newname) class List {
    ...
    }

Although the %name() directive can be used to help deal with overloaded methods, it really doesn't work very well because it requires a lot of additional markup in your interface. Keep reading for a better solution.
In C++, functions and methods can be overloaded by declaring them with different type signatures. For example:
    void foo(int);
    void foo(double);
    void foo(Bar *b, Spam *s, int );

Later, when a call to function foo() is made, the determination of which function to invoke is made by looking at the types of the arguments. For example:
    int x;
    double y;
    Bar *b;
    Spam *s;
    int z;
    ...
    foo(x);         // Calls foo(int)
    foo(y);         // Calls foo(double)
    foo(b,s,z);     // Calls foo(Bar *, Spam *, int)

It is important to note that the selection of the overloaded method or function is made by the C++ compiler and occurs at compile time. It does not occur as your program runs.
Internal to the C++ compiler, overloaded functions are mapped to unique identifiers using a name-mangling technique where the arguments are used to create a unique type signature that is appended to the name. This produces three unique function names that might look like this:
    void foo__Fi(int);
    void foo__Fd(double);
    void foo__FP3BarP4Spami(Bar *, Spam *, int);

Calls to foo() are then mapped to an appropriate version depending on the types of arguments passed.
The implementation of overloaded methods in C++ is difficult to translate directly to a scripting language environment because it relies on static type-checking and compile-time binding of methods--neither of which maps to the dynamic environment of an interpreter. For example, in Python, Perl, and Tcl, it is simply impossible to define three entirely different versions of a function with exactly the same name within the same scope. The repeated definitions simply replace previous definitions.
Therefore, to solve the overloading problem, let's first look at several approaches that have been proposed as solutions, but which are NOT used to solve the overloading problem in SWIG.
    void foo(int);
    %name(foo_d) void foo(double);
    %name(foo_barspam) void foo(Bar *, Spam *, int);

Although this certainly works, it is extremely annoying to explicitly annotate every class with a bunch of %name directives like that. In fact, it's so annoying that this really isn't a viable solution at all (except in cases where there is very little overloading). Dave sincerely apologizes for ever thinking that this approach was good enough--however, let's try to forget the past and move on.
    foo__FP3BarP4Spami(b,s,i);

Needless to say, this approach is not used by SWIG nor has it ever been seriously considered.
    void foo(int);                   // becomes foo_i(int)
    void foo(double);                // becomes foo_d(double)
    void foo(Bar *, Spam *, int);    // becomes foo_BSi(Bar *, Spam *, int)

Although a lot more readable than the fully mangled version, this now has the problem of naming clashes. For instance, what is supposed to happen with these two functions?
    void foo(int i);                 // ?????
    void foo(instance *obj);         // ?????

Also, what happens if the mangled version happens to match a legitimate identifier name used elsewhere in the program? One could use the %name directive to resolve such a conflict, but this tends to defeat the whole point. Although this might work in simple cases, there are still a number of obvious problems.
    void foo(int);                   // becomes foo_1(int)
    void foo(double);                // becomes foo_2(double)
    void foo(Bar *, Spam *, int);    // becomes foo_3(Bar *, Spam *, int)

Unfortunately, the numbering doesn't give any clues about what the actual function is. Also, if the order changes or a new function is added, all of the numbers might change--breaking all of the programs written against the interface. There is also a tiny problem of naming methods with multiple inheritance:
    class X {
    public:
        virtual void foo(int);                  // X_foo_1
        virtual void foo(double);               // X_foo_2
    };

    class Y {
    public:
        virtual void foo(long);                 // Y_foo_1
        virtual void foo(Bar *, Spam *, int);   // Y_foo_2
    };

    class Z : public X, public Y {
    public:
        virtual void foo(double);               // Z_foo_1  ??? Mismatch X_foo_2
        virtual void foo(Bar *, Spam *, int);   // Z_foo_2  ???
        // What happens to X_foo_1 and Y_foo_1 here?
    };

In this case, the member functions have different names in the base class than they do in a derived class! Clearly this is just bizarre and not particularly obvious to someone who has to maintain the resulting code. Again, this doesn't seem to be a viable solution except in very simple cases.
    wrap_foo(args):
        if len(args) == 3:
            if (args[0].type == Bar and args[1].type == Spam and args[2].type == int):
                foo((Bar *) args[0], (Spam *) args[1], (int) args[2])
        else if len(args) == 1:
            if args[0].type == int:
                foo((int) args[0])
            else if args[0].type == double:
                foo((double) args[0])
        else:
            raise "Bad arguments to foo"

Unfortunately there are serious problems with this approach as well. First, the addition of dynamic dispatch code introduces a performance hit on the execution time of overloaded methods since the arguments to each method call have to first be examined to figure out which function to dispatch. Although the sample code above doesn't look too bad, this procedure may involve interaction with the SWIG type-checker, typemaps (a SWIG customization scheme), and other more advanced parts of the interpreter. A nastier problem has to do with functions that can accept the same type of scripting object. For example, if you have this,
    void foo(int);
    void foo(double);

the foo(double) function will probably accept both a scripting language integer and a floating point number as an argument. As a result, it's possible for the foo(double) function to hide the integer function foo(int) if arguments aren't checked in the correct order. For instance, if you switch the order of the two functions in the interface file, does foo(int) suddenly become unavailable? To deal with this problem, you might decide to make all of the overloaded functions additionally available through name mangling. However, that now introduces all of the problems of name mangling plus all of the problems of dynamic dispatch!
The bottom line is that even though some kind of dynamic dispatch scheme may be the "best" way to support overloading, it is difficult to implement and it has some serious shortcomings including performance, hiding of functions, and possibly poor interaction with some of SWIG's customization features.
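The hiding problem with dynamic dispatch can be reproduced in a few lines. The sketch below is not how SWIG works; ScriptNumber is a hypothetical stand-in for an interpreter value, and the two dispatch functions differ only in the order in which they test their candidates. Checking foo(double) first makes foo(int) unreachable, because every script number converts to double.

```cpp
#include <cassert>
#include <string>

// Hypothetical scripting-language number: integers and floats share one type.
struct ScriptNumber {
    bool   is_integer;
    double value;
};

std::string foo(int)    { return "foo(int)"; }
std::string foo(double) { return "foo(double)"; }

// A script integer converts to int; any script number converts to double.
bool converts_to_int(const ScriptNumber &n)   { return n.is_integer; }
bool converts_to_double(const ScriptNumber &) { return true; }

// Tests the more specific overload first.
std::string dispatch_int_first(const ScriptNumber &n) {
    if (converts_to_int(n))    return foo(static_cast<int>(n.value));
    if (converts_to_double(n)) return foo(n.value);
    return "no match";
}

// Tests foo(double) first: the foo(int) branch can never be reached.
std::string dispatch_double_first(const ScriptNumber &n) {
    if (converts_to_double(n)) return foo(n.value);
    if (converts_to_int(n))    return foo(static_cast<int>(n.value));
    return "no match";
}

// Hypothetical helper showing the difference for a script integer.
void check_dispatch_order() {
    ScriptNumber three = {true, 3.0};
    assert(dispatch_int_first(three) == "foo(int)");
    assert(dispatch_double_first(three) == "foo(double)");   // foo(int) is hidden
}
```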
    wrap_foo_i(args) {
        ...
        foo((int) args[0]);
        ...
    }
    wrap_foo_d(args) {
        ...
        foo((double) args[0]);
        ...
    }
    wrap_foo_BSi(args) {
        ...
        foo((Bar *) args[0], (Spam *) args[1], (int) args[2]);
        ...
    }

Next, a top-level wrapper could be written like this:
    wrap_foo(args) {
        if (wrap_foo_i(args) == SUCCESS) return SUCCESS;
        if (wrap_foo_d(args) == SUCCESS) return SUCCESS;
        if (wrap_foo_BSi(args) == SUCCESS) return SUCCESS;
        return ERROR, "No matching function foo";
    }

Like dynamic dispatch, this solution suffers from a performance penalty from trying to start the execution of each possible function. In fact, the impact may be worse since the only way to determine the proper function is to try all possibilities until no errors occur (dynamic dispatch could make more intelligent choices). Another problem is that a function might throw an ERROR for a different reason than improper arguments (maybe the arguments were okay, but something happened during execution). Therefore, you would need to have some kind of special error condition to indicate an error in argument conversion. A more subtle problem arises with languages such as Ruby and Perl that handle errors by executing a longjmp() to return control back to the interpreter (in which case, the above approach won't work like we want). Finally, making this approach work with inheritance and all of SWIG's customization options is also problematic.
Of all of the schemes mentioned, trial execution is the most likely feature that might be added to SWIG in the future. However, no such support is planned at this time.
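For comparison, here is a compact sketch of the trial execution idea. It is purely illustrative and is not something SWIG generates: Arg, Status, and the wrapper bodies are assumptions, and the error status here only ever signals a failed argument conversion, which sidesteps the ambiguity discussed above.

```cpp
#include <cassert>
#include <string>

// Hypothetical script argument: a number that is either an integer or a float.
struct Arg {
    bool   is_integer;
    double number;
};

enum class Status { Success, Error };

std::string foo(int)    { return "foo(int)"; }
std::string foo(double) { return "foo(double)"; }

// Each low-level wrapper reports Error when its argument conversion fails.
Status wrap_foo_i(const Arg &a, std::string &out) {
    if (!a.is_integer) return Status::Error;
    out = foo(static_cast<int>(a.number));
    return Status::Success;
}
Status wrap_foo_d(const Arg &a, std::string &out) {
    out = foo(a.number);
    return Status::Success;
}

// Top-level wrapper: try every candidate in turn until one succeeds.
Status wrap_foo(const Arg &a, std::string &out) {
    if (wrap_foo_i(a, out) == Status::Success) return Status::Success;
    if (wrap_foo_d(a, out) == Status::Success) return Status::Success;
    return Status::Error;
}

// Hypothetical helper showing which overload each argument reaches.
void check_trial_execution() {
    std::string out;
    assert(wrap_foo(Arg{true, 3.0}, out) == Status::Success);
    assert(out == "foo(int)");
    assert(wrap_foo(Arg{false, 2.5}, out) == Status::Success);
    assert(out == "foo(double)");
}
```

As with dynamic dispatch, the result depends on the order in which the candidates are tried.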
Although it would be nice to support an advanced wrapping technique such as dynamic dispatch or trial execution, both of these techniques are difficult (if not impossible) to implement in a completely general manner that would work in all situations and with all combinations of SWIG customization features. Therefore, rather than generate wrappers that only work some of the time, SWIG takes a slightly different approach.
Starting with SWIG-1.3.7, a very simple enhancement has been added to the %rename directive to help disambiguate overloaded functions and methods. Normally, the %rename directive is used to rename a declaration everywhere in an interface file. For example, if you write this,
    %rename(foo) bar;

all occurrences of "bar" will be renamed to "foo" (this feature was described a little earlier in this chapter in the section "Renaming Declarations"). By itself, this doesn't do anything to help fix overloaded methods. However, the %rename directive can now be parameterized as shown in this example:
    /* Forward renaming declarations */
    %rename(foo_i) foo(int);
    %rename(foo_d) foo(double);
    ...
    void foo(int);           // Becomes 'foo_i'
    void foo(char *c);       // Stays 'foo' (not renamed)

    class Spam {
    public:
        void foo(int);       // Becomes 'foo_i'
        void foo(double);    // Becomes 'foo_d'
        ...
    };

Since the %rename declaration is used to declare a renaming in advance, it can be placed at the start of an interface file. This makes it possible to apply a consistent name resolution without having to modify header files. For example:
    %module foo

    /* Rename these overloaded functions */
    %rename(foo_i) foo(int);
    %rename(foo_d) foo(double);

    %include "header.h"

When used in this simple form, the renaming is applied to all global functions and member functions that match the prototype. If you only want the renaming to apply to a certain scope, the C++ scope resolution operator (::) can be used. For example:
    %rename(foo_i) ::foo(int);      // Only rename foo(int) in the global scope.
                                    // (will not rename class members)

    %rename(foo_i) Spam::foo(int);  // Only rename foo(int) in class Spam

When a renaming operator is applied to a class as in Spam::foo(int), it is applied to that class and all derived classes. This can be used to apply a consistent renaming across an entire class hierarchy with only a few declarations. For example:
    %rename(foo_i) Spam::foo(int);
    %rename(foo_d) Spam::foo(double);

    class Spam {
    public:
        virtual void foo(int);      // Renamed to foo_i
        virtual void foo(double);   // Renamed to foo_d
        ...
    };

    class Bar : public Spam {
    public:
        virtual void foo(int);      // Renamed to foo_i
        virtual void foo(double);   // Renamed to foo_d
        ...
    };

    class Grok : public Bar {
    public:
        virtual void foo(int);      // Renamed to foo_i
        virtual void foo(double);   // Renamed to foo_d
        ...
    };

Depending on your application, it may make more sense to include %rename specifications in the class definition. For example:
    class Spam {
        %rename(foo_i) foo(int);
        %rename(foo_d) foo(double);
    public:
        virtual void foo(int);      // Renamed to foo_i
        virtual void foo(double);   // Renamed to foo_d
        ...
    };

    class Bar : public Spam {
    public:
        virtual void foo(int);      // Renamed to foo_i
        virtual void foo(double);   // Renamed to foo_d
        ...
    };

In this case, the %rename directives still get applied across the entire inheritance hierarchy, but it's no longer necessary to explicitly specify the class prefix Spam::.
A special form of %rename can be used to apply a renaming just to class members (of all classes):
    %rename(foo_i) *::foo(int);   // Only rename foo(int) if it appears in a class.

Note: the *:: syntax is non-standard C++, but the '*' is meant to be a wildcard that matches any class name (we couldn't think of a better alternative so if you have a better idea, send email to swig-dev@cs.uchicago.edu).
Although the %rename approach does not automatically solve the overloading problem for you (you have to supply a name), SWIG's error messages have been improved to help. For example, consider this interface file:
    %module foo

    class Spam {
    public:
        void foo(int);
        void foo(double);
        void foo(Bar *, Spam *, int);
    };

If you run SWIG on this file, you will get the following error messages:
    foo.i:6. Overloaded declaration ignored.  Spam::foo(double )
    foo.i:5. Previous declaration is Spam::foo(int )
    foo.i:7. Overloaded declaration ignored.  Spam::foo(Bar *,Spam *,int )
    foo.i:5. Previous declaration is Spam::foo(int )

The error messages indicate the problematic functions along with their type signature. In addition, the previous definition is supplied. Therefore, you can just look at these errors and decide how you want to handle the overloaded functions. For example:
    %module foo
    %rename(foo_d)       Spam::foo(double);              // name foo_d
    %rename(foo_barspam) Spam::foo(Bar *, Spam *, int);  // name foo_barspam
    ...
    class Spam {
    ...
    };

And again, for a class hierarchy, you may be able to solve all of the problems by just renaming members in the base class--those renamings automatically propagate to all derived classes.
Another way to resolve overloaded methods is to simply eliminate conflicting definitions. An easy way to do this is to use the %ignore directive. %ignore works exactly like %rename except that it forces a declaration to disappear. For example:
    %ignore foo(double);          // Ignore all foo(double)
    %ignore Spam::foo;            // Ignore foo in class Spam
    %ignore Spam::foo(double);    // Ignore foo(double) in class Spam
    %ignore *::foo(double);       // Ignore foo(double) in all classes

When applied to a base class, %ignore forces all definitions in derived classes to disappear. For example, %ignore Spam::foo(double) will eliminate foo(double) in Spam and all classes derived from Spam.
A few implementation notes about the enhanced %rename directive and %ignore:
The scope qualifier (::) can also be applied to simple names:

    %rename(bar) ::foo;       // Rename foo to bar in global scope only
    %rename(bar) Spam::foo;   // Rename foo to bar in class Spam only
    %rename(bar) *::foo;      // Rename foo in classes only

The order in which %rename directives appear does not matter. There is no difference between this

    %rename(bar) foo;
    %rename(foo_i) Spam::foo(int);
    %rename(Foo) Spam::foo;

and this

    %rename(Foo) Spam::foo;
    %rename(bar) foo;
    %rename(foo_i) Spam::foo(int);

(the declarations are not stored in a linked list and order has no importance). Of course, a repeated %rename directive will change the setting for a previous %rename directive if exactly the same name, scope, and parameters are supplied.

Given a class like this

    class Foo {
    public:
       ...
       void bar() const;
       ...
    };

the declaration %rename(name) Foo::bar() applies to the qualified member bar() const. However, an often overlooked C++ feature is that classes can define two different overloaded members that differ only in their qualifiers, like this:

    class Foo {
    public:
       ...
       void bar();            // Unqualified member
       void bar() const;      // Qualified member (OK)
       ...
    };

Even when renaming is used, this still generates an error (both bar() methods will be renamed to the same thing). However, if you want to silence the errors, %rename and %ignore can be further specialized with qualifiers. For example, the following directive would tell SWIG to ignore the const version of bar() above:

    %ignore Foo::bar() const;    // Ignore bar() const, but leave other bar() alone
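The reason the qualifier matters is that a C++ compiler treats bar() and bar() const as two distinct members and picks one based on the constness of the object. The following is a minimal plain C++ sketch of that rule, not SWIG output; the string return values and the helper functions call_on_mutable and call_on_const are hypothetical names added purely for illustration.

```cpp
#include <string>

// Hypothetical class with two overloads that differ only in the const qualifier.
class Foo {
public:
    std::string bar()       { return "bar()"; }        // unqualified member
    std::string bar() const { return "bar() const"; }  // const-qualified member
};

// Illustrative helpers: the constness of the reference decides which overload runs.
std::string call_on_mutable(Foo &f)     { return f.bar(); }
std::string call_on_const(const Foo &f) { return f.bar(); }
```

Because a scripting language typically exposes only one method per name, one of the two members has to be renamed or ignored, which is what the qualified forms of %rename and %ignore are for.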
Consider a class like this:

    class Complex {
    private:
      double rpart, ipart;
    public:
      Complex(double r = 0, double i = 0) : rpart(r), ipart(i) { }
      Complex(const Complex &c) : rpart(c.rpart), ipart(c.ipart) { }
      Complex &operator=(const Complex &c) {
        rpart = c.rpart;
        ipart = c.ipart;
        return *this;
      }
      Complex operator+(const Complex &c) const {
        return Complex(rpart+c.rpart, ipart+c.ipart);
      }
      Complex operator-(const Complex &c) const {
        return Complex(rpart-c.rpart, ipart-c.ipart);
      }
      Complex operator*(const Complex &c) const {
        return Complex(rpart*c.rpart - ipart*c.ipart,
                       rpart*c.ipart + c.rpart*ipart);
      }
      Complex operator-() const {
        return Complex(-rpart, -ipart);
      }
      double re() const { return rpart; }
      double im() const { return ipart; }
    };

When operator declarations appear, they are handled in exactly the same manner as regular methods. However, the names of these methods are set to strings like "operator +" or "operator -". The problem with these names is that they are illegal identifiers in most scripting languages. For instance, you can't just create a method called "operator +" in Python--there won't be any way to call it.

Some language modules already know how to automatically handle certain operators (mapping them into operators in the target language). However, the underlying implementation of this is really managed in a very general way using the %rename directive. For example, in Python a declaration similar to this is used:

    %rename(__add__) Complex::operator+;

This binds the + operator to a method called __add__ (which is conveniently the same name used to implement the Python + operator). Internally, the generated wrapper code for a wrapped operator will look something like this pseudocode:

    _wrap_Complex___add__(args) {
       ... get args ...
       obj->operator+(args);
       ...
    }

When used in the target language, it may now be possible to use the overloaded operator normally. For example:

    >>> a = Complex(3,4)
    >>> b = Complex(5,2)
    >>> c = a + b           # Invokes __add__ method

It is important to realize that there is nothing magical happening here. The %rename directive really only picks a valid method name. If you wrote this:

    %rename(add) operator+;

The resulting scripting interface might work like this:

    a = Complex(3,4)
    b = Complex(5,2)
    c = a.add(b)      # Call a.operator+(b)

All of the techniques described to deal with overloaded functions also apply to operators. For example:

    %ignore Complex::operator=;             // Ignore = in class Complex
    %ignore *::operator=;                   // Ignore = in all classes
    %ignore operator=;                      // Ignore = everywhere.

    %rename(__sub__) Complex::operator-;
    %rename(__neg__) Complex::operator-();  // Unary -

The last part of this example illustrates how multiple definitions of the operator- method might be handled.
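To see why the binary and unary forms of operator- need different names, it can help to write the two calls out explicitly in C++. The sketch below is an illustration only, not the code SWIG generates; the functions Complex_sub and Complex_neg are hypothetical stand-ins for wrappers, and the class is a trimmed-down version of the Complex class used in this section.

```cpp
// Trimmed-down Complex class with both forms of operator-.
class Complex {
    double rpart, ipart;
public:
    Complex(double r = 0, double i = 0) : rpart(r), ipart(i) { }
    Complex operator-(const Complex &c) const {       // binary minus
        return Complex(rpart - c.rpart, ipart - c.ipart);
    }
    Complex operator-() const {                       // unary minus
        return Complex(-rpart, -ipart);
    }
    double re() const { return rpart; }
    double im() const { return ipart; }
};

// Hypothetical wrapper stand-ins: each one has an ordinary identifier as its
// name and forwards to exactly one overload, selected by the argument list.
Complex Complex_sub(const Complex &a, const Complex &b) { return a.operator-(b); }
Complex Complex_neg(const Complex &a)                   { return a.operator-(); }
```

The parameter list is the only thing that tells the two operators apart, which is why the unary form is written as Complex::operator-() in the %rename directive.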
Handling operators in this manner is mostly straightforward. However, there are a few subtle issues to keep in mind:
Consider an operator that is declared as a friend:

    class Complex {
    public:
      friend Complex operator+(Complex &, double);
    };
    Complex operator+(Complex &, double);

SWIG simply ignores all friend declarations. Furthermore, it has no way to associate the non-member operator+ with the class (because it's not a member of the class).

It's still possible to make a wrapper for this operator, but you'll have to handle it like a normal function. For example:

    %rename(add_complex_double) operator+(Complex &, double);
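In plain C++ terms, a non-member operator is just a function with an unusual name, so giving it an ordinary name amounts to the following. This is a hedged sketch rather than SWIG's generated wrapper; the data members, the accessors re() and im(), and the body of operator+ are assumptions added so the example compiles on its own.

```cpp
class Complex {
    double rpart, ipart;
public:
    Complex(double r = 0, double i = 0) : rpart(r), ipart(i) { }
    double re() const { return rpart; }
    double im() const { return ipart; }
    friend Complex operator+(Complex &, double);   // grants access, declares no member
};

// Non-member operator: there is no implicit 'this', both operands are parameters.
Complex operator+(Complex &c, double d) {
    return Complex(c.rpart + d, c.ipart);
}

// Ordinary function with the name chosen in the %rename directive above;
// it forwards to the operator using explicit function-call syntax.
Complex add_complex_double(Complex &c, double d) {
    return operator+(c, d);
}
```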
    %module vector
    %{
    #include "vector.h"
    %}

    class Vector {
    public:
      double x,y,z;
      Vector();
      ~Vector();
      ... bunch of C++ methods ...
      %addmethods {
        char *__str__() {
          static char temp[256];
          /* 'self' refers to the Vector object the method is invoked on */
          sprintf(temp,"[ %g, %g, %g ]", self->x,self->y,self->z);
          return &temp[0];
        }
      }
    };

This code adds a __str__ method to our class for producing a string representation of the object. In Python, such a method would allow us to print the value of an object using the print command.

    >>> v = Vector()
    >>> v.x = 3
    >>> v.y = 4
    >>> v.z = 0
    >>> print(v)
    [ 3, 4, 0 ]
    >>>

The %addmethods directive follows all of the same conventions as its use with C structures.
Template type names can appear in ordinary declarations:

    void foo(vector<int> *a, int n);

Starting with SWIG-1.3.7, simple C++ template declarations can also be easily wrapped. For example, consider the following template class declaration:

    // File : list.h
    template<class T> class List {
    private:
        T *data;
        int nitems;
        int maxitems;
    public:
        List(int max) {
          data = new T [max];
          nitems = 0;
          maxitems = max;
        }
        ~List() {
          delete [] data;
        };
        void append(T obj) {
          if (nitems < maxitems) {
            data[nitems++] = obj;
          }
        }
        int length() {
          return nitems;
        }
        T get(int n) {
          return data[n];
        }
    };

By itself, this template declaration is useless--SWIG simply ignores it because it doesn't know how to generate any code unless a definition of T is provided.

To create wrappers for a specific template instantiation, use the %template directive like this:

    /* Instantiate a few different versions of the template */
    %template(intList) List<int>;
    %template(doubleList) List<double>;

The argument to %template() is the name of the instantiation in the target language. Most target languages do not recognize identifiers such as List<int>. Therefore, each instantiation of a template has to be associated with a nicely formatted identifier such as intList or doubleList. Furthermore, due to the details of the underlying implementation, the name you select has to be unused in both C++ and the target scripting language (e.g., the name must not match any existing C++ typename, class name, or declaration name).

Since most C++ compilers are nothing more than glorified preprocessors and C++ purists really hate macros, SWIG internally handles templates by converting them into macros and performing expansions using the preprocessor. Specifically, the %template(intList) List<int> declaration results in a macro expansion that generates the following code (which is then parsed to create the interface):

    // Example of how templates are internally expanded by SWIG
    %{
    // Define a nice name for the instantiation
    typedef List<int> intList;
    %}
    // Provide a simple class definition with types filled in
    class intList {
    private:
        int *data;
        int nitems;
        int maxitems;
    public:
        intList(int max) {
          data = new int [max];
          nitems = 0;
          maxitems = max;
        }
        ~intList() {
          delete [] data;
        };
        void append(int obj) {
          if (nitems < maxitems) {
            data[nitems++] = obj;
          }
        }
        int length() {
          return nitems;
        }
        int get(int n) {
          return data[n];
        }
    };

SWIG can also generate wrappers for function templates using a similar technique. For example:

    // Function template
    template<class T> T max(T a, T b) { return a > b ? a : b; }

    // Make some different versions of this function
    %template(maxint) max<int>;
    %template(maxdouble) max<double>;

In this case, maxint and maxdouble become unique names for specific instantiations of the function.
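Seen from the C++ side, naming an instantiation is roughly equivalent to writing a plain function that forwards to one specific version of the template. The sketch below assumes exactly that and is not what SWIG emits; the namespace demo is an addition of this example, used only so the template cannot collide with a library max.

```cpp
namespace demo {
// The function template from the example above.
template<class T> T max(T a, T b) { return a > b ? a : b; }
}

// One ordinary, uniquely named function per instantiation.
int    maxint(int a, int b)          { return demo::max<int>(a, b); }
double maxdouble(double a, double b) { return demo::max<double>(a, b); }
```

Each name maps to a single, fully typed function, which is the form a scripting language can call without knowing anything about templates.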
If your goal is to make someone's head explode more than usual, SWIG directives such as %name and %addmethods can be included directly in template definitions. Not only that, since SWIG has the advantage of using the preprocessor for template expansion, standard C preprocessor operators such as # and ## can be applied to template parameters (an obvious oversight of the C++ standard that SWIG now corrects). For example:

    // File : list.h
    template<class T> class List {
       ...
    public:
        List(int max);
        ~List();
        ...
        %name(getitem) T get(int index);
        %addmethods {
            char *__str__() {
                /* Make a string representation */
                ...
            }
            /* Return actual type of template instantiation as a string */
            char *ttype() {
                return #T;
            }
        }
    };

In this example, the extra SWIG directives are propagated to every template instantiation.
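The # and ## operators behave here as they do in any C preprocessor macro, which can be tried out without SWIG at all. The following sketch uses an ordinary macro in place of a template; DEFINE_LIST, the item member, and the generated get_ accessor are hypothetical names chosen for this illustration.

```cpp
#include <string>

// Hypothetical macro standing in for a macro-expanded template.
//   #T         turns the type parameter into a string literal
//   get_##NAME pastes two tokens together to form a new identifier
#define DEFINE_LIST(T, NAME)                         \
    struct NAME {                                    \
        T item;                                      \
        static const char *ttype() { return #T; }    \
        T get_##NAME() const { return item; }        \
    };

DEFINE_LIST(int, intList)        // defines struct intList with get_intList()
DEFINE_LIST(double, doubleList)  // defines struct doubleList with get_doubleList()
```

After expansion, intList::ttype() yields the text of the type argument as it was written in the macro call.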
In addition, the %addmethods directive can be used to add additional methods to a specific instantiation. For example:

    %template(intList) List<int>;

    %addmethods intList {
        void blah() {
            printf("Hey, I'm an intList!\n");
        }
    };

Needless to say, SWIG's template support provides plenty of opportunities to break the universe. That said, an important final point to note is that SWIG performs no extensive error checking of templates! Specifically, SWIG does not perform type checking nor does it check to see if the actual contents of the template declaration make any sense. Since the C++ compiler (or is it a preprocessor?) will definitely check this when it compiles the resulting wrapper file, there is no practical reason for SWIG to duplicate this functionality (besides, none of the SWIG developers are masochistic enough to want to implement this).
Finally, there are a few limitations in SWIG's current support for templates:
class List<int> { ... };
    double do_op(Object *o, double (Object::*callback)(double,double));
    extern double (Object::*fooptr)(double,double);
    %constant double (Object::*FOO)(double,double) = &Object::foo;

Although these kinds of pointers can be parsed and represented by the SWIG type system, few language modules know how to handle them due to implementation differences from standard C pointers. Readers are strongly advised to consult an advanced text such as "The Annotated C++ Reference Manual" for specific details.

When pointers to members are supported, the pointer value might appear as a special string like this:

    >>> print example.FOO
    _ff0d54a800000000_m_Object__f_double_double__double
    >>>

In this case, the hexadecimal digits represent the entire value of the pointer, which on most machines is the contents of a small C++ structure.
SWIG's type-checking mechanism is also more limited when working with member pointers. Normally SWIG tries to keep track of inheritance when checking types. However, no such support is currently provided for member pointers.
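For readers who have not used member pointers before, the following plain C++ sketch shows how such a callback is declared and invoked. It is an illustration under stated assumptions: the bodies of foo and mul, the BinaryOp typedef, the two extra double parameters on do_op, and member_pointer_size are all additions of this example rather than part of the interface shown above.

```cpp
#include <cstddef>

class Object {
public:
    double foo(double a, double b) { return a + b; }   // assumed body
    double mul(double a, double b) { return a * b; }   // assumed body
};

// Pointer to an Object member function that takes two doubles.
typedef double (Object::*BinaryOp)(double, double);

// Invoke the member function through the pointer using the ->* operator.
double do_op(Object *o, BinaryOp callback, double a, double b) {
    return (o->*callback)(a, b);
}

// The size is implementation-defined and is often larger than a plain pointer,
// which is why the value cannot be handled like an ordinary C pointer.
std::size_t member_pointer_size() { return sizeof(BinaryOp); }
```

A call such as do_op(&obj, &Object::foo, 2.0, 3.0) selects the member at run time.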
    class Foo {
    public:
    #ifndef SWIG
       class Bar {
       public:
         ...
       };
    #endif
       Foo();
       ~Foo();
       ...
    };
Also, as a rule of thumb, SWIG should not be used on raw C++ source files.
Although SWIG knows how to correctly deal with const in its internal type system and it knows how to generate wrappers that are free of const-related warnings, SWIG does not make any attempt to preserve const-correctness in the target language. Thus, it is possible to pass const qualified objects to non-const methods and functions. For example, consider the following code in C++:

    const Object * foo();
    void bar(Object *);

    ...
    // C++ code
    void blah() {
       bar(foo());         // Error: bar discards const
    }

Now, consider the behavior when wrapped into a Python module:

    >>> bar(foo())         # Okay
    >>>

Although this is clearly a violation of the C++ type-system, fixing the problem doesn't seem to be worth the added implementation complexity that would be required to support it in the SWIG run-time type system. There are no plans to change this in future releases (although we'll never rule anything out entirely).
The bottom line is that this particular issue does not appear to be a problem for most SWIG projects. Of course, you might want to consider using another tool if maintaining constness is the most important part of your project.
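The practical effect can be pictured in C++ as a call that silently drops the const qualifier. The sketch below only illustrates that effect and makes no claim about how SWIG's wrappers are written; the Object class, its value member, the bodies of foo and bar, and the helper bar_from_script are all hypothetical.

```cpp
// Hypothetical object type with one piece of mutable state.
class Object {
public:
    int value;
    Object() : value(0) { }
};

static Object the_object;                     // shared instance handed out by foo()

const Object *foo()  { return &the_object; }  // promises read-only access
void bar(Object *o)  { o->value += 1; }       // modifies its argument

// Roughly the effect of calling bar(foo()) from a scripting language:
// the const qualifier is discarded before bar() runs.
void bar_from_script(const Object *o) {
    bar(const_cast<Object *>(o));
}
```

In this sketch the underlying object was never declared const, so the modification is well defined; with a genuinely const object the same call would be undefined behavior, which is the risk the text above describes.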
    %runtime %{
       ... code in runtime section ...
    %}

    %header %{
       ... code in header section ...
    %}

    %wrapper %{
       ... code in wrapper section ...
    %}

    %init %{
       ... code in init section ...
    %}

The bare %{ ... %} directive is a shortcut that is the same as %header %{ ... %}.
Everything in a code insertion block is copied verbatim into the output file and is not parsed by SWIG. Most SWIG input files have at least one such block to include header files and support C code. Additional code blocks may be placed anywhere in a SWIG file as needed.
    %module mymodule
    %{
    #include "my_header.h"
    %}
    ... Declare functions here
    %{

    void some_extra_function() {
      ...
    }
    %}
A common use for code blocks is to write "helper" functions. These are functions that are used specifically for the purpose of building an interface, but which are generally not visible to the normal C program. For example :
    %{
    /* Create a new vector */
    static Vector *new_Vector() {
      return (Vector *) malloc(sizeof(Vector));
    }
    %}
    // Now wrap it
    Vector *new_Vector();

Because helper functions are so common, the same thing can be written in an inlined form:

    %inline %{
    /* Create a new vector */
    Vector *new_Vector() {
      return (Vector *) malloc(sizeof(Vector));
    }
    %}

The %inline directive inserts all of the code that follows verbatim into the header portion of an interface file. The code is then parsed by both the SWIG preprocessor and parser. Thus, the above example creates a new command new_Vector using only one declaration. Since the code inside an %inline %{ ... %} block is given to both the C compiler and SWIG, it is illegal to include any SWIG directives inside a %{ ... %} block.
%init %{ init_variables(); %}
    %include "pointer.i"

Unlike #include, %include includes each file only once (and will not reload the file on subsequent %include declarations). Therefore, it is not necessary to use include-guards in SWIG interfaces.

By default, #include is ignored unless you run SWIG with the -includeall option. The reason for ignoring traditional includes is that you often don't want SWIG to try and wrap everything included in system headers and auxiliary files.

    %import "foo.i"

The purpose of %import is to collect certain information from another SWIG interface file or a header file without actually generating any wrapper code. Such information generally includes type declarations (e.g., typedef) as well as C++ classes that might be used as base-classes for class declarations in the interface. The use of %import is also important when SWIG is used to generate extensions as a collection of related modules. This is an advanced topic and is described in a later chapter.

The -importall option tells SWIG to follow all #include statements as imports. This might be useful if you want to extract type definitions from system header files without generating any wrappers.
    SWIG                Always defined when SWIG is processing a file
    SWIGTCL             Defined when using Tcl
    SWIGTCL8            Defined when using Tcl8.0
    SWIGPERL            Defined when using Perl
    SWIGPERL5           Defined when using Perl5
    SWIGPYTHON          Defined when using Python
    SWIGGUILE           Defined when using Guile
    SWIGRUBY            Defined when using Ruby
    SWIGJAVA            Defined when using Java
    SWIGMZSCHEME        Defined when using Mzscheme
    SWIGWIN             Defined when running SWIG under Windows
    SWIGMAC             Defined when running SWIG on the Macintosh

In addition, SWIG defines the following set of standard C/C++ macros:

    __LINE__            Current line number
    __FILE__            Current file name
    __STDC__            Defined to indicate ANSI C
    __cplusplus         Defined when -c++ option used

Interface files can look at these symbols as necessary to change the way in which an interface is generated or to mix SWIG directives with C code. These symbols are also defined within the C code generated by SWIG (except for the symbol `SWIG' which is only defined within the SWIG compiler).
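Since the language symbols are also visible to the compiled wrapper code, support code can branch on them with ordinary conditional compilation. The following is a small sketch of that idea; the function target_language and the strings it returns are hypothetical and exist only for this example.

```cpp
#include <string>

// Report which target language this translation unit was generated for.
// Only symbols listed in the table above are tested; when none of them is
// defined (for example, when compiled outside of SWIG), the function
// falls through to "none".
std::string target_language() {
#if defined(SWIGPYTHON)
    return "python";
#elif defined(SWIGTCL)
    return "tcl";
#elif defined(SWIGPERL)
    return "perl";
#else
    return "none";
#endif
}
```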
For example, if a header contains an include guard like this

    #ifndef _FOO_H
    #define _FOO_H 1
    ...
    #endif

you may get some extra constants such as _FOO_H showing up in the scripting interface.

More complex macros can be defined in the standard way. For example:

    #define EXTERN extern
    #ifdef __STDC__
    #define _ANSI(args)   (args)
    #else
    #define _ANSI(args) ()
    #endif

The following operators can appear in macro definitions:

    %define ARRAYHELPER(type,name)
    %inline %{
    type *new_ ## name (int nitems) {
       return (type *) malloc(sizeof(type)*nitems);
    }
    void delete_ ## name(type *t) {
       free(t);
    }
    type name ## _get(type *t, int index) {
       return t[index];
    }
    void name ## _set(type *t, int index, type val) {
       t[index] = val;
    }
    %}
    %enddef

    ARRAYHELPER(int, IntArray)
    ARRAYHELPER(double, DoubleArray)

The primary purpose of %define is to define large macros of code. Unlike normal C preprocessor macros, it is not necessary to terminate each line with a continuation character (\)--the macro definition extends to the first occurrence of %enddef. Furthermore, when such macros are expanded, they are reparsed through the C preprocessor. Thus, SWIG macros can contain all other preprocessor directives except for nested %define statements.
The SWIG macro capability is a very quick and easy way to generate large amounts of code. In fact, many of SWIG's advanced features and libraries are built using this mechanism (such as C++ template support).
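For comparison, here is what the ARRAYHELPER macro looks like when written as an ordinary C preprocessor macro, where every line except the last needs a continuation character. This is a plain C++ sketch of the equivalent expansion, not SWIG input; apart from the use of std::malloc and std::free, the function bodies are the ones from the %define example.

```cpp
#include <cstdlib>

// Ordinary preprocessor version: note the trailing backslashes that
// %define ... %enddef makes unnecessary.
#define ARRAYHELPER(type, name)                                  \
    type *new_##name(int nitems) {                               \
        return (type *) std::malloc(sizeof(type) * nitems);      \
    }                                                            \
    void delete_##name(type *t) { std::free(t); }                \
    type name##_get(type *t, int index) { return t[index]; }     \
    void name##_set(type *t, int index, type val) { t[index] = val; }

// Each use generates four functions, e.g. new_IntArray, delete_IntArray,
// IntArray_get and IntArray_set.
ARRAYHELPER(int, IntArray)
ARRAYHELPER(double, DoubleArray)
```

A caller would allocate with new_IntArray(n), read and write elements with IntArray_get and IntArray_set, and release the memory with delete_IntArray.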
For example, in a code block such as

    %{
    #ifdef NEED_BLAH
    int blah() {
       ...
    }
    #endif
    %}

the contents of the %{ ... %} block are copied without modification to the output (including all preprocessor directives).

Code that is not enclosed in a %{ ... %} block is treated differently. Consider:

    %addmethods Foo {
       void bar() {
          #ifdef DEBUG
           printf("I'm in bar\n");
          #endif
       }
    }

By default, SWIG will interpret the #ifdef DEBUG statement. However, if you really wanted that code to actually go into the wrapper file, prefix the preprocessor directives with % like this:

    %addmethods Foo {
       void bar() {
          %#ifdef DEBUG
           printf("I'm in bar\n");
          %#endif
       }
    }

SWIG will strip the extra % and leave the preprocessor directive in the code.
Although this may sound complicated, the process turns out to be fairly easy once you get the hang of it.
In the process of building an interface, SWIG may encounter syntax errors or other problems. The best way to deal with this is to simply copy the offending code into a separate interface file and edit it. However, the SWIG developers have worked very hard to improve the SWIG parser--you should report parsing errors to swig-dev@cs.uchicago.edu or to the SWIG bug tracker on www.swig.org.
    /* File : header.h */

    #include <stdio.h>
    #include <math.h>

    extern int foo(double);
    extern double bar(int, int);
    extern void dump(FILE *f);

A typical SWIG interface file for this header file would look like the following:

    /* File : interface.i */
    %module mymodule
    %{
    #include "header.h"
    %}
    extern int foo(double);
    extern double bar(int, int);
    extern void dump(FILE *f);

Of course, in this case, our header file is pretty simple so we could have made an interface file like this as well:

    /* File : interface.i */
    %module mymodule
    %include header.h
Naturally, your mileage may vary.
    %module graphics
    %{
    #include <GL/gl.h>
    #include <GL/glu.h>
    %}

    // Put rest of declarations here
    ...
Getting rid of main() may leave parts of a program uninitialized. To handle this problem, you may consider writing a special function called program_init() that initializes your program upon startup. This function could then be called either from the scripting language as the first operation, or when the SWIG generated module is loaded.
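A minimal sketch of such a function is shown below. Everything except the name program_init() is an assumption made for illustration: the verbosity setting stands in for whatever state main() used to set up, and the guard flag simply makes the function safe to call more than once (for instance, once when the module is loaded and again from a script).

```cpp
// Hypothetical program state that main() used to initialize.
static bool initialized = false;
static int  verbosity   = 0;

// Replacement for the setup work formerly done in main().
// Calling it repeatedly has no further effect after the first call.
void program_init() {
    if (initialized) return;
    verbosity   = 1;     // assumed default that main() would have set
    initialized = true;
}

// Accessor so the state can be inspected from a script.
int get_verbosity() { return verbosity; }
```

To run it at load time, the call could be placed in an %init block in the interface file; to run it on demand, program_init() can be wrapped like any other function.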
As a general note, many C programs only use the main() function to parse command line options and to set parameters. However, by using a scripting language, you are probably trying to create a program that is more interactive. In many cases, the old main() program can be completely replaced by a Perl, Python, or Tcl script.