# Hey, Researcher! Open Source Tools For You!

**Disclaimer:**

Your author is writing this article after
having just finished his latest task - conducting numerous experiments
which involved processing a lot of data, and finally writing the paper on
it in just under five days. He wonders how he would do it without some of
the tools he will be talking about in this article, leaving aside the
pleasure of working on Linux systems. His views are biased and are meant to
be taken *cum grano salis*, but I hope you will give him a fair
chance.

### What am I addressing in this article?

My meeting with Linux happened accidentally 10 years ago, with the fun of
wiping out and putting in a Linux distribution numerous times on my first
computer without having any clue about how to work on it. Finally, after a
lot of such iterations I finally settled in, working on it full
time for the last 5 years. The accident that happened 10 years stood me in
good stead both when I was working at a software company and now when I'm
working in a research lab. In a research lab, we need to plot figures, run
batch jobs remotely, perform scientific and numerical computing, and last
but not least, publish results by writing papers. Just a few days before I
wrote to the Editor-in-Chief about this article idea, I helped a Windows
friend draw some plots (fixing them for publication) for her paper,
wondering all the while "Jeez, how do Windows users do it?" Most tools that
I used were primarily for Linux. On Linux, things are so much easier. You
can focus on the real job instead of bothering about setting up your tools.
This article is therefore an unabashed claim about the coolness of working
with these tools while you focus on your research. Some of these tools can
be set up on a non-Linux system too; however, I shall address their usage
on a Linux system only and make no attempt to cover any non-Linux issues.
The primary focus here is to introduce my readers to these tools, and not
to convert them to Linux. *It would, in fact, be a good exercise to get
all these tools that I talk about here on Windows!*

In this article, I shall take a look at (in no particular order) getting up
and on with LaTex, *gnuplot*, *GNU octave*, Python scientific
libraries, *Beamer*, and some other tools and libraries.

### Hello Linux

Before I start talking about using the scientific tools, I shall write a
few lines about how can you set up a Linux system for yourself - if you
don’t already have one, that is. While bare-metal Linux installations
are the best case scenario, it may not always be possible because your
primary work may be on Windows. In such cases, the best solution is to go
for a *virtual Linux installation*. In my humble opinion, I have
found using VirtualBox to be a
breeze when you need to set up a virtual machine. Just download a Linux
distribution like Ubuntu and install it using VirtualBox. Once you have the
basic installation, take a little time to gain some hands on familiarity
with the terminal, using the package manager to install packages, and using
a text editor.

Here are some resources which can help you getting started:

### Giving LaTex a try

LaTex is a document preparation system for high-quality typesetting and is
widely used for writing scientific papers, technical manuals, and even
creating presentations *(more on this later)*. *TeX Live* is an easy way to get up
and running with LaTex on your Linux system. A file written in LaTex is
usually saved as a *.tex* file. Once you have created your
*.tex* file, you will compile the file using the *latex*
command, which gives you a *dvi* file, which you can convert to a PDF
file using a tool like *dvipdf*. You can also directly get a PDF
output using the tool *pdflatex*.

You can edit LaTex in your text editor of choice. The widely used
*VIM* and *Emacs* both support LaTex editing with
syntax-highlighting, code-completion, etc.

The best way to get started with LaTex is by getting started. Start your text editor, type in some Latex and see the results. Here are some great resources to help you master it:

- The (Not So) short introduction to LaTex
- Text Processing using LaTex
- Draw the LaTex symbol and get the corresponding command

### gnuplot

If you are doing experimental work, there is a fair chance that you are
going to present your results in a graphical plot. Please welcome *gnuplot*.
It is a command-line driven graphing utility for creating 2-dimensional as
well as 3-dimensional graphical plots. It supports output to many file
formats with *eps* and *fig* being of special interest to the
researcher. The easiest way to install *gnuplot* is using the package
manager of you Linux distribution.

Here are a couple of links to help you get started with
*gnuplot*:

#### Calling *gnuplot* from your C program

A cool way to use *gnuplot* is to use it to display continuously
changing data that your program may be generating. For this example,
let’s assume that you have a program written in the C programming
language and you want to send the data its generating to *gnuplot*.
The simplest way to do this is to use the *popen* system call. Please
refer to this
*2-cent tip* published here sometime back.

There is also a ANSI C interface to *gnuplot* available here, but I haven’t personaly
used it.

### GNU Octave

GNU Octave (referred to as Octave from now on) is a software tool in- tended for numerical computation. Octave has extensive tools for solving common numerical linear algebra problems, finding the roots of nonlinear equations, integrating ordinary functions, manipulating polynomials, and integrating ordinary differential and differential-algebraic equations. Octave is available for use on Linux, Solaris, Mac OSX, and Windows. If your study or play involves numerical computations, Octave is for you. Octave has been designed with MATLAB compatibility in mind, so with careful design you could write scripts which run on both MATLAB and Octave. It is easily extensible and customizable via user-defined functions written in Octave’s own language, or using dynamically loaded modules written in C++, C, Fortran, or other languages.

Octave is most definitely available in the package repository of your Linux distribution and that is the easiest way to get it.

GNU Octave comes with extensive documentation, which is the best place to get started.

In case you happen to like this author’s writings, he is currently writing an article series on Octave elswhere. Please refer to these blog post for PDFs of the articles.

### Add Python to your toolkit

I shall not use this section of the article for Python language advocacy. Rather, I shall point you to some ways you may consider using Python in your research, if you are already using it, or think of using it in the future as just another of your many experiments. Python is already installed on most of the Linux distributions. Let us now see some of the ways you can use Python:

#### As a scientific calculator

With its *math* module, Python
can be a very useful command-line scientific calculator. For
example:

$ python Python 2.6.4 (r264:75706, Dec 7 2009, 18:45:15) [GCC 4.4.1] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import math >>> math.factorial(10) 3628800 >>>

As you can guess, you will have to import the math module every time you want to use it. You can avoid doing this by automatically importing it at startup:

- Create a file: .pythonrc in your $HOME and place this line:
*import math* - Now in your .bashrc or similar:
export PYTHONSTARTUP= $HOME/.pythonrc

Now, every time you start Python interactively, you should have the
*math* module already imported.

$ python Python 2.6.4rc1 (r264rc1:75270, Oct 10 2009, 02:40:56) [GCC 4.4.1] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> math.pi 3.1415926535897931

The *decimal* module is
another interesting module which provides support for decimal fixed point and
floating point arithmetic. *fractions* is
another interesting module, which can be used for rational number
arithmetic.

The beauty is that you do not need to know the Python *programming
language* for using these modules. Just fire up the Python interpreter
and import the module(s) you want to use, then call the appropriate
function.

#### Scientific libraries

If you are already conversant with Python, then you would definitely like to know about some scientific libraries which you may find useful in your research:

**NumPy and SciPy**:*NumPy*is an extension to the standard Python programming language, adding support for large multi-dimensional arrays and matrices.*SciPy*builds upon*NumPy*to provide various mathematical functions for Integration, Optimization, Interpolation, Linear Algebra, Statistics, and others.**matplotlib**: This is a plotting library for Python, which is also used by a lot of other Python scientific tools for plotting data.**NetworkX**:NetworkX is a Python library for the creation, manipulation, and study of complex networks. Even if you do not have the need to build very complex networks, it is fun to play around with it to create simple graphs too.

### Beam it with Beamer!

You have done your experiments, sent in your findings, they've
been accepted, and now you are going to present them at a
conference. You want to prepare some slides to give you company as you talk
about your work. OpenOffice.org Impress, KOffice? How about taking a look at
something different? You will surely like the results, I promise. *Beamer* is a LaTex class
for making presentation slides. In LaTex parlance, it is just another
Document Class. You shouldn’t even make an attempt to use Beamer, if
you have no idea what LaTex is. However, if you already use LaTex, chances
are you swear by it, then Beamer is for you.

The easiest way to install is to use the package manager in your Linux distribution. You will see that some more packages providing other extra LaTex classes need to be installed. Prominent among these are ’latex-xcolor’ and ’pgf’.

There is, of course, a way to install the LaTex class manually. It's easy to do it, but not easier than the previous method. The LaTex Beamer user guide shows how to do it.

With the danger of being pulled up by the chief editor for self-publicity,
the author would like to point to his article *Typesetting Presentations with
Beamer*, published elsewhere. The PDF copy is available from here.

### Others you may find useful

In this section, I shall talk about some of the other tools, which are useful for scientific computing that I know of:

**Sage**:*Sage*is an open-source mathematics software*system*which brings together a huge number of open-source mathematical libraries and tools under one common umbrella. In their own words, the Sage community is building a car, and not reinventing the wheel. The Sage tutorial is a good place to start exploring it.**Xfig**: Xfig is a drawing tool which may be used to create figures, import and export figures in various other formats.*fig*is the native format that Xfig uses to save its images. A useful way to use*Xfig*that I have learnt is to export your plots from*gnuplot*as*.fig*images, open them in*Xfig*to add the relevant text/labels to your plots, then export the plot in a format you desire.**GNU Scientfic Library**: The GNU Scientific Library is a C/C++ numerical library providing support for Linear Algebra, Complex Numbers, Numerical Differentiation and others.

### The small ones

One category of tools that I haven't talked about yet is some of the Linux
utilities, the ones that ship along with the Linux system- *Shell
scripting*, *awk* and *sed*. My knowledge of each is very
limited, but I have a working knowledge of them all and I can get things done
with them, which is what matters a lot of times.

**Shell scripting**: Shell scripting is very useful for running batch jobs- fire up a job and it should finish without requiring any intermediate user intervention. Its all the more useful when you have to run an experiment of yours a large number of times. Writing a shell script is often as simple as just writing five separate commands into one file in a*for*loop and running that file. Ofcourse, that's oversimplification, but sometimes it is*that*simple to setup a batch job. Do not get put away by the title, but the Advanced Bash-Scripting guide is a good place to get started with scripting for BASH shells.**awk**: The only way I have used*awk*so far (and a lot of times) is for text extraction from huge files containing numerical and textual data. Even with this limited use, it can be a useful tool when you want to extract text that is stored in a tabular format. The GNU Awk User's Guide is the place to get started to explore*awk*.**sed**:*sed*is a stream editor and is useful to carry out batch editing as well as editing of individual text files. This three-part series on sed is very comprehensive treatement of the features of*sed*.

### Conclusion

Its often the case that we do not know of a great tool which would simplify our work as we go about our research work everyday. My hope as a novice researcher with this article is to acquaint the kind of software tools available in the Open Source world which are very useful, capable and fun to work with and we have looked at some of these today. I have made no attempt to be exhaustive here. There are a lot of other tools out there, which I haven’t yet used, so I have refrained from writing about them. I hope you have enjoyed reading this article as much as I have writing this one!

### Acknowledgements

Credits are due to the folks on the mailing lists and the documentation of all the projects that I have talked today, because they are the source of my learning.

This document was translated from L^{A}T_{E}X byH^{E}V^{E}A.

Talkback: Discuss this article with The Answer Gang

*
The author is a part-time freelance technical writer from India and he
writes on all things Linux and Open-Source-y.
*