stdin, stdout, stderr

This is an elaboration on the redirection section of the Bash manpage. The manpage for the 1971 version of the UNIX sh is also helpful. That document predates stderr, but includes remarks that anticipate it. There is also a stdin(3) manpage, which covers the libc side of things: stdin, stdout and stderr as stdio streams.

Command I/O redirection is a very powerful and important aspect of the design of Unix. It is crucial to the modularity of commands, and to the design idea that "everything is a file". Along with pipes, it is the glue the user connects the modules with. As I begin this page, however, I personally lack a satisfactory understanding of the workings of <<, >& and so on. Let's see if we can fix this.

Basically every process in unix is assumed to have three standard files opened when it is run. A process keeps track of what files it has open with integers called file descriptors. These file descriptors are known to the process and to the shell. Every process is assumed to have file descriptors 0, 1 and 2. 0 is usually standard input, 1 is usually standard output, and 2 is usually standard error. The default for all 3 of these is the terminal device special file the process is associated with. An interactive shell gets user input from the terminal "on file descriptor 0", sends output such as the output of internal commands to the terminal via stdout "on file descriptor 1", and sends meta-information such as the shell prompt and error messages to stderr "on file descriptor 2".
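To make the numbering concrete, here is a minimal sketch in plain sh. The function name speak and the file names out.txt and err.txt are made up for this illustration, and I'm assuming mktemp is available to provide a scratch directory. It writes one line on descriptor 1 and one on descriptor 2, then sends each descriptor to its own file.

```shell
# Work in a throwaway directory so no real files get overwritten.
dir=$(mktemp -d)

# A hypothetical function that uses both output descriptors.
speak() {
    echo "normal output"        # written to file descriptor 1 (stdout)
    echo "error message" >&2    # written to file descriptor 2 (stderr)
}

# Send each descriptor to its own file.
speak > "$dir/out.txt" 2> "$dir/err.txt"

out=$(cat "$dir/out.txt")
err=$(cat "$dir/err.txt")
echo "descriptor 1 carried: $out"
echo "descriptor 2 carried: $err"

rm -rf "$dir"
```

Run without the redirections, both lines would land on the terminal, since that is where both descriptors point by default.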

Processes run under the control of the shell that spawns them. This control includes the shell's ability to redirect the source and destination of file input-output to and from the process. The input and output of a process may be redirected with <, >, >>, <<, >&, and others I won't address here. All of these accept a numerical prefix naming the file descriptor they are to act upon. >& also takes a descriptor number after it, as in 2>&1, which makes descriptor 2 a copy of descriptor 1.
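As a sketch of the numerical specifiers at work, assuming a made-up function named both and a scratch file all.txt: the first redirection points descriptor 1 at a file, and 2>&1 then makes descriptor 2 a copy of descriptor 1, so both lines should end up in the same file.

```shell
dir=$(mktemp -d)

# A hypothetical function that writes one line on each output descriptor.
both() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Descriptor 1 goes to the file; descriptor 2 is then made a copy of 1.
both > "$dir/all.txt" 2>&1

# Count the captured lines (tr strips the padding some wc versions add).
lines=$(wc -l < "$dir/all.txt" | tr -d ' ')
echo "lines captured: $lines"

rm -rf "$dir"
```

The redirections are processed left to right, which is why the > comes before the 2>&1 here.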

This means a shell command line is a set of instructions to the command AND to the shell that controls the command. The shell also does wildcard expansion and other macro processing before the command gets to run. This is very complicated, so let's take an extremely simple case that only deals with the redirection aspect.
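One way to see that a redirection is an instruction to the shell and not to the command is to count the arguments the command actually receives. This is a small sketch with a hypothetical function count_args; the redirection should not show up in its argument list.

```shell
# Print how many arguments were received.
count_args() {
    echo "$#"
}

# The shell handles "2> /dev/null" itself; count_args only sees "one" and "two".
n=$(count_args one two 2> /dev/null)
echo "arguments seen: $n"
```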

Let's consider a command that does nothing. To make sure it does nothing, we'll write it ourselves. We can edit a file called null.c with contents consisting of nothing but...

int main(){}

That's less C code than "Hello World", the stock example of a minimal program. null.c doesn't even print anything. All the main() does is give it an entry point to do the contents of {}, namely nothing. Without main() it can't be compiled and linked into a command. Libraries and the kernel don't have main() because they aren't commands.

gcc null.c

will produce a 4k executable named a.out that does nothing. What it will do, however, is accept redirection directives. (If the current directory is not in your PATH, type ./a.out instead of a.out.) None of the following idiocies produce any error messages from Bash:

a.out < /bin/lynx
a.out 2< /lib/libc.so.5
a.out 7< /lib/libc.so.5
a.out >> BLAH
a.out 9>> BLAH
a.out > BLAH
a.out 9> BLAH
But be careful. Outputting nothing to a file is definitely doing something. In the last two cases the pre-existing contents of BLAH get erased. I just did exactly that to null.c by accident. Good thing it wasn't /bin/lynx. There is in fact an existing null command that's handy when that's what you want to do: the Bash : builtin.
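The difference between appending nothing and overwriting with nothing can be checked safely with the : builtin in a scratch directory. This is only a sketch; the directory comes from mktemp (assumed to be available) and the variable names are my own.

```shell
dir=$(mktemp -d)

# Appending nothing leaves the existing contents alone.
echo "precious contents" > "$dir/BLAH"
: >> "$dir/BLAH"
after_append=$(cat "$dir/BLAH")

# Overwriting with nothing empties the file.
: > "$dir/BLAH"
after_truncate=$(cat "$dir/BLAH")

# The same happens when the descriptor is one the command never writes to.
echo "precious contents" > "$dir/BLAH"
: 9> "$dir/BLAH"
after_fd9=$(cat "$dir/BLAH")

echo "after >>: '$after_append'"
echo "after >:  '$after_truncate'"
echo "after 9>: '$after_fd9'"

rm -rf "$dir"
```

The file is emptied by the shell when it opens it for writing, before the command has written, or failed to write, anything at all.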

I would describe the above command lines as follows....
a.out < /bin/lynx
Do a.out using input on file descriptor 0, the default stdin, from the file /bin/lynx
a.out 2< /lib/libc.so.5
Do a.out using input on file descriptor 2 from the file /lib/libc.so.5
a.out 7< /lib/libc.so.5
Do a.out using input on file descriptor 7 from the file /lib/libc.so.5
a.out >> BLAH
Do a.out, sending its output, by default from file descriptor 1, stdout, to the end of the file BLAH
a.out 9>> BLAH
Do a.out, sending its output from file descriptor 9 to the end of the file BLAH
a.out > BLAH
Do a.out, sending its output, by default from file descriptor 1, stdout, to the "beginning" of the file BLAH, thus overwriting BLAH
a.out 9> BLAH
Do a.out, sending its output from file descriptor 9 to the "beginning" of the file BLAH, thus overwriting BLAH

The point I'm trying to establish is that there is some sense to the syntax, and it's clearer if you're aware of the numerical file descriptor aspect. The defaults make more sense when you see what they are defaults for.
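The numbered forms become less abstract with a command that really does use the extra descriptors. Here is a sketch with a hypothetical function relay that reads from descriptor 7 and writes to descriptor 9; the file names are invented and mktemp is assumed to be available.

```shell
dir=$(mktemp -d)
echo "hello from seven" > "$dir/in.txt"

# A hypothetical command that reads descriptor 7 and writes descriptor 9.
relay() {
    cat <&7 >&9
}

# The shell opens in.txt on descriptor 7 and out.txt on descriptor 9 for relay.
relay 7< "$dir/in.txt" 9> "$dir/out.txt"

copied=$(cat "$dir/out.txt")
echo "out.txt now holds: $copied"

rm -rf "$dir"
```

Drop the 7 and the 9 from the last command line and you are back to the familiar defaults, 0 for < and 1 for >, except that relay would then find nothing open on the descriptors it expects.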

Also, it appears that redirection is somewhat laissez-faire between the shell and the command. We're opening files for a.out on file descriptors it never uses, and nobody is complaining. The shell opens the file on the requested descriptor, a.out ignores it, and the descriptor is closed when a.out exits. This is an absurd case, however. The defaults are set up like they are because almost everything useful does have some input, output and error messages. In fact, an strace of our example a.out shows that it opens library linking stuff on file descriptor 3, implying that descriptors 0, 1 and 2 are already in use. In other words, it does have stdin, stdout and stderr, they just don't do anything. They're quite standard: a.out inherits them from the shell that starts it, so even null.c gets them.
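The laissez-faire behavior runs in one direction only, as far as I can tell. A rough sketch of the contrast: handing a command a descriptor it never touches is harmless, while a command that tries to use a descriptor nobody opened does get an error. The variable names are my own, and the subshell closes descriptor 7 first so the result doesn't depend on what the surrounding environment left open.

```shell
# Open /dev/null on descriptor 7 for a command that never reads it.
true 7< /dev/null
ignored_status=$?

# In a subshell, close descriptor 7 and then try to write to it.
if ( exec 7>&-; echo "hello" >&7 ) 2> /dev/null; then
    closed_result="worked"
else
    closed_result="failed"
fi

echo "unused descriptor 7: exit status $ignored_status"
echo "write to closed descriptor 7: $closed_result"
```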

Here's some more acceptable idiocy.....
$a.out <<DUH
> dlfk vdkf vldkfv dflkv dl fvfv
> dlkf vldkfvld vfvd fvldk vdlf v
> DUH
$
That's an interactive "here document" sent to a.out, which does nothing with it. To keep with my point about defaults, it works like this too...

$a.out 4<<DUH
This sends the current input to the shell, my typing, to file descriptor 4 of a.out, until a line consisting of just DUH is encountered.
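Since a.out throws the here document away, here is a sketch with commands that actually read it, first on the default descriptor 0 and then on descriptor 4. The function read_four and the file names are hypothetical, and mktemp is assumed to be available.

```shell
dir=$(mktemp -d)

# Default case: the here document arrives on descriptor 0, which cat reads.
cat <<DUH > "$dir/zero.txt"
text for descriptor 0
DUH

# A hypothetical command that reads descriptor 4 instead of descriptor 0.
read_four() {
    cat <&4
}

# Numbered case: the here document arrives on descriptor 4.
read_four 4<<DUH > "$dir/four.txt"
text for descriptor 4
DUH

zero=$(cat "$dir/zero.txt")
four=$(cat "$dir/four.txt")
echo "descriptor 0 delivered: $zero"
echo "descriptor 4 delivered: $four"

rm -rf "$dir"
```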