DOC DATE 19991229

WHAT

cLIeNUX implements what I call the Dotted Standard Filename Hierarchy (DSFH). Each main directory in the root directory of a cLIeNUX filesystem, /, is represented by a verbose and carefully chosen visible name and a dot-prefixed symlink to it. For example, the actual name of the directory tree that contains most cLIeNUX user commands is "/command", and "/.bi" is a symlink to that. This is quite off-putting to people already familiar with the traditional UNIX directory names, but implementing this was a lot of work, and I have reasons for inflicting something so non-standard on myself. I had wanted to do something like this for a long time, and waited until a simple implementation method occurred to me. The support needed for the DSFH implemented this way was less than I expected, which I take as a sign that it wants to be like this. The DSFH is the actual directory structure, a simple kernel patch, a substantial edit of the system paths.h file, and some simple conversion scripts to run on source code packages and so on. Unfortunately, because this impinges on everything a user sees and interacts with, a very long explanation is called for, for what is a small amount of simple code.

There is an existing filesystem naming standard in the Linux world. The DSFH is totally non-conformant with it, on first pass, but is designed in such a way that standard packages can be automatically converted to the DSFH. A filesystem naming standard exists mostly so that applications can assume certain things about the locations of things on the system they are being built on. Even the kernel needs to know exactly where to find an init or shell program at boot-up, and there is also a kernel dependency on the exact pathname of a modules utility. The advantages I see for the DSFH are such that I submitted a patch to the kernel developers (which was not accepted, yet :o)) to support DSFH names in the kernel so one could boot to /.sbi/init, which is what cLIeNUX boots to. I therefore maintain a slightly non-standard kernel for cLIeNUX. The cLIeNUX DSFH kernel patch is quite trivial code-wise, but necessary for the DSFH. The crucial part is 2 lines of very simple C with no impact that I have seen on any other kernel functionality. The "Standard" in DSFH is because the dotted names are designed such that they can be generated or converted simply and consistently from the actual standard names by scripts run on source packages, and even binaries and libraries.

RATIONALE

WHY???... Because "sbin", "bin", "etc" and "usr" are meaningless, particularly given what they each supposedly "represent" in a typical unix filesystem. Because machines, unlike humans, can handle names like "/.tm" just as well as "/scratch". Because machines exist to do things like figure out what "/.sbi" means; that's why people use computers. Because naming is generally terrible throughout the unix world. Because your shell probably has filename completion. Because your terminal is faster than 300 baud and isn't steadily consuming trees. Because all filenames "plain users" see should be in the user's native language and character set. Because "normal users" do ls / , and do so a lot on most Linux boxes. Because a unix shell prompt is more powerful than a GUI. Because lucid naming is the best documentation.

The DSFH creates two names for each of the most important subdirectories of /, and for some other important directories. The dot-prefixed name, a symlink, is for the machine; the visible name it points to is for the user. ls and other ubiquitous unix directory utils don't display filenames beginning with a dot by default. This means DSFH directories have both a "visible" and an "invisible" name. This is analogous to the infrastructure of most technologies people have used for centuries, such as architecture, where things humans need to interact with regularly are made visible, convenient, simple, familiar and aesthetic, and the utilities are hidden, but accessible to those who maintain them. Install scripts, shell scripts, and binaries that call specific files by full pathname see "/.et"; the user, doing an ls /, sees "/configure". If there were a DSFH system in Hindi, the user would see the Hindi translation of "configure", but her applications would see the same "/.et". Hindi is just picked out of the blue, and I'm assuming here that a unix filesystem can have filenames that represent Hindi words, but I suspect even Chinese could be supported. This internationalization potential at such low cost is a very big win. Also, with the DSFH the internationalization occurs at the right level, the user level. The infrastructure and internals of a unix system will continue to emphasize English, but internationalizing userland reduces the internationalization pressure on the infrastructure of the system, i.e. it reduces the need for people to re-write EVERYthing down to C "int" in their native language. Do you want to try to import a cryptography utility package written in a Mandarin version of C?

For a binary to expect a file at a specific full pathname is bad practice. Many do so, however. Such sinful works can be converted to the DSFH, usually with one command per package. However, I have no problem with creating a slight added annoyance for apps that want "/lib/cpp" exactly, for example.

cLIeNUX differs from other unices in many respects, and the DSFH helps reduce the impact of those other differences. The commands that are usually in /sbin in traditional layouts are in /command/background in cLIeNUX. However, /.sbi is a symlink to that directory. Occurrences of /sbin in a standard package get converted to /.sbi, so installs and such that are looking for a top level directory find one. This allows cLIeNUX to keep all commands somewhere under /command, while allowing imports to find everything. In other words, the symlink in a DSFH dirname means traditional root-level directories can actually reside in DSFH subdirectories. Similarly, cLIeNUX tries to not enforce C as the only system language. cLIeNUX is based on ideas from the Forth programming language, and a cLIeNUX script can assume a Forth is present. That means that "include" isn't necessarily C includes, and has other naming ramifications. The traditional /usr/include is /source/C/include in cLIeNUX, courtesy of DSFH symlinks. In other words, the DSFH adds flexibility to something that is otherwise far too rigid.

"YEAH, BUT IT'S NOT UNIX!"

It's not unix frozen at some particular point in time, which is a more "not-unix" idea than the DSFH. It's only "not unix" to the extent that it's post-unix. A DSFH-patched kernel breaks nothing, and the kernel patch rejected by Linus Torvalds looks for init and sh in the traditional full pathnames before trying the DSFH versions. You can in fact probably have more than one Linux on the same partition with a DSFH and a regular setup, but that's a bit weird even for me. I only mention that to emphasize that the DSFH breaks nothing.

It's not Dos either. That's a common knock; that I should run Dos if I want something like this. Well, if you think that cLIeNUX or the DSFH is Dos that's your problem.

HOW

Here's a clean cLIeNUX DSFH ls -F / and an ls -aF /...


	ABOUT.ABOUT   README        dev/          log/          scratch/      user/
	ABOUT.kernel  cLIeNUX.jpg   floppy/       lost+found/   source/
	CD/           command/      help/         mount/        subroutine/
	COPYING       configure/    kernel/       owner/        suite/


	./            .mn@          .us@          README        help/         scratch/
	../           .pro@         .va@          cLIeNUX.jpg   kernel/       source/
	.bi@          .roo@         ABOUT.ABOUT   command/      log/          subroutine/
	.bootlog      .sbi@         ABOUT.kernel  configure/    lost+found/   suite/
	.et@          .terminfo/    CD/           dev/          mount/        user/
	.li@          .tm@          COPYING       floppy/       owner/

and here's file .??* on the above...


	/.bi:       symbolic link to command/
	/.bootlog:  ASCII text
	/.et:       symbolic link to configure/
	/.li:       symbolic link to subroutine/
	/.mn:       symbolic link to mount
	/.pro:      symbolic link to kernel/
	/.roo:      symbolic link to owner/
	/.sbi:      symbolic link to /command/background/
	/.terminfo: directory
	/.tm:       symbolic link to scratch/
	/.us:       symbolic link to suite/
	/.va:       symbolic link to log

Note that the .names are created from the standard names by a simple and regular conversion. You take the stock name, "var" for example, prefix a period and drop the last character. The resulting /.va/ in cLIeNUX is then symlinked to /log/, but imported packages don't need to know that. /log/ could be Latvian. There's a one-to-one correspondence between Linux Filesystem Standard names and DSFH names, if a DSFH equivalent exists.
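
The conversion rule above is small enough to sketch in C. This is only an illustration of the rule as stated (prefix a period, drop the last character); dsfh_name is a hypothetical helper name, not something taken from cLIeNUX or its scripts.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Derive a DSFH name from a traditional top-level directory name by
 * prefixing a period and dropping the last character: "var" -> ".va".
 * The result is always the same length as the input.
 * Returns 0 on success, -1 if the name is shorter than two characters
 * or the output buffer is too small.
 * (dsfh_name is a hypothetical helper, for illustration only.)
 */
int dsfh_name(const char *stock, char *out, size_t outsize)
{
    assert(stock != NULL && out != NULL);

    size_t len = strlen(stock);
    if (len < 2 || outsize < len + 1)
        return -1;

    out[0] = '.';                    /* the added prefix            */
    memcpy(out + 1, stock, len - 1); /* everything but the last char */
    out[len] = '\0';
    return 0;
}
```

Fed "sbin", "etc" and "usr", this yields ".sbi", ".et" and ".us", which match the symlink names in the listing above.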

Now here's the sick hack part; a DSFH name also has the same number of characters as its LFS parent. That means a simple ed script can be used to convert *binaries* to the DSFH.
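
To show why equal length matters, here is a minimal C sketch of a same-length, in-place substitution over a byte buffer. It is not the ed script cLIeNUX uses; patch_same_length is a hypothetical name, and a real conversion would have to read and write the file around it. Because the replacement is exactly as long as the original, no byte after a match moves, so offsets stored elsewhere in a binary stay valid.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Replace every occurrence of `from` with `to` in buf[0..len), in place.
 * Both strings must have the same length, so the buffer never grows or
 * shrinks and nothing after a match is shifted.
 * memcmp/memcpy are used so that NUL bytes in the buffer are harmless.
 * Returns the number of replacements made.
 * (patch_same_length is a hypothetical name, for illustration only.)
 */
size_t patch_same_length(unsigned char *buf, size_t len,
                         const char *from, const char *to)
{
    assert(buf != NULL && from != NULL && to != NULL);

    size_t n = strlen(from);
    assert(n > 0 && n == strlen(to));

    size_t count = 0;
    size_t i = 0;
    while (i + n <= len) {
        if (memcmp(buf + i, from, n) == 0) {
            memcpy(buf + i, to, n);
            count++;
            i += n;   /* skip past the text just rewritten */
        } else {
            i++;
        }
    }
    return count;
}
```

Calling it with "/bin/" and "/.bi/" on the contents of a file rewrites embedded pathnames such as "/bin/sh" to "/.bi/sh" without changing the file size.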

These are the two lines the kernel needs (marked with an asterisk), in ../linux/init/main.c, to boot to a DSFH box, with some context, and some other lines that wouldn't be needed if the DSFH were in the main kernel distribution, and some other edits I did "while I was at it"...


        printk("cLIeNUX Dotted Standard File Hierarchy supported\n");
        execve("/sbin/init",argv_init,envp_init);
        execve("/etc/init",argv_init,envp_init);
        execve("/bin/init",argv_init,envp_init);
        execve("/bin/sh",argv_init,envp_init);
*        execve("/.sbi/init",argv_init,envp_init);
*        execve("/.bi/sh",argv_init,envp_init);
        printk("Couldn't find init or shell.\n");
        printk("Your bootloader may accept an  init= argument to tell ");
        printk("it what to run.\n");
        panic("Sorry. This is the last snag in the kernel though.");
}
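
The fall-through order in the excerpt above (traditional pathnames first, DSFH pathnames last) can be imitated in userland. The sketch below is an approximation under stated assumptions: the kernel simply calls execve() on each path in turn, since execve() only returns on failure, whereas this sketch probes with access(), which also reports success for searchable directories. first_executable and dsfh_init_candidates are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Same order as the kernel excerpt: traditional names, then DSFH names. */
const char *const dsfh_init_candidates[] = {
    "/sbin/init", "/etc/init", "/bin/init", "/bin/sh",
    "/.sbi/init", "/.bi/sh",
    NULL
};

/*
 * Return the first path in a NULL-terminated list that access() reports
 * as executable by the caller, or NULL if none is.
 * (Hypothetical helper; the kernel itself does not work this way.)
 */
const char *first_executable(const char *const *candidates)
{
    assert(candidates != NULL);

    for (size_t i = 0; candidates[i] != NULL; i++) {
        if (access(candidates[i], X_OK) == 0)
            return candidates[i];
    }
    return NULL;
}
```

On a traditional system first_executable(dsfh_init_candidates) would stop at one of the first four entries; on a pure DSFH box only the last two can match, which is why the patch breaks nothing on a stock layout.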


The scripts that convert a standard package to the DSFH are ed scripts, which do the name translations explained above on every file in a directory, and also look for package subdirectories named "lib" and convert those, as is needed in the ncurses package, for example.

Here's a piece of cLIeNUX's path.h file, /source/C/include/path.h...


	/* Default search path. */
	#define _PATH_DEFPATH           "/.sbi:/.bi"
	#define _PATH_DEFPATH_ROOT      "/.bi:/.sbi:/.bi/alias:/.bi/trad:"

	#define _PATH_BSHELL    "/.bi/sh"
	#define _PATH_CONSOLE   "/dev/console"
	#define _PATH_CSHELL    "/.bi/csh"
	#define _PATH_DEVDB     "/.va/run/dev.db"
	#define _PATH_DEVNULL   "/dev/null"
	#define _PATH_DRUM      "/dev/drum"
	#define _PATH_HEQUIV    __PATH_ETC_INET"/hosts.equiv"
	#define _PATH_KMEM      "/dev/kmem"
	#define _PATH_MAILDIR   "/.va/spool/mail"
	#define _PATH_MAN       "/.us/man"
	#define _PATH_MEM       "/dev/mem"
	#define _PATH_LOGIN     "/.bi/login"
	#define _PATH_NOLOGIN   "/.et/nologin"
	#define _PATH_SENDMAIL  "/.sbi/putmail"
	#define _PATH_SHELLS    "/.et/shells"
	#define _PATH_TTY       "/dev/tty"
	#define _PATH_UNIX      "/boot/active_kernel"
	#define _PATH_VI        "/.bi/ed"
	#define _PATH_PRESERVE  "/.va/preserve"

(Hmmmm, /dev/drum?)

Some have argued that things like the DSFH patch are best left to bootloaders. I disagree. Two lines of kernel code is the simplest method, and thus clearly the best. I frequently don't use a bootloader.

DOWNSIDES

The cluster of .names in / is unfortunate, but it can't be anywhere else. I doubt that's true of the many .names in / in some distros. There's a performance hit too, if you go through the symlink, but you're already doing a syscall if you are talking to a file, so I doubt that the performance hit is significant. And if you're in a hurry you probably don't need the symlink.

GREMLINS

None confirmed. The scripts that convert imports to the DSFH have a minuscule chance of incorrectly modifying binary object code if said scripts are run on binaries. I ran these scripts on everything but the kernel to convert over, including my libc5 binary and so on, and I'm not aware of any bugs generated that way, after several months of use. The scripts look for e.g. "/bin/" in a file and convert it to "/.bi/", and the chance of "/bin/" occurring as random bytes at any given offset, i.e. in binary machine code, is one in 256 ^ 5. A detailed analysis of the actual probabilities with the several converted names and some cognizance of the character frequencies in a certain CPU's machine code might give a somewhat less infinitesimal figure, but still a tiny one. If you can't take even that small a risk, rebuild from source.
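
For a rough feel of the numbers, the following C sketch computes the expected count of chance matches under the simplest possible assumption, namely that every byte of the file is uniformly random and independent. Real machine code is not like that, as noted above, so treat this as a back-of-envelope estimate only. expected_chance_matches is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Expected number of chance occurrences of one fixed pattern of
 * `pattern_len` bytes in `file_len` uniformly random bytes:
 *     (file_len - pattern_len + 1) / 256^pattern_len
 * ASSUMPTION: bytes are uniform and independent, which real object
 * code is not.
 */
double expected_chance_matches(size_t file_len, size_t pattern_len)
{
    assert(pattern_len > 0);

    if (file_len < pattern_len)
        return 0.0;

    double per_offset = 1.0;
    for (size_t i = 0; i < pattern_len; i++)
        per_offset /= 256.0;   /* 1 / 256^pattern_len */

    return (double)(file_len - pattern_len + 1) * per_offset;
}
```

For the five-byte pattern "/bin/", 256 ^ 5 is 1,099,511,627,776, so a file of about one megabyte has an expected count of roughly one in a million under this model.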

SOUR GRAPES

Linus Torvalds is needs-driven. Linux is the success it is because it was needed and timely; not because Linux itself is wildly original, which it isn't. I can hardly blame him for not including a patch that makes life slightly easier for just one of Linux's millions of users. Well, I can, but not with much enthusiasm. I think the DSFH is worth a pro-active move, a non-needs-based move, even to Linus, due to the internationalization aspects of it, but he thinks otherwise. So, such moves are left for the likes of myself. And you, perhaps. I'm promoting this in the hopes that people look at cLIeNUX, or adopt the DSFH in their Linux distribution or other unix or unix-offspring. I'm quite convinced that it's worth the hassle. It's not as bad as it looks, and sensible unix pathnames are very refreshing, even in English.

Copyright 19991229 Rick Hohensee
Hereby released into the public domain.