5.3 Newline Handling

Within the database, filenames are terminated with a null character. This is the case for both the old and the new format.

When the new database format is in use, however, the compression technique used to generate the database relies on the list of files being sorted before it is passed to frcode.
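To illustrate why sorting matters for compression, the following minimal sketch measures how many leading characters each filename shares with the one before it. It assumes a simplified model of prefix sharing and does not reproduce the actual frcode output format; the function name `shared_prefix_total` and the sample paths are hypothetical.

```python
import os

def shared_prefix_total(names):
    """Sum the length of the prefix each name shares with its predecessor.

    Simplified model only: a larger total means more characters that a
    prefix-sharing encoder could avoid storing again.
    """
    total = 0
    previous = ""
    for name in names:
        # commonprefix compares character by character, not by path component.
        total += len(os.path.commonprefix([previous, name]))
        previous = name
    return total

# Hypothetical sample paths, deliberately out of order.
names = ["/usr/lib/libc.so", "/etc/passwd", "/usr/lib/libm.so", "/etc/hosts"]

unsorted_total = shared_prefix_total(names)
sorted_total = shared_prefix_total(sorted(names))
```

In this model the sorted list shares far more leading characters between neighbouring entries than the unsorted one, which is why the input needs to be sorted first.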

If the system's sort command allows its input list of files to be separated with null characters via the -z option, this option is used and therefore updatedb and locate will both correctly handle filenames containing newlines. If the sort command lacks support for this, the list of files is delimited with the newline character, meaning that parts of filenames containing newlines will be incorrectly sorted. This can result in both incorrect matches and incorrect failures to match.
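The effect of the delimiter can be sketched as follows. This is an illustration only, not the code used by updatedb: the helper `sort_delimited` and the sample filenames are hypothetical, and Python's `sorted` stands in for the system's sort command.

```python
def sort_delimited(names, delimiter):
    """Join names with a delimiter, then split and sort the records.

    This mimics passing a delimited stream to a sort program, which sees
    only records and has no knowledge of the original filenames.
    """
    stream = delimiter.join(names)
    return sorted(stream.split(delimiter))

# Hypothetical input: the first filename contains an embedded newline.
names = ["/tmp/b\nfile", "/tmp/a"]

# Newline-delimited: the embedded newline splits one filename into two records.
newline_records = sort_delimited(names, "\n")

# Null-delimited: each filename survives as a single record.
null_records = sort_delimited(names, "\0")
```

With newline delimiters the two input filenames become three records, and the fragment `file` is sorted as if it were a filename of its own. With null delimiters both filenames come back intact.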

On the other hand, if you are using the old database format, filenames with embedded newlines are not correctly handled. No technical limitation enforces this; the bigram program has simply not been updated to support lists of filenames separated by nulls.

So, if you are using the new database format (the default) and your system uses GNU sort, newlines will be handled correctly at all times. Otherwise, newlines may not be handled correctly.