LINUX GAZETTE
[ Prev ][ Table of Contents ][ Front Page ][ Talkback ][ FAQ ][ Next ]

"Linux Gazette...making Linux just a little more fun!"


Using RPM: The Basics (Part I)

By Jose Nazario


This documentation is designed to serve as a brief introduction to the Red Hat Package Management system, or RPM. Part 1 will cover installation of RPM packages, while part 2 will cover building your own RPM packages for distribution. We will cover what RPM is, why you would want to use it, compare it to other packaging systems on Linux and UNIX, and how to get it. The bulk of the time will be spent on how to use RPM for installing, checking packages, and removal of installed packages. Neither section will cover the RPM API.

What is RPM?

The Red Hat Package Manager (RPM) is a toolset used to build and manage software packages on UNIX systems. Distributed with the Red Hat Linux distribution and its derivatives, RPM also works on any UNIX as it is open source. However, finding RPM packages for other forms of UNIX, such as Solaris or IRIX, may prove difficult.

Package management is rather simple in its principles, though it can be tricky in its implementations. Briefly, it means the managed installation of software, managing installed software, and the removal of software packages from a system in a simplified manner. RPM arose out of the needs to do this effectively, and no other meaningful solution was available.

RPM uses a proprietary file format, unlike some other UNIX software package managers. This can be problematic if you find yourself needing to extract one component from the package and you don't have the RPM utility handy. Luckily a tool like Alien exists to convert from RPM to other formats. It can be possible, through tools like Alien, to get to a file format you can manage using, say, tar or ar.

The naming scheme of RPM files is itself a standardized convention. RPMs have the format (name)-(version)-(build).(platform).rpm. For example, the name cat-2.4-7.i386.rpm would mean an RPM for the utility "cat" version 2.4, build 7 for the x86. When the platform name is replaced by "src", it's a source RPM.

Why Package Management?

At first glance you may say to yourself, "I can manage this myself. It's not that many components ..." In fact, for something as small as, say, cat, which has one executable and one man page, this may be so. But consider, say, KDE, which has a mountain of components, dependencies, and likes to stick them everywhere. Keeping track of it all would be tough, if not impossible.

Package management makes it all easier. By letting a program maintain the information about the binaries, their configuration files, and everything else about them, you can identify which ones are installed, remove them easily or upgrade them readily, as well.

Installation becomes a snap. You select what you want, and ask the system to take care of the dirty work for you. Unpack the program, ensure that there is space, place things in the right order, and set them up for you. It's great, it's like having a valet take care of your car when you go to a restaraunt. Dependencies, or additional requirements for a software package, are also managed seamlessly by a good package manager.

Management of installed packages is also greatly facilitated by a good package management system. It keeps a full list of software installed, which is useful to see if you have something installed. More importantly, it makes upgrading a breeze. Lastly, this makes verification of a software package quite easy to do. By knowing what packages are installed, and what the properties of the components are, you can quickly diagnose a problem and hopefully fix it quickly.

How Does RPM Compare to Other UNIX Package Systems?

I have had the (mis)fortune of working with many UNIX variants, and gaining some experience with their package formats. While I sometimes slag RPMs, in comparison to other UNIX software packaging formats I find it usually comes out on top for my needs. Here's a brief rundown on the pro's and cons of some of the other formats and tools:
 
Format Platform Pro Con
inst IRIX(SGI) great graphical installer amazingly slow, frequentl reboots after, no network installs (aside from NFS)
sw HPUX(HP) (are there any?), also supports net installs terribly slow
pkg BSD(many)  tarballs, net installs lack signatures, sums
? Solaris(SUN) (are there any?)  slow, lack signatures, sums
.deb Debian just ar's, easy to extract w/o tool  lacks signatures

In brief, my biggest complaint about RPM is the lack of a solid GUI interface to it. While a few exist (like gnorpm and glint), they lack the more complex features that SGI's Software Manager has. Overall, I find that RPM has better conflict handling and resolution than inst does, and is much, much faster. Hence, i'm willing to live without a strong GUI.

My biggest raves for RPM, however, are in speed and package checking, using both package signatures and the sums of the components. As an example, once I had to reboot an SGI just because I reinstalled the default GUI text editor (aka jot). It also took approximately 15 minutes to reinstall that small package, before the reboot.

RPM In a Nutshell

Much like a compressed tarball, RPM uses a combination of rolling together multiple files into one archive and compression of this archive to build the bulk of the RPM package. Furthermore, additional header information is inserted. This includes pre- and post-installation scripts to prepare the system for the new package, as well as information for the database that RPM maintains. Dependencies are checked before any installation occurs, and if the appropriate flags are set for the installation they are also installed.

It is this database that makes RPM work the magic that it does. Stored in there are all of the properties of the installed packages. Should this become corrupted, it can be rebuilt using the rpm tool.

The Hows of RPM

We will now focus on the three core actions of RPM discussed in this documentation. They include installation of new packages, management of installed packages, and package removal. We will begin at the beginning, and how to add a package using RPM.

Installation Using RPM

This is the most basic RPM function, and one of the most popular: the installation of new software packages using RPM. To do this, give rpm the -i flag and point it to an RPM:

        rpm -i (package)

This will install the package if all goes well and send you back to a command prompt without any messages. Pretty boring, and worse if you want to know what happened you're out of luck. Use the -v flag to turn on some verbosity:

        rpm -iv (package)

All that gets printed out is the package name, but no statistics on the progress or what it did. You can get a hash marked output of the progress is you use the -h flag. People seem to like using -ivh together to get a "pretty" output:

        rpm -ivh (package)

Again, this doesn't tell you much about what just happened, only that it hopefully did. Hence, I usually crank up the verbosity (-vv) when I install. This lets me see what's going on:

        rpm -ivv (package)

While the output usually scrolls, I can see exactly what happened and if any problems were encountered. Plus I get to see where stuff was installed.

Dependencies are handled pretty wisely by RPM, but this itself depends on a good package builder in the first place. I have seen packages that depend upon themselves, for instance, and some that seem to depend on packages that will break other things. Keep this in mind.

Sometimes RPM will whine about a dependency which is installed but isn't registered. Perhaps you installed it not using an RPM for the package (ie OpenSSL). To get around this, you can force it to ignore dependencies:

        rpm -ivv --nodeps (package)

Note that this isn't always wise and should only be done when you know what you are getting youself into. This will rarely break installed stuff, but may mean the installed package wont work properly.

On rare occassion RPM will mess up and insist that you have a package installed when you don't. While this is usually a sign that something is amiss, it can be worked around. Just force the installation:

        rpm -ivv --force (package)

Beware. Just like when you ignored the dependencies, forcing a package to install may not be wise. Bear in mind that your machine could burst into flames or simply stop working. Caveat emptor and all that.

This probably wins the award for one of the coolest features in RPM: network installations. Sometimes, you don't have network clients on a system but you need to install them via RPM. RPM has built in FTP and web client sowftare to handle this:

        rpm -iv ftp://ftp.redhat.com/path/package.rpm
        rpm -iv http://www.me.com/path/package.rpm

I don't think it can do SSL HTTP connections, though. Debian's packages can do this, as can BSD packages. I don't think most commercial tools can do this, though.

Managing Your Packages

OK, you know how to install packages. But let's say you want to work with some packages before you, either installed or not. How can you do this? Well, simply put, you can use package management features in rpm to deal with packages, ones that are installed already or ones that are not, to look inside of them. This can include verifying packages, too.

When you get a new package, you may want to examine it to see what it offers. Using query mode, you can peek inside. To simply query a package and get some generic information about it, just simply:

        rpm -qp (package)

This should bring just the name of the package. Pretty boring, isnt it? A much more useful method is to get the package information from the package itself:

        rpm -qip (package)

This will bring up the author, build host and date, whether it's installed yet, etc, about a package. Also included is a summary about the package's functionality and features.

All of this is nice, but let's say you want to see what is really inside the package, what files are inside of it? Well, you can list the contents of a package, much like you would get the table of contents of a tar archive (using tar -tvf):

        rpm -qlp (package)

This will list all of the files within the archive, using their full pathnames. I use this often to see what will be installed with a package, but most importantly where. I like to stick to conventions about putting stuff in their expected places, but some packagers do not. Lastly, to show all of the packages you have installed on your system, use:

        rpm -qa

This will bring up a list of packages installed on the current system. It may be useful to sort them (by piping through sort, rpm -qa | sort). Use these names when uninstalling packages (below).

One of my favorite things about RPM is how it can verify packages. This is useful in detecting a compromised machine, or a binary that may be missing or modified due to some error on your part. To verify one package, just point rpm at it with the -V flag:

        rpm -V (package)

This should bring up a brief description of wether or not the package checks out. To verify all packages installed on a system, it is quite simply:

        rpm -Va

Verify mode brings up several statistics about a file. Their shorthand is as follows:
 
5 MD5 sum
S File size
L Symlink
T Mtime (modification time)
D Device
U User
G Group
M Mode (includes permissions and file type)

Sometimes they're meaningless, for example if you modify your /etc/inetd.conf file it will show up as a different size and MD5 sum. However, some things shouldn't change, like /bin/login. Hence, rpm -Va can be a useful quick security check, suggesting which things need more peering into.

One of the great things about package management we stated at the outset was the ease with which upgrading can be performed. RPM has two sometimes confusing options to upgrading packages. The first is a simple upgrade:

        rpm -U (package)

What is confusing about it is it's resulting action if the package is not yet installed. If it finds it, the package is upgraded. If it doesn't find it, you are upgraded to it, meaning the package is installed. This can be scary sometimes if you don't mean to install a package and an update comes out, which you blindly follow. Because of this, I suggest using "freshen" on packages you only want to ensure you have the latest version of:

        rpm -F (package)

This will only update installed packages, and not install the package if it is missing.

Upgrades are done in an interesting fashion, too. The new version is first installed and the differences with the old version are noted. Then the older version is removed, but only the unique portions of it so as not to remove the new portions. Imagine if /usr/local/bin/netscape were upgraded and then removed, that would defeat the whole effort!

Uninstalling an RPM

You can install, upgrade, and manage, and hence you can definitely uninstall using RPM. The flag to use is -e, and many of the same conditions for installation apply for removal. To silently uninstall an RPM package, use:

        rpm -e (package)

Note that, unlike for installations and upgrades, package here refers not to package-version.i386.rpm, but instead to package-version. These are the values reported in query mode, so use those. You should be able to get all components of a package for removal by specifying the most generic, common part of the name, ie for linuxconf and linuxconf-devel, specify linuxconf. Dependencies can also be avoided:

        rpm -e --nodeps (package)

The same caveats apply here, as well, you could wind up breaking more stuff than you anticipated. Verbosity flags can also be added, just like in installations.

Some Notes about RPMs

Sometimes maintainers build rather stupid dependencies for their RPMs. Case in point, libsafe. It has a wierd dependency: itself. As such, I usually find I have to install --nodeps to get it to install properly. Other times a package will contain extra junk, and you could wind up installing more than you had planned for.

My biggest complaint are RPMs that have a name that doesn't suit the function of the pieces. While this can be gotten around by digging around using the query tools as described above, it's more time than I care to waste. Name your RPMs well, I suggest.

Getting RPM

You can get RPM for any Linux or UNIX variant, as it is Open Source. RPM comes native in Red Hat Linux, and some derivatives. Versions 3.0 and above are recomended for compatability, some stupid stuff went on before that that 3.0 hopes to fix. Version 4.0 reportedly has a different database format, and so I reccomend checking around for how to get around this issue before you upgrade to 4.0. I'm not sure if can simply rebuild the database in 4.0 to remedy this.

RPM is normally distributed as an RPM of itself. Cute, eh? Luckily, it also comes in a gzipped tarball and also in source form. I have RPM installed on Slackware, for example, and could install it on IRIX or Solaris if I so desired. It's nearly useless on non-Linux platforms as rarely are packages built in RPM for other UNIX variants.

Coming Next Time

In the upcoming second half of this documentation we will taking a look at how to build an RPM for yourself. We'll look at the spec files, layout in /usr/src and the building flags. It's pretty easy once you learn a few basics.

Resources

The best place to start is the online website for RPM, http://www.rpm.org/. There you will find the book 'Maximum RPM', which covers how to use, build, and even programming with the RPM API. The RPM help page (rpm -h) is also quite useful once you learn a few basics. To find RPM archives, check http://www.rpmfind.net/, which maintains a great searchable database of packages for various distributions and versions on a variety of platforms. Very useful.

Jose Nazario

José is a Ph.D. student in the department of biochemistry at Case Western Reserve University in Cleveland, OH. He has been using UNIX for nearly ten years, and Linux since kernels 1.2.


Copyright © 2001, Jose Nazario.
Copying license http://www.linuxgazette.net/copying.html
Published in Issue 68 of Linux Gazette, July 2001

[ Prev ][ Table of Contents ][ Front Page ][ Talkback ][ FAQ ][ Next ]