Chapter 24. RPM Package File Structure

This chapter describes the format of RPM package files. You can combine this information with C, Perl, or Python data structures to access the contents of a package. In all cases, you should access elements in an RPM file through one of the available programming libraries. Do not attempt to manipulate the files directly, as you may inadvertently damage the RPM file.

Cross Reference

Chapters 16, 17, and 18 cover programming with C, Python, and Perl, respectively.

The RPM package format described here has been standardized as part of the Linux Standard Base, or LSB, version 1.3.

Cross Reference

The LSB 1.3 section on package file formats is available at www.linuxbase.org/spec/refspecs/LSB_1.3.0/gLSB/gLSB.html#PACKAGEFMT.

24.1. The Package File

RPM packages are delivered with one file per package. All RPM files have the following basic format of four sections:

* A lead or file identifier

* A signature

* Header information

* Archive of the payload, the files to install

All values are encoded in network byte order, for portability to multiple processor architectures.

24.1.1. The file identifier

Also called the lead or the rpmlead, the identifier marks the file as an RPM file. It contains a magic number that the file command uses to detect RPM files. It also contains version and architecture information.

The start of the identifier is the so-called magic number. The file command reads the first few bytes of a file and compares the values found with the contents of /usr/share/magic (/etc/magic on many UNIX systems), a database of magic numbers. This allows the file command to quickly identify files.

The identifier includes the RPM version number, that is, the version of the RPM file format used for the package. The identifier also has a flag that gives the type of the RPM file: a binary package or a source package. An architecture flag allows RPM software to double-check that you are not trying to install a package built for an incompatible architecture.
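As an illustration of the fields just described, the following Python sketch decodes a lead from raw bytes. The 96-byte layout, the magic value, and the names LEAD_FORMAT and parse_lead are assumptions taken from the LSB package format description rather than from the text above. The sketch only reads data; for real work, prefer the RPM libraries.

```python
import struct

# Assumed lead layout (per the LSB package format): magic, major, minor,
# type, archnum, name, osnum, signature type, reserved. Big-endian.
LEAD_FORMAT = ">4sBBhh66shh16s"
LEAD_SIZE = struct.calcsize(LEAD_FORMAT)  # 96 bytes
LEAD_MAGIC = b"\xed\xab\xee\xdb"
PACKAGE_TYPES = {0: "binary", 1: "source"}


def parse_lead(data):
    """Decode the lead from the first 96 bytes of an RPM file (read-only)."""
    if len(data) < LEAD_SIZE:
        raise ValueError("need at least %d bytes" % LEAD_SIZE)
    (magic, major, minor, pkg_type, archnum,
     name, osnum, sigtype, _reserved) = struct.unpack(
        LEAD_FORMAT, data[:LEAD_SIZE])
    if magic != LEAD_MAGIC:
        raise ValueError("bad magic number: not an RPM file")
    return {
        "version": (major, minor),
        "type": PACKAGE_TYPES.get(pkg_type, "unknown"),
        "archnum": archnum,
        # The name field is padded with NUL bytes up to 66 bytes.
        "name": name.split(b"\0", 1)[0].decode("ascii", "replace"),
        "osnum": osnum,
        "signature_type": sigtype,
    }
```

You could call parse_lead() on the first 96 bytes read from a package file opened in binary mode.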

24.1.2. The signature

The signature appears after the lead or identifier section. The RPM signature helps verify the integrity of the package, and optionally the authenticity.

The signature works by performing a mathematical function on the header and archive section of the file. The mathematical function can be an encryption process, such as PGP (Pretty Good Privacy), or a message digest in MD5 format.

24.1.3. The header

The identifier section no longer contains enough information to describe modern RPMs. Furthermore, the identifier section is nowhere near as flexible as today’s packages require. To counter these deficiencies, the header section was introduced to include more information about the package.

The header structure contains three parts:

* Header record

* One or more header index record structures

* Data for the index record structures

The header record identifies this as the RPM header. It also contains a count of the number of index records and the size of the index record data.

Each index record contains a tag number that identifies the data it describes, such as the copyright message, the name of the package, the version number, and so on. A type number identifies the type of the item. An offset indicates where in the data section the data for this header item begins. A count indicates how many items of the given type are in this header entry. For the fixed-size types, you can multiply the count by the size of the type to get the number of bytes used for the header entry.
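The three-part header structure can be sketched in Python as follows. The header magic bytes, the 16-byte record sizes, and the names HEADER_RECORD, INDEX_RECORD, and parse_header are assumptions based on the LSB package format description; the text above specifies only which fields exist.

```python
import struct

HEADER_MAGIC = b"\x8e\xad\xe8"  # assumed magic for a header record
HEADER_RECORD = ">3sB4sII"      # magic, version, reserved, index count, data size
INDEX_RECORD = ">IIII"          # tag, type, offset, count


def parse_header(data, start=0):
    """Return (entries, data_store, end_offset) for the header at start."""
    record_size = struct.calcsize(HEADER_RECORD)  # 16 bytes
    index_size = struct.calcsize(INDEX_RECORD)    # 16 bytes
    magic, _version, _reserved, nindex, hsize = struct.unpack_from(
        HEADER_RECORD, data, start)
    if magic != HEADER_MAGIC:
        raise ValueError("bad header magic")
    index_start = start + record_size
    store_start = index_start + nindex * index_size
    entries = []
    for i in range(nindex):
        tag, type_id, offset, count = struct.unpack_from(
            INDEX_RECORD, data, index_start + i * index_size)
        entries.append(
            {"tag": tag, "type": type_id, "offset": offset, "count": count})
    # Offsets in the index records are relative to the start of this store.
    data_store = data[store_start:store_start + hsize]
    return entries, data_store, store_start + hsize
```

Each entry's offset points into data_store, so a string entry at offset 0 begins with the first byte of the store.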

Table D-1 lists the type identifiers.

Table D-1 Header type identifiers

| Constant | Value | Size in Bytes |
|---|---|---|
| RPM_NULL_TYPE | 0 | No size |
| RPM_CHAR_TYPE | 1 | 1 |
| RPM_INT8_TYPE | 2 | 1 |
| RPM_INT16_TYPE | 3 | 2 |
| RPM_INT32_TYPE | 4 | 4 |
| RPM_INT64_TYPE | 5 | Not supported yet |
| RPM_STRING_TYPE | 6 | Variable number of bytes, terminated by a NULL |
| RPM_BIN_TYPE | 7 | 1 |
| RPM_STRING_ARRAY_TYPE | 8 | Variable, vector of NULL-terminated strings |
| RPM_I18NSTRING_TYPE | 9 | Variable, vector of NULL-terminated strings |

Note

Integer values are aligned on 2-byte (16-bit integers) or 4-byte (32-bit integers) boundaries.
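The size and alignment rules can be expressed as a short Python sketch. The sizes come from Table D-1; the helper names entry_size and aligned_offset are hypothetical, and the variable-length string types are deliberately left out because their size cannot be computed from the count alone.

```python
# Fixed sizes in bytes for the types in Table D-1 that have one:
# CHAR (1), INT8 (2), INT16 (3), INT32 (4), and BIN (7).
TYPE_SIZES = {1: 1, 2: 1, 3: 2, 4: 4, 7: 1}

# Alignment boundaries from the note: 2 bytes for INT16, 4 bytes for INT32.
TYPE_ALIGNMENT = {3: 2, 4: 4}


def entry_size(type_id, count):
    """Bytes used by a header entry of a fixed-size type."""
    if type_id not in TYPE_SIZES:
        raise ValueError("type %d has no fixed size" % type_id)
    return count * TYPE_SIZES[type_id]


def aligned_offset(offset, type_id):
    """Round offset up to the alignment boundary required by the type."""
    alignment = TYPE_ALIGNMENT.get(type_id, 1)
    return (offset + alignment - 1) // alignment * alignment
```

For example, three INT32 values occupy 12 bytes, and an INT32 entry that would otherwise start at offset 5 starts at offset 8 instead.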

24.1.3.1. Header Tags

Table D-2 lists the tag identifiers.

Table D-2 Header entry tag identifiers

| Constant | Value | Type | Required? |
|---|---|---|---|
| RPMTAG_NAME | 1000 | STRING | Yes |
| RPMTAG_VERSION | 1001 | STRING | Yes |
| RPMTAG_RELEASE | 1002 | STRING | Yes |
| RPMTAG_SUMMARY | 1004 | I18NSTRING | Yes |
| RPMTAG_DESCRIPTION | 1005 | I18NSTRING | Yes |
| RPMTAG_BUILDTIME | 1006 | INT32 | Optional |
| RPMTAG_BUILDHOST | 1007 | STRING | Optional |
| RPMTAG_SIZE | 1009 | INT32 | Yes |
| RPMTAG_LICENSE | 1014 | STRING | Yes |
| RPMTAG_GROUP | 1016 | I18NSTRING | Yes |
| RPMTAG_OS | 1021 | STRING | Yes |
| RPMTAG_ARCH | 1022 | STRING | Yes |
| RPMTAG_SOURCERPM | 1044 | STRING | Optional |
| RPMTAG_FILEVERIFYFLAGS | 1045 | INT32 | Optional |
| RPMTAG_ARCHIVESIZE | 1046 | INT32 | Optional |
| RPMTAG_RPMVERSION | 1064 | STRING | Optional |
| RPMTAG_CHANGELOGTIME | 1080 | INT32 | Optional |
| RPMTAG_CHANGELOGNAME | 1081 | STRING_ARRAY | Optional |
| RPMTAG_CHANGELOGTEXT | 1082 | STRING_ARRAY | Optional |
| RPMTAG_COOKIE | 1094 | STRING | Optional |
| RPMTAG_OPTFLAGS | 1122 | STRING | Optional |
| RPMTAG_PAYLOADFORMAT | 1124 | STRING | Yes |
| RPMTAG_PAYLOADCOMPRESSOR | 1125 | STRING | Yes |
| RPMTAG_PAYLOADFLAGS | 1126 | STRING | Yes |
| RPMTAG_RHNPLATFORM | 1131 | STRING | Deprecated |
| RPMTAG_PLATFORM | 1132 | STRING | Optional |

Most of these tags are self-explanatory; however, a few tags hold special meaning. The RPMTAG_SIZE tag holds the size of all the regular files in the payload. The RPMTAG_ARCHIVESIZE tag holds the uncompressed size of the payload section, including the necessary cpio headers. The RPMTAG_COOKIE tag holds an opaque string.

According to the LSB standards, the RPMTAG_PAYLOADFORMAT must always be cpio. The RPMTAG_PAYLOADCOMPRESSOR must be gzip. The RPMTAG_PAYLOADFLAGS must always be 9.

The RPMTAG_OPTFLAGS tag holds special compiler flags used to build the package. The RPMTAG_PLATFORM and RPMTAG_RHNPLATFORM tags hold opaque strings.

24.1.3.2. Private Header Tags

Table D-3 lists header tags that are considered private.

Table D-3 Private header tags

| Constant | Value | Type | Required? |
|---|---|---|---|
| RPMTAG_HEADERSIGNATURES | 62 | BIN | Optional |
| RPMTAG_HEADERIMMUTABLE | 63 | BIN | Optional |
| RPMTAG_HEADERI18NTABLE | 100 | STRING_ARRAY | Yes |

The RPMTAG_HEADERSIGNATURES tag indicates that this is a signature entry. The RPMTAG_HEADERIMMUTABLE tag indicates a header item that is used in the calculation of signatures. This data should be preserved.

The RPMTAG_HEADERI18NTABLE tag holds a table of locales used for international text lookup.

24.1.3.3. Signature Tags

The signature section is implemented as a header structure, but it is not considered part of the RPM header. Table D-4 lists special signature-related tags.

Table D-4 Signature-related tags

| Constant | Value | Type | Required? |
|---|---|---|---|
| SIGTAG_SIGSIZE | 1000 | INT32 | Yes |
| SIGTAG_PGP | 1002 | BIN | Optional |
| SIGTAG_MD5 | 1004 | BIN | Yes |
| SIGTAG_GPG | 1005 | BIN | Optional |
| SIGTAG_PAYLOADSIZE | 1007 | INT32 | Optional |
| SIGTAG_SHA1HEADER | 1010 | STRING | Optional |
| SIGTAG_DSAHEADER | 1011 | BIN | Optional |
| SIGTAG_RSAHEADER | 1012 | BIN | Optional |

The SIGTAG_SIGSIZE tag specifies the size of the header and payload sections, while the SIGTAG_PAYLOADSIZE holds the uncompressed size of the payload.

To verify the integrity of the package, the SIGTAG_MD5 tag holds a 128-bit MD5 checksum of the header and payload sections. The SIGTAG_SHA1HEADER holds an SHA1 checksum of the entire header section.
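The integrity checks amount to recomputing a digest and comparing it with the stored value. The Python sketch below assumes you have already extracted the relevant byte ranges and tag values by some other means. The function names are hypothetical, and the assumption that SIGTAG_SHA1HEADER stores the digest as a hexadecimal string follows from its STRING type in Table D-4 but is not stated in the text.

```python
import hashlib


def md5_matches(header_and_payload, sigtag_md5):
    """Compare the 16-byte (128-bit) MD5 digest with the SIGTAG_MD5 value."""
    return hashlib.md5(header_and_payload).digest() == sigtag_md5


def sha1_header_matches(header_bytes, sigtag_sha1header):
    """Compare the SHA1 digest of the header with SIGTAG_SHA1HEADER.

    Assumes the tag stores the digest as a lowercase hexadecimal string.
    """
    return hashlib.sha1(header_bytes).hexdigest() == sigtag_sha1header.lower()
```

A mismatch in either check indicates that the package contents changed after the digest was recorded. It says nothing about who created the package; that is the job of the PGP and GPG signature tags.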

To verify the authenticity of the package, the SIGTAG_PGP tag holds a Version 3 OpenPGP Signature Packet RSA signature of the header and payload areas. The SIGTAG_GPG tag holds a Version 3 OpenPGP Signature Packet DSA signature of the header and payload areas. The SIGTAG_DSAHEADER holds a DSA signature of just the header section. If the SIGTAG_DSAHEADER tag is included, the SIGTAG_GPG tag must also be present. The SIGTAG_RSAHEADER holds an RSA signature of just the header section. If the SIGTAG_RSAHEADER tag is included, the SIGTAG_PGP tag must also be present.

24.1.3.4. Installation Tags

A set of installation-specific tags tells the rpm program how to run the pre- and post-installation scripts. Table D-5 lists these tags.

Table D-5 Installation tags

| Constant | Value | Type | Required? |
|---|---|---|---|
| RPMTAG_PREINPROG | 1085 | STRING | Optional |
| RPMTAG_POSTINPROG | 1086 | STRING | Optional |
| RPMTAG_PREUNPROG | 1087 | STRING | Optional |
| RPMTAG_POSTUNPROG | 1088 | STRING | Optional |

The RPMTAG_PREINPROG tag holds the name of the interpreter, such as sh, to run the pre-install script. Similarly, the RPMTAG_POSTINPROG tag holds the name of the interpreter to run the post-install script. RPMTAG_PREUNPROG and RPMTAG_POSTUNPROG are the same for the uninstall scripts.

24.1.3.5. File Information Tags

File information tags are placed in the header for convenient access. These tags describe the files in the payload. Table D-6 lists these tags.

Table D-6 File information tags

| Constant | Value | Type | Required? |
|---|---|---|---|
| RPMTAG_OLDFILENAMES | 1027 | STRING_ARRAY | Optional |
| RPMTAG_FILESIZES | 1028 | INT32 | Yes |
| RPMTAG_FILEMODES | 1030 | INT16 | Yes |
| RPMTAG_FILERDEVS | 1033 | INT16 | Yes |
| RPMTAG_FILEMTIMES | 1034 | INT32 | Yes |
| RPMTAG_FILEMD5S | 1035 | STRING_ARRAY | Yes |
| RPMTAG_FILELINKTOS | 1036 | STRING_ARRAY | Yes |
| RPMTAG_FILEFLAGS | 1037 | INT32 | Yes |
| RPMTAG_FILEUSERNAME | 1039 | STRING_ARRAY | Yes |
| RPMTAG_FILEGROUPNAME | 1040 | STRING_ARRAY | Yes |
| RPMTAG_FILEDEVICES | 1095 | INT32 | Yes |
| RPMTAG_FILEINODES | 1096 | INT32 | Yes |
| RPMTAG_FILELANGS | 1097 | STRING_ARRAY | Yes |
| RPMTAG_DIRINDEXES | 1116 | INT32 | Optional |
| RPMTAG_BASENAMES | 1117 | STRING_ARRAY | Optional |
| RPMTAG_DIRNAMES | 1118 | STRING_ARRAY | Optional |

The RPMTAG_OLDFILENAMES tag is used when the file names are not compressed, that is, when the RPMTAG_REQUIRENAME tag does not list rpmlib(CompressedFileNames). The RPMTAG_FILESIZES tag specifies the size of each file in the payload, the RPMTAG_FILEMODES tag specifies the file modes (permissions), and the RPMTAG_FILEMTIMES tag holds the last modification time for each file.

The RPMTAG_BASENAMES tag holds an array of the base file names for the files in the payload. The RPMTAG_DIRNAMES tag holds an array of the directories for the files. The RPMTAG_DIRINDEXES tag holds, for each file, an index into the RPMTAG_DIRNAMES array that selects the directory for that file. Each RPM must have either RPMTAG_OLDFILENAMES or the triple of RPMTAG_BASENAMES, RPMTAG_DIRNAMES, and RPMTAG_DIRINDEXES, but not both.
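To show how the compressed file name triple fits together, here is a minimal Python sketch that rebuilds full paths. The function name full_paths is hypothetical, and the sketch assumes that each directory name already ends with a slash, which the text above does not state.

```python
def full_paths(basenames, dirnames, dirindexes):
    """Rebuild full file paths from the three file name arrays.

    basenames and dirindexes have one element per file; each index
    selects that file's directory from dirnames.
    """
    if len(basenames) != len(dirindexes):
        raise ValueError("basenames and dirindexes must be the same length")
    # Assumes every entry in dirnames ends with "/".
    return [dirnames[index] + base
            for base, index in zip(basenames, dirindexes)]
```

With basenames of ls, cp, and ls.1.gz, dirnames of /bin/ and /usr/share/man/man1/, and dirindexes of 0, 0, and 1, the result is /bin/ls, /bin/cp, and /usr/share/man/man1/ls.1.gz. Storing each directory only once is what makes this form smaller than a flat list of full paths.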

24.1.3.6. Dependency Tags

The dependency tags provide one of the most useful features of the RPM system by allowing for automated dependency checks between packages. Table D-7 lists these tags.

Table D-7 Dependency tags

| Constant | Value | Type | Required? |
|---|---|---|---|
| RPMTAG_PROVIDENAME | 1047 | STRING_ARRAY | Yes |
| RPMTAG_REQUIREFLAGS | 1048 | INT32 | Yes |
| RPMTAG_REQUIRENAME | 1049 | STRING_ARRAY | Yes |
| RPMTAG_REQUIREVERSION | 1050 | STRING_ARRAY | Yes |
| RPMTAG_CONFLICTFLAGS | 1053 | INT32 | Optional |
| RPMTAG_CONFLICTNAME | 1054 | STRING_ARRAY | Optional |
| RPMTAG_CONFLICTVERSION | 1055 | STRING_ARRAY | Optional |
| RPMTAG_OBSOLETENAME | 1090 | STRING_ARRAY | Optional |
| RPMTAG_PROVIDEFLAGS | 1112 | INT32 | Yes |
| RPMTAG_PROVIDEVERSION | 1113 | STRING_ARRAY | Yes |
| RPMTAG_OBSOLETEFLAGS | 1114 | INT32 | Optional |
| RPMTAG_OBSOLETEVERSION | 1115 | STRING_ARRAY | Optional |

These tags come in triples, which are formatted similarly. The RPMTAG_REQUIRENAME tag holds an array of required capabilities. The RPMTAG_REQUIREVERSION tag holds an array of the versions of the required capabilities. The RPMTAG_REQUIREFLAGS tag ties the two together with a set of bit flags that specify whether the requirement is for a version less than the given number, equal to the given number, greater than or equal to the given number, and so on. Table D-8 lists these flags.

Table D-8 Bit flags for dependencies

| Flag | Value |
|---|---|
| RPMSENSE_LESS | 0x02 |
| RPMSENSE_GREATER | 0x04 |
| RPMSENSE_EQUAL | 0x08 |
| RPMSENSE_PREREQ | 0x40 |
| RPMSENSE_INTERP | 0x100 |
| RPMSENSE_SCRIPT_PRE | 0x200 |
| RPMSENSE_SCRIPT_POST | 0x400 |
| RPMSENSE_SCRIPT_PREUN | 0x800 |
| RPMSENSE_SCRIPT_POSTUN | 0x1000 |

The RPMTAG_PROVIDENAME, RPMTAG_PROVIDEVERSION, and RPMTAG_PROVIDEFLAGS tags work similarly for the capabilities this package provides. The RPMTAG_CONFLICTNAME, RPMTAG_CONFLICTVERSION, and RPMTAG_CONFLICTFLAGS tags specify the conflicts. The RPMTAG_OBSOLETENAME, RPMTAG_OBSOLETEVERSION, and RPMTAG_OBSOLETEFLAGS tags specify the obsoleted dependencies.
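The way a name, version, and flags triple combines can be sketched in Python. The three comparison flag values come from Table D-8. The function names and the output format (for example, "glibc >= 2.3") are illustrative choices, not something the format itself defines.

```python
RPMSENSE_LESS = 0x02
RPMSENSE_GREATER = 0x04
RPMSENSE_EQUAL = 0x08


def comparison_operator(flags):
    """Turn the comparison bits of a flags value into an operator string."""
    operator = ""
    if flags & RPMSENSE_LESS:
        operator += "<"
    if flags & RPMSENSE_GREATER:
        operator += ">"
    if flags & RPMSENSE_EQUAL:
        operator += "="
    return operator


def format_dependencies(names, versions, flags):
    """Combine the parallel name, version, and flags arrays into strings."""
    result = []
    for name, version, flag in zip(names, versions, flags):
        operator = comparison_operator(flag)
        if operator and version:
            result.append("%s %s %s" % (name, operator, version))
        else:
            # No comparison bits set: an unversioned dependency.
            result.append(name)
    return result
```

Because the three arrays are parallel, element i of each array describes the same dependency. The same helper would apply to the provide, conflict, and obsolete triples.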

In addition, an RPM package can define some special requirements in the RPMTAG_REQUIRENAME and RPMTAG_REQUIREVERSION tags. Table D-9 lists these requirements.

Table D-9 Special package requirement names and versions

| Name | Version | Specifies |
|---|---|---|
| lsb | 1.3 | The package conforms to the Linux Standard Base RPM format. |
| rpmlib(VersionedDependencies) | 3.0.3-1 | The package holds dependencies or prerequisites that have versions associated with them. |
| rpmlib(PayloadFilesHavePrefix) | 4.0-1 | File names in the archive have a “.” prepended on the names. |
| rpmlib(CompressedFileNames) | 3.0.4-1 | The package uses the RPMTAG_DIRINDEXES, RPMTAG_DIRNAMES and RPMTAG_BASENAMES tags for specifying file names. |
| /bin/sh | NA | Indicates a requirement for the Bourne shell to run the installation scripts. |

24.1.4. The payload

The payload, or archive, section contains the actual files used in the package. These are the files that the rpm command installs when you install the package. To save space, data in the archive section is compressed in GNU gzip format.

Once uncompressed, the data is in cpio format, which is how the rpm2cpio command can do its work. In cpio format, the payload is made up of records, one per file. Table D-10 lists the record structure.

Table D-10 cpio file record structure

| Element | Holds |
|---|---|
| cpio header | Information on the file, such as the file mode (permissions) |
| File name | NULL-terminated string |
| Padding | 0 to 3 bytes, as needed, to align the next element on a 4-byte boundary |
| File data | The contents of the file |
| Padding | 0 to 3 bytes, as needed, to align the next file record on a 4-byte boundary |

The information in the cpio header duplicates that of the RPM file-information header elements.
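The record layout in Table D-10 can be walked with a short Python sketch. It assumes the cpio header is the 110-byte ASCII "newc" form (the magic string 070701 followed by thirteen 8-digit hexadecimal fields), which the text above does not spell out. The field names and the functions pad4 and parse_record are hypothetical, and the input is assumed to be the whole uncompressed archive so that offsets are measured from its start.

```python
CPIO_MAGIC = b"070701"   # assumed "newc" ASCII header magic
CPIO_HEADER_SIZE = 110   # 6 magic bytes + 13 fields of 8 hex digits
CPIO_FIELDS = ("ino", "mode", "uid", "gid", "nlink", "mtime", "filesize",
               "devmajor", "devminor", "rdevmajor", "rdevminor",
               "namesize", "check")


def pad4(offset):
    """Padding bytes (0 to 3) needed to reach the next 4-byte boundary."""
    return -offset % 4


def parse_record(archive, start=0):
    """Return (name, fields, file_data, next_start) for one cpio record."""
    header = archive[start:start + CPIO_HEADER_SIZE]
    if len(header) < CPIO_HEADER_SIZE or header[:6] != CPIO_MAGIC:
        raise ValueError("not a newc cpio header")
    values = [int(header[6 + 8 * i:14 + 8 * i].decode("ascii"), 16)
              for i in range(len(CPIO_FIELDS))]
    fields = dict(zip(CPIO_FIELDS, values))
    name_start = start + CPIO_HEADER_SIZE
    name_end = name_start + fields["namesize"]  # namesize counts the NUL
    name = archive[name_start:name_end - 1].decode("utf-8", "replace")
    data_start = name_end + pad4(name_end)
    data_end = data_start + fields["filesize"]
    file_data = archive[data_start:data_end]
    return name, fields, file_data, data_end + pad4(data_end)
```

Calling parse_record() repeatedly, each time passing the returned next_start, steps through the archive one file record at a time.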
