This appendix covers:
RPM package file structure
RPM header entry formats
Payload format
This appendix describes the format of RPM package files. You can combine this information with C, Perl, or Python data structures to read the contents of a package. In all cases, you should access elements in an RPM file using one of the available programming libraries. Do not attempt to access the files directly, as you may inadvertently damage the RPM file.
Cross Reference
Chapters 16, 17, and 18 cover programming with C, Python, and Perl, respectively.
The RPM package format described here has been standardized as part of the Linux Standard Base, or LSB, version 1.3.
Cross Reference
The LSB 1.3 section on package file formats is available at www.linuxbase.org/spec/refspecs/LSB_1.3.0/gLSB/gLSB.html#PACKAGEFMT.
Each RPM package is delivered as a single file. All RPM files share the same basic format of four sections:
* A lead or file identifier
* A signature
* Header information
* Archive of the payload, the files to install
All values are encoded in network byte order, for portability to multiple processor architectures.
Also called the lead or the rpmlead, the identifier marks the file as an RPM file. It contains a magic number that the file command uses to detect RPM files. It also contains version and architecture information.
The start of the identifier is the so-called magic number. The file command reads the first few bytes of a file and compares the values found with the contents of /usr/share/magic (/etc/magic on many UNIX systems), a database of magic numbers. This allows the file command to quickly identify files.
The identifier includes the RPM version number, that is, the version of the RPM file format used for the package. The identifier also has a flag that indicates whether the file contains a binary or a source package. An architecture flag allows RPM software to double-check that you are not trying to install a package for an incompatible architecture.
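As an illustration of the fields described above, the following Python sketch decodes a lead from raw bytes. The exact field widths, the magic bytes, and the names parse_lead, LEAD_FORMAT, and LEAD_MAGIC are assumptions based on the commonly documented 96-byte lead layout, not something this appendix specifies. Treat it as a reading aid; real programs should use an RPM library.

```python
import struct

# Assumed lead layout (96 bytes, network byte order):
# magic(4) major(1) minor(1) type(2) archnum(2) name(66)
# osnum(2) signature_type(2) reserved(16)
LEAD_FORMAT = ">4sBBhh66shh16s"
LEAD_MAGIC = b"\xed\xab\xee\xdb"  # assumed magic number


def parse_lead(data):
    """Decode the fixed-size lead found at the start of an RPM file."""
    size = struct.calcsize(LEAD_FORMAT)
    if len(data) < size:
        raise ValueError("need at least %d bytes for the lead" % size)
    (magic, major, minor, pkg_type, archnum,
     name, osnum, sig_type, _reserved) = struct.unpack(LEAD_FORMAT, data[:size])
    if magic != LEAD_MAGIC:
        raise ValueError("not an RPM file: bad magic number")
    return {
        "version": (major, minor),
        "is_source": pkg_type == 1,  # assumed: 0 = binary, 1 = source
        "archnum": archnum,
        "name": name.split(b"\0", 1)[0].decode("ascii", "replace"),
        "osnum": osnum,
        "signature_type": sig_type,
    }
```

Given the first 96 bytes of a package file, parse_lead returns a dictionary with the format version, the binary-or-source flag, and the architecture number.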
The signature appears after the lead or identifier section. The RPM signature helps verify the integrity of the package, and optionally the authenticity.
The signature works by performing a mathematical function on the header and archive sections of the file. The mathematical function can be an encryption process, such as PGP (Pretty Good Privacy), or a message digest in MD5 format.
The identifier section no longer contains enough information to describe modern RPMs. Furthermore, the identifier section is nowhere near as flexible as today’s packages require. To counter these deficiencies, the header section was introduced to include more information about the package.
The header structure contains three parts:
* Header record
* One or more header index record structures
* Data for the index record structures
The header record identifies this as the RPM header. It also contains a count of the number of index records and the size of the index record data.
Each index record uses a structure that contains a tag number for the data it contains. This includes tag IDs for the copyright message, name of the package, version number, and so on. A type number identifies the type of the item. An offset indicates where in the data section the data for this header item begins. A count indicates how many items of the given type are in this header entry. For fixed-size types, you can multiply the count by the size of the type to get the number of bytes used for the header entry.
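The three-part header structure can be sketched in Python as follows. The byte widths (a 16-byte header record and 16-byte index records of four 32-bit integers), the magic bytes, and the names parse_header_structure, HEADER_RECORD, and INDEX_RECORD are assumptions for illustration, taken from the commonly documented layout rather than from this appendix.

```python
import struct

HEADER_MAGIC = b"\x8e\xad\xe8"  # assumed header magic
# Assumed header record: magic(3) version(1) reserved(4) nindex(4) hsize(4)
HEADER_RECORD = struct.Struct(">3sB4sII")
# Assumed index record: tag, type, offset, count (32 bits each)
INDEX_RECORD = struct.Struct(">IIII")


def parse_header_structure(data, start=0):
    """Return (index entries, data section) for a header structure."""
    magic, _version, _reserved, nindex, hsize = HEADER_RECORD.unpack_from(data, start)
    if magic != HEADER_MAGIC:
        raise ValueError("bad header magic")
    index_start = start + HEADER_RECORD.size
    entries = []
    for i in range(nindex):
        tag, type_id, offset, count = INDEX_RECORD.unpack_from(
            data, index_start + i * INDEX_RECORD.size)
        entries.append({"tag": tag, "type": type_id,
                        "offset": offset, "count": count})
    # The data section follows the last index record and is hsize bytes long.
    store_start = index_start + nindex * INDEX_RECORD.size
    return entries, data[store_start:store_start + hsize]
```

Each entry's offset is relative to the start of the returned data section, so a string entry with offset 0 begins at the first byte of that section.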
Table D-1 lists the type identifiers.
Table D-1 Header type identifiers
Constant | Value | Size in Bytes |
RPM_NULL_TYPE | 0 | No size |
RPM_CHAR_TYPE | 1 | 1 |
RPM_INT8_TYPE | 2 | 1 |
RPM_INT16_TYPE | 3 | 2 |
RPM_INT32_TYPE | 4 | 4 |
RPM_INT64_TYPE | 5 | Not supported yet |
RPM_STRING_TYPE | 6 | Variable number of bytes, terminated by a NULL |
RPM_BIN_TYPE | 7 | 1 |
RPM_STRING_ARRAY_TYPE | 8 | Variable, vector of NULL-terminated strings |
RPM_I18NSTRING_TYPE | 9 | Variable, vector of NULL-terminated strings |
Note
Integer values are aligned on 2-byte (16-bit integers) or 4-byte (32-bit integers) boundaries.
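To make the size and alignment rules concrete, here is a small Python sketch built from the fixed sizes in Table D-1. The function names entry_byte_length and aligned_offset are hypothetical helpers, not part of any RPM library.

```python
# Bytes per item for the fixed-size types in Table D-1
# (CHAR, INT8, INT16, INT32, BIN), keyed by type value.
FIXED_SIZES = {1: 1, 2: 1, 3: 2, 4: 4, 7: 1}

# Alignment boundaries for 16-bit and 32-bit integers.
ALIGNMENT = {3: 2, 4: 4}


def entry_byte_length(type_id, count):
    """Bytes used by a fixed-size entry: count times the size of the type."""
    if type_id not in FIXED_SIZES:
        raise ValueError("type %d has no fixed size" % type_id)
    return count * FIXED_SIZES[type_id]


def aligned_offset(offset, type_id):
    """Round offset up to the boundary required by the given type."""
    alignment = ALIGNMENT.get(type_id, 1)
    return (offset + alignment - 1) // alignment * alignment
```

For example, three INT32 items occupy 12 bytes, and an INT32 entry that would otherwise start at offset 5 is moved to offset 8.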
Table D-2 lists the tag identifiers.
Table D-2 Header entry tag identifiers
Constant | Value | Type | Required? |
RPMTAG_NAME | 1000 | STRING | Yes |
RPMTAG_VERSION | 1001 | STRING | Yes |
RPMTAG_RELEASE | 1002 | STRING | Yes |
RPMTAG_SUMMARY | 1004 | I18NSTRING | Yes |
RPMTAG_DESCRIPTION | 1005 | I18NSTRING | Yes |
RPMTAG_BUILDTIME | 1006 | INT32 | Optional |
RPMTAG_BUILDHOST | 1007 | STRING | Optional |
RPMTAG_SIZE | 1009 | INT32 | Yes |
RPMTAG_LICENSE | 1014 | STRING | Yes |
RPMTAG_GROUP | 1016 | I18NSTRING | Yes |
RPMTAG_OS | 1021 | STRING | Yes |
RPMTAG_ARCH | 1022 | STRING | Yes |
RPMTAG_SOURCERPM | 1044 | STRING | Optional |
RPMTAG_FILEVERIFYFLAGS | 1045 | INT32 | Optional |
RPMTAG_ARCHIVESIZE | 1046 | INT32 | Optional |
RPMTAG_RPMVERSION | 1064 | STRING | Optional |
RPMTAG_CHANGELOGTIME | 1080 | INT32 | Optional |
RPMTAG_CHANGELOGNAME | 1081 | STRING_ARRAY | Optional |
RPMTAG_CHANGELOGTEXT | 1082 | STRING_ARRAY | Optional |
RPMTAG_COOKIE | 1094 | STRING | Optional |
RPMTAG_OPTFLAGS | 1122 | STRING | Optional |
RPMTAG_PAYLOADFORMAT | 1124 | STRING | Yes |
RPMTAG_PAYLOADCOMPRESSOR | 1125 | STRING | Yes |
RPMTAG_PAYLOADFLAGS | 1126 | STRING | Yes |
RPMTAG_RHNPLATFORM | 1131 | STRING | Deprecated |
RPMTAG_PLATFORM | 1132 | STRING | Optional |
Most of these tags are self-explanatory; however, a few tags hold special meaning. The RPMTAG_SIZE tag holds the size of all the regular files in the payload. The RPMTAG_ARCHIVESIZE tag holds the uncompressed size of the payload section, including the necessary cpio headers. The RPMTAG_COOKIE tag holds an opaque string.
According to the LSB standards, the RPMTAG_PAYLOADFORMAT must always be cpio. The RPMTAG_PAYLOADCOMPRESSOR must be gzip. The RPMTAG_PAYLOADFLAGS must always be 9.
The RPMTAG_OPTFLAGS tag holds special compiler flags used to build the package. The RPMTAG_PLATFORM and RPMTAG_RHNPLATFORM tags hold opaque strings.
Table D-3 lists header tags that are considered private.
Table D-3 Private header tags
Constant | Value | Type | Required? |
RPMTAG_HEADERSIGNATURES | 62 | BIN | Optional |
RPMTAG_HEADERIMMUTABLE | 63 | BIN | Optional |
RPMTAG_HEADERI18NTABLE | 100 | STRING_ARRAY | Yes |
The RPMTAG_HEADERSIGNATURES tag indicates that this is a signature entry. The RPMTAG_HEADERIMMUTABLE tag indicates a header item that is used in the calculation of signatures. This data should be preserved.
The RPMTAG_HEADERI18NTABLE tag holds a table of locales used for international text lookup.
The signature section is implemented as a header structure, but it is not considered part of the RPM header. Table D-4 lists special signature-related tags.
Table D-4 Signature-related tags
Constant | Value | Type | Required? |
SIGTAG_SIGSIZE | 1000 | INT32 | Yes |
SIGTAG_PGP | 1002 | BIN | Optional |
SIGTAG_MD5 | 1004 | BIN | Yes |
SIGTAG_GPG | 1005 | BIN | Optional |
SIGTAG_PAYLOADSIZE | 1007 | INT32 | Optional |
SIGTAG_SHA1HEADER | 1010 | STRING | Optional |
SIGTAG_DSAHEADER | 1011 | BIN | Optional |
SIGTAG_RSAHEADER | 1012 | BIN | Optional |
The SIGTAG_SIGSIZE tag specifies the size of the header and payload sections, while the SIGTAG_PAYLOADSIZE holds the uncompressed size of the payload.
To verify the integrity of the package, the SIGTAG_MD5 tag holds a 128-bit MD5 checksum of the header and payload sections. The SIGTAG_SHA1HEADER holds an SHA1 checksum of the entire header section.
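The integrity checks can be sketched with Python's hashlib module. This assumes you have already isolated the relevant byte ranges, that the MD5 value is stored as a raw 16-byte digest, and that the SHA1 value is stored as a hexadecimal string; the function names md5_matches and sha1_header_matches are hypothetical.

```python
import hashlib


def md5_matches(header_and_payload, expected_digest):
    """Compare the MD5 of the header and payload bytes with a 16-byte digest."""
    return hashlib.md5(header_and_payload).digest() == expected_digest


def sha1_header_matches(header_bytes, expected_hex):
    """Compare the SHA1 of the header bytes with a hexadecimal string."""
    return hashlib.sha1(header_bytes).hexdigest() == expected_hex.lower()
```

A mismatch in either check suggests the package was truncated or altered after it was built. Note that a digest alone says nothing about who produced the package.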
To verify the authenticity of the package, the SIGTAG_PGP tag holds a Version 3 OpenPGP Signature Packet RSA signature of the header and payload areas. The SIGTAG_GPG tag holds a Version 3 OpenPGP Signature Packet DSA signature of the header and payload areas. The SIGTAG_DSAHEADER holds a DSA signature of just the header section. If the SIGTAG_DSAHEADER tag is included, the SIGTAG_GPG tag must also be present. The SIGTAG_RSAHEADER holds an RSA signature of just the header section. If the SIGTAG_RSAHEADER tag is included, the SIGTAG_PGP tag must also be present.
A set of installation-specific tags tells the rpm program how to run the pre- and post-installation scripts. Table D-5 lists these tags.
Table D-5 Installation tags
Constant | Value | Type | Required? |
RPMTAG_PREINPROG | 1085 | STRING | Optional |
RPMTAG_POSTINPROG | 1086 | STRING | Optional |
RPMTAG_PREUNPROG | 1087 | STRING | Optional |
RPMTAG_POSTUNPROG | 1088 | STRING | Optional |
The RPMTAG_PREINPROG tag holds the name of the interpreter, such as sh, to run the pre-install script. Similarly, the RPMTAG_POSTINPROG tag holds the name of the interpreter to run the post-install script. RPMTAG_PREUNPROG and RPMTAG_POSTUNPROG serve the same purpose for the pre- and post-uninstall scripts.
File information tags are placed in the header for convenient access. These tags describe the files in the payload. Table D-6 lists these tags.
Table D-6 File information tags
Constant | Value | Type | Required? |
RPMTAG_OLDFILENAMES | 1027 | STRING_ARRAY | Optional |
RPMTAG_FILESIZES | 1028 | INT32 | Yes |
RPMTAG_FILEMODES | 1030 | INT16 | Yes |
RPMTAG_FILERDEVS | 1033 | INT16 | Yes |
RPMTAG_FILEMTIMES | 1034 | INT32 | Yes |
RPMTAG_FILEMD5S | 1035 | STRING_ARRAY | Yes |
RPMTAG_FILELINKTOS | 1036 | STRING_ARRAY | Yes |
RPMTAG_FILEFLAGS | 1037 | INT32 | Yes |
RPMTAG_FILEUSERNAME | 1039 | STRING_ARRAY | Yes |
RPMTAG_FILEGROUPNAME | 1040 | STRING_ARRAY | Yes |
RPMTAG_FILEDEVICES | 1095 | INT32 | Yes |
RPMTAG_FILEINODES | 1096 | INT32 | Yes |
RPMTAG_FILELANGS | 1097 | STRING_ARRAY | Yes |
RPMTAG_DIRINDEXES | 1116 | INT32 | Optional |
RPMTAG_BASENAMES | 1117 | STRING_ARRAY | Optional |
RPMTAG_DIRNAMES | 1118 | STRING_ARRAY | Optional |
The RPMTAG_OLDFILENAMES tag is used when the file names are not compressed, that is, when the RPMTAG_REQUIRENAME tag does not list rpmlib(CompressedFileNames). The RPMTAG_FILESIZES tag specifies the size of each file in the payload, while the RPMTAG_FILEMODES tag specifies the file modes (permissions) and the RPMTAG_FILEMTIMES tag holds the last modification time for each file.
The RPMTAG_BASENAMES tag holds an array of the base file names for the files in the payload. The RPMTAG_DIRNAMES tag holds an array of the directories for the files. The RPMTAG_DIRINDEXES tag holds, for each file, an index into the RPMTAG_DIRNAMES array that identifies the file's directory. Each RPM must have either RPMTAG_OLDFILENAMES or the triple of RPMTAG_BASENAMES, RPMTAG_DIRNAMES, and RPMTAG_DIRINDEXES, but not both.
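The following Python sketch shows how the three compressed file name arrays fit together. It assumes each directory name already ends with a slash, and the function name full_paths is a hypothetical helper.

```python
def full_paths(basenames, dirnames, dirindexes):
    """Rebuild full paths from the BASENAMES, DIRNAMES, and DIRINDEXES arrays."""
    if len(basenames) != len(dirindexes):
        raise ValueError("need exactly one directory index per base name")
    # Each index selects the directory that the matching base name lives in.
    return [dirnames[index] + base
            for base, index in zip(basenames, dirindexes)]
```

With base names ls, cat, and passwd, directories /bin/ and /etc/, and indexes 0, 0, and 1, the result is /bin/ls, /bin/cat, and /etc/passwd. Storing each directory once is what makes the file names "compressed."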
The dependency tags provide one of the most useful features of the RPM system by allowing for automated dependency checks between packages. Table D-7 lists these tags.
Table D-7 Dependency tags
Constant | Value | Type | Required? |
RPMTAG_PROVIDENAME | 1047 | STRING_ARRAY | Yes |
RPMTAG_REQUIREFLAGS | 1048 | INT32 | Yes |
RPMTAG_REQUIRENAME | 1049 | STRING_ARRAY | Yes |
RPMTAG_REQUIREVERSION | 1050 | STRING_ARRAY | Yes |
RPMTAG_CONFLICTFLAGS | 1053 | INT32 | Optional |
RPMTAG_CONFLICTNAME | 1054 | STRING_ARRAY | Optional |
RPMTAG_CONFLICTVERSION | 1055 | STRING_ARRAY | Optional |
RPMTAG_OBSOLETENAME | 1090 | STRING_ARRAY | Optional |
RPMTAG_PROVIDEFLAGS | 1112 | INT32 | Yes |
RPMTAG_PROVIDEVERSION | 1113 | STRING_ARRAY | Yes |
RPMTAG_OBSOLETEFLAGS | 1114 | INT32 | Optional |
RPMTAG_OBSOLETEVERSION | 1115 | STRING_ARRAY | Optional |
These tags come in triples that are formatted similarly. The RPMTAG_REQUIRENAME tag holds an array of required capabilities. The RPMTAG_REQUIREVERSION tag holds an array of the versions of the required capabilities. The RPMTAG_REQUIREFLAGS tag ties the two together with a set of bit flags that specify whether the requirement is for a version less than the given number, equal to the given number, greater than or equal to the given number, and so on. Table D-8 lists these flags.
Table D-8 Bit flags for dependencies
Flag | Value |
RPMSENSE_LESS | 0x02 |
RPMSENSE_GREATER | 0x04 |
RPMSENSE_EQUAL | 0x08 |
RPMSENSE_PREREQ | 0x40 |
RPMSENSE_INTERP | 0x100 |
RPMSENSE_SCRIPT_PRE | 0x200 |
RPMSENSE_SCRIPT_POST | 0x400 |
RPMSENSE_SCRIPT_PREUN | 0x800 |
RPMSENSE_SCRIPT_POSTUN | 0x1000 |
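As a sketch of how the name, version, and flags arrays combine, the following Python code turns the comparison bits from Table D-8 into readable requirement strings. The function names comparison_operator and format_requirements are hypothetical, and the output format is only one possible presentation.

```python
# Comparison bits from Table D-8.
RPMSENSE_LESS = 0x02
RPMSENSE_GREATER = 0x04
RPMSENSE_EQUAL = 0x08


def comparison_operator(flags):
    """Translate the comparison bits into an operator such as '>='."""
    operator = ""
    if flags & RPMSENSE_LESS:
        operator += "<"
    if flags & RPMSENSE_GREATER:
        operator += ">"
    if flags & RPMSENSE_EQUAL:
        operator += "="
    return operator


def format_requirements(names, versions, flags):
    """Combine the three parallel arrays into one string per dependency."""
    result = []
    for name, version, flag in zip(names, versions, flags):
        operator = comparison_operator(flag)
        if operator and version:
            result.append("%s %s %s" % (name, operator, version))
        else:
            result.append(name)  # unversioned dependency
    return result
```

A flags value of 0x0C (RPMSENSE_GREATER combined with RPMSENSE_EQUAL) yields the operator >=, so a required capability glibc with version 2.3 is rendered as glibc >= 2.3.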
The RPMTAG_PROVIDENAME, RPMTAG_PROVIDEVERSION, and RPMTAG_PROVIDEFLAGS tags work similarly for the capabilities this package provides. The RPMTAG_CONFLICTNAME, RPMTAG_CONFLICTVERSION, and RPMTAG_CONFLICTFLAGS tags specify the conflicts. The RPMTAG_OBSOLETENAME, RPMTAG_OBSOLETEVERSION, and RPMTAG_OBSOLETEFLAGS tags specify the obsoleted dependencies.
In addition, an RPM package can define some special requirements in the RPMTAG_REQUIRENAME and RPMTAG_REQUIREVERSION tags. Table D-9 lists these requirements.
Table D-9 Special package requirement names and versions
Name | Version | Specifies |
lsb | 1.3 | The package conforms to the Linux Standard Base RPM format. |
rpmlib(VersionedDependencies) | 3.0.3-1 | The package holds dependencies or prerequisites that have versions associated with them. |
rpmlib(PayloadFilesHavePrefix) | 4.0-1 | File names in the archive have a “.” prepended to them. |
rpmlib(CompressedFileNames) | 3.0.4-1 | The package uses the RPMTAG_DIRINDEXES, RPMTAG_DIRNAMES, and RPMTAG_BASENAMES tags for specifying file names. |
/bin/sh | NA | Indicates a requirement for the Bourne shell to run the installation scripts. |
The payload, or archive, section contains the actual files used in the package. These are the files that the rpm command installs when you install the package. To save space, data in the archive section is compressed in GNU gzip format.
Once uncompressed, the data is in cpio format, which is how the rpm2cpio command can do its work. In cpio format, the payload is made up of records, one per file. Table D-10 lists the record structure.
Table D-10 cpio file record structure
Element | Holds |
cpio header | Information on the file, such as the file mode (permissions) |
File name | NULL-terminated string |
Padding | 0 to 3 bytes, as needed, to align the next element on a 4-byte boundary |
File data | The contents of the file |
Padding | 0 to 3 bytes, as needed, to align the next file record on a 4-byte boundary |
The information in the cpio header duplicates that of the RPM file-information header elements.
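The record structure in Table D-10 can be walked with a short Python sketch. It assumes the payload has already been uncompressed, that it uses the ASCII "newc" cpio variant (a 6-character magic followed by thirteen 8-digit hexadecimal fields), and that offsets are measured from the start of the archive. The names read_cpio_record and CPIO_FIELDS are hypothetical.

```python
CPIO_MAGIC = b"070701"  # assumed "newc" ASCII cpio magic
CPIO_FIELDS = ("ino", "mode", "uid", "gid", "nlink", "mtime", "filesize",
               "devmajor", "devminor", "rdevmajor", "rdevminor",
               "namesize", "check")
CPIO_HEADER_SIZE = len(CPIO_MAGIC) + 8 * len(CPIO_FIELDS)


def pad4(position):
    """Padding bytes (0 to 3) needed to reach the next 4-byte boundary."""
    return (4 - position % 4) % 4


def read_cpio_record(buf, pos=0):
    """Return (file name, mode, file data, offset of the next record)."""
    if buf[pos:pos + 6] != CPIO_MAGIC:
        raise ValueError("bad cpio magic at offset %d" % pos)
    fields = {}
    for i, field in enumerate(CPIO_FIELDS):
        start = pos + 6 + 8 * i
        fields[field] = int(buf[start:start + 8].decode("ascii"), 16)
    name_start = pos + CPIO_HEADER_SIZE
    name_end = name_start + fields["namesize"]  # size includes the NUL
    filename = buf[name_start:name_end - 1].decode("utf-8", "replace")
    data_start = name_end + pad4(name_end)
    data_end = data_start + fields["filesize"]
    return (filename, fields["mode"], buf[data_start:data_end],
            data_end + pad4(data_end))
```

Calling read_cpio_record repeatedly, passing the returned offset back in as pos, steps through the archive one file at a time.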