The Fedora distribution, which is the collection of all Fedora-related files, uses the directory tree in Example 1, “Fedora directory tree”. It may include multiple versions of Fedora Core. The tree design makes it easier to "trim" unnecessary or undesired files. When you set up a mirror, duplicate this tree exactly, or as closely as possible. If you duplicate the tree, it will be easier to automate nightly updates.
fedora
 +-- linux
     +-- core
         |-- 1
         |   ...
         +-- 5
         |   +-- SRPMS
         |   +-- i386
         |   |   +-- debug
         |   |   +-- iso
         |   |   +-- os
         |   |       +-- Fedora
         |   |       +-- SRPMS
         |   |       +-- images
         |   |       +-- isolinux
         |   +-- x86_64
         +-- development
         |   ...
         +-- test
         |   ...
         +-- updates
             +-- 1
             |   ...
             +-- 5
             |   +-- SRPMS
             |   +-- i386
             |   +-- x86_64
             +-- testing
                 +-- 1
                 |   ...
                 +-- 5
                     +-- SRPMS
                     +-- i386
                     +-- x86_64
Example 1. Fedora directory tree
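As a rough sketch, the skeleton of this tree can be created ahead of time so that later download commands have somewhere to write. The MIRROR_ROOT variable and its scratch-directory default are assumptions for illustration (a real mirror would more likely live under a path such as /var/www/mirror), and the x86_64 branch is assumed to have the same layout as i386.

```shell
#!/bin/sh
# Sketch: build an empty skeleton of the Fedora Core 5 mirror tree.
# MIRROR_ROOT is a hypothetical variable; it defaults to a scratch directory
# so that nothing is written outside a temporary location.
MIRROR_ROOT="${MIRROR_ROOT:-$(mktemp -d)}"
core="$MIRROR_ROOT/fedora/linux/core"

for arch in i386 x86_64; do
    # Release tree: debug packages, ISO images, and the installable "os" tree
    mkdir -p "$core/5/$arch/debug" "$core/5/$arch/iso" "$core/5/$arch/os"
    # Update trees: released updates and updates still in testing
    mkdir -p "$core/updates/5/$arch" "$core/updates/testing/5/$arch"
done

# Source package folders, plus the development and test branches
mkdir -p "$core/5/SRPMS" "$core/updates/5/SRPMS" \
         "$core/updates/testing/5/SRPMS" "$core/development" "$core/test"

# Show what was created
find "$MIRROR_ROOT/fedora" -type d | sort
```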
Naming conventions | |
---|---|
Throughout the rest of the document,
|
The fedora/linux/core/5/arch/os directory contains a copy of all the original distribution files for Fedora Core 5. They are the same files found on the DVD and CD-ROM versions of the distribution. The Fedora subfolder contains all the files that are necessary for installation, including the entire collection of Fedora Core RPM packages. The images folder contains copies of any floppy diskette or CD-ROM images that boot a system into installation or rescue modes. The fedora/linux/core/5/arch/iso folder contains images of the CD-ROM version of the distribution.
RPM packages | |
---|---|
RPM, originally the Red Hat Package Manager and now the RPM Package Manager, is not just a file format. RPM is also a system that tracks and interconnects software and version information. The RPM system is quite popular, and many other Linux distributions use RPM as well. Read more information on RPM at http://www.rpm.org/. |
The SRPMS folders under architecture-specific branches are links that point to the main SRPMS folder for that distribution. For example, fedora/linux/core/2/i386/os/SRPMS is a link that points to fedora/linux/core/2/SRPMS.
A Fedora mirror consists of at least the original ISO images or the distribution files. If possible, include both, provided you have sufficient disk space and/or bandwidth.
If you already have reliable CD-ROM installation discs of a distribution, reduce your initial bandwidth and time spent mirroring by copying the files from the discs to your server. Copy all files from Installation Disc 1 into the fedora/linux/core/5/arch/os folder. Then copy all files from the Fedora folder of each of the remaining Installation discs into the fedora/linux/core/5/arch/os/Fedora folder on the server.
Copy all the files from the SRPMS folder on each of the "Sources" discs to the fedora/linux/core/5/SRPMS folder on the server. Make a link in the os folder that occurs under each architecture. Follow this example:
cd /var/www/mirror/fedora/linux/core/5/i386/os
ln -s ../../SRPMS SRPMS
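The same link is needed under every architecture you carry. The following is a minimal sketch that repeats the step in a loop, assuming symbolic links are what is wanted and that i386 and x86_64 are the architectures mirrored. MIRROR_ROOT is a hypothetical variable that defaults to a scratch directory.

```shell
#!/bin/sh
# Sketch: create the os/SRPMS link for each architecture of one release.
# MIRROR_ROOT is hypothetical; point it at the real mirror root when ready.
root="${MIRROR_ROOT:-$(mktemp -d)}"
rel="$root/fedora/linux/core/5"

mkdir -p "$rel/SRPMS"
for arch in i386 x86_64; do
    mkdir -p "$rel/$arch/os"
    # A relative target keeps the link valid if the whole tree is moved.
    # -f replaces an existing link; -n avoids descending into it.
    ln -sfn ../../SRPMS "$rel/$arch/os/SRPMS"
done

# Print each link and the target it points to
for arch in i386 x86_64; do
    printf '%s -> %s\n' "$rel/$arch/os/SRPMS" "$(readlink "$rel/$arch/os/SRPMS")"
done
```

A relative target is used so that the link still resolves when the tree is served from a different mount point.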
The documentation for anaconda, the Fedora Core installation program, calls this directory structure an exploded tree. This is because the package data on each CD is extracted, or exploded, to a large directory tree with a predetermined structure. The anaconda installer expects this structure to some extent.
If you only include CD images, create a mirror suitable for installation services by mounting each CD image under the arch/os/ directory. Make a directory for each disc, naming them disc1, disc2, and so on. Mount each disc on the appropriate folder, and add entries to /etc/fstab to perform this mount automatically in case of a reboot. The loop option tells mount to attach the image file through a loop device. Each entry looks like this:
/path/i386/iso/FC5-i386-disc1.iso /path/i386/os/disc1 iso9660 ro,loop 0 0
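Writing one entry per disc by hand is error-prone, so the lines can be generated instead. This sketch only prints candidate entries; it does not touch /etc/fstab. The disc count of five and the ro,loop mount options are assumptions (loop attaches an image file through a loop device, ro mounts it read-only), and /path is the same placeholder used in the text.

```shell
#!/bin/sh
# Sketch: print one fstab entry per CD image for a single architecture.
base="/path"    # placeholder for the mirror's release directory
arch="i386"
discs=5         # assumption: number of installation CDs in the set

fstab_lines=""
n=1
while [ "$n" -le "$discs" ]; do
    line="$base/$arch/iso/FC5-$arch-disc$n.iso $base/$arch/os/disc$n iso9660 ro,loop 0 0"
    # Collect the entries, one per line
    fstab_lines="$fstab_lines$line
"
    n=$((n + 1))
done

# Review the output before appending it to /etc/fstab by hand
printf '%s' "$fstab_lines"
```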
The anaconda installer application automatically detects these folders and uses them properly. In addition, system configuration tools such as system-config-packages also continue to work properly when pointed at the parent of the ISO image mount points.
There are drawbacks to using CD ISO images in this fashion. For instance, no one directory contains the entire distribution of RPM packages. Soft links circumvent this problem, but your server security policies may not permit them. Fedora Core also comes in an ISO-format DVD image, which alleviates this problem. Users who do not have DVD burning hardware, however, cannot use this image to make discs for their own use.
You only need a single line in /etc/fstab for mounting the Fedora Core DVD ISO image. The entry looks like this:
/path/i386/iso/FC5-i386-DVD.iso /path/i386/os iso9660 ro,loop 0 0
You may omit almost any branch of the tree that you do not plan to use. Consider carefully the impact of excluding that folder. Branches you might trim from your mirror include:
Before you exclude an old version, ensure this does not adversely affect any of your users. These adverse effects can come in many forms. For example, the level of support for certain hardware sometimes changes between releases of Fedora Core. Users who cannot install a previous version may not be able to use Fedora Core. Your users might need to perform software-related tasks such as building packages for different Fedora Core releases. Always remain aware of the needs of your users during the planning stage.
If you do not have any x86-64 hosts to support, trimming these folders eliminates several gigabytes of extra files. If you support x86-64 hosts later, though, you must restore mirroring of these branches.
development folder (formerly "Rawhide"). This folder contains all the latest "bleeding-edge" packages from the Fedora Project. If you participate in active Fedora development, you should not trim this branch. Fedora development moves at a rapid pace and requires frequent updates to the latest development package versions. However, the frequent updates cause your mirror to download significant amounts of material during the regular update cycle.
test and testing folders. These branches contain updates that are being subjected to quality assurance through public testing, as well as the test or "pre-release" versions of the Fedora Core distribution. The test folder under the main core tree is where test versions of the distribution, such as Fedora Core 6 test2, are kept. (Users of Fedora Core test distributions are often directed to use the development branch to update packages.) The testing folder, under updates, contains package updates that have not yet passed the public testing phase.
debug folders. These folders contain packages that enable developers and skilled users to interpret data created when a program crashes or encounters a bug. If you participate actively in Fedora development, you should not trim these folders. If you trim this branch, you may still download individual packages as needed from a nearby public mirror site.
SRPMS folders (and links thereto). These folders contain the original source for all the binary RPM packages in the distribution. You may download these packages individually as needed to save space on your local mirror.
Unless your site closely manages workstation configuration, you
should probably not trim any of the updates
branches for the distributions you support. These locations
contain packages with bug fixes, security patches, and errata
updates that your users probably want.
Locate a public mirror site for Fedora Core by referring to the main project site's mirror page, http://fedora.redhat.com/Download/mirrors.html. Once you have selected a nearby mirror site, note what services it offers (FTP, HTTP, and/or rsync). A mirror usually serves a large number of users, so choose off-peak hours, when possible, to download a large set of files. Be aware of any time zone differences when estimating off-peak hours.
To download via HTTP or FTP, use either the
wget
or lftp
command. The wget
command recurses
subdirectories automatically and pulls down entire trees of
data with a single command. If you are not careful, however,
it is possible to pull down much more data than you
intended. The following commands mirror the entire current
Fedora Core distribution:
cd /var/www/mirror
wget --mirror -np -nH --cut-dirs=2 \
    http://mirror.example.com/pub/mirror/fedora/linux/core/5/
Note the options used above:
--mirror turns on recursion (descends into all subdirectories), and duplicates file timestamps;
-np prevents wget from ascending into the parent directory;
-nH prevents wget from writing a directory named after the host (in this case, mirror.example.com);
--cut-dirs=n truncates the first n directories in the path. In the example above, --cut-dirs=2 prevents wget from writing the /pub/mirror portion of the path into your mirror.
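To see where files end up before downloading anything, the effect of -nH and --cut-dirs can be imitated with plain string handling. This is only an illustration of the path arithmetic, not a call into wget; local_path is a hypothetical helper name.

```shell
#!/bin/sh
# local_path URL N: print the relative path that -nH --cut-dirs=N would
# leave under the current directory for the given URL.
local_path() {
    path="${1#*://}"     # drop the scheme (http://, ftp://)
    path="${path#*/}"    # drop the host name, as -nH does
    i=0
    while [ "$i" -lt "$2" ]; do
        path="${path#*/}"    # drop one leading directory, as --cut-dirs does
        i=$((i + 1))
    done
    printf '%s\n' "$path"
}

local_path http://mirror.example.com/pub/mirror/fedora/linux/core/5/ 2
# prints: fedora/linux/core/5/
```

Run from /var/www/mirror, that relative path lines up with the tree in Example 1.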
The same syntax works for both HTTP and FTP upstream
mirrors. It is possible that you may download some extraneous
files if the HTTP site formats its pages for browser
viewing. These files can be safely deleted, but return each
time the mirror updates unless you exclude them using special
options. See the wget
man pages for more
information.
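One way to keep browser-oriented index pages out of the mirror is the wget --reject option, which takes a comma-separated list of file name suffixes or patterns. The sketch below wraps the earlier command in a function so that nothing is downloaded until it is called; the function name mirror_core5 and the pattern index.html* are assumptions about what the upstream site generates.

```shell
#!/bin/sh
# Sketch: the earlier wget mirror command with generated index pages rejected.
# mirror_core5 is a hypothetical helper; it is defined here but not called.
mirror_core5() (
    # The function body runs in a subshell so the caller's directory is unchanged
    cd /var/www/mirror || exit 1
    wget --mirror -np -nH --cut-dirs=2 \
         --reject 'index.html*' \
         http://mirror.example.com/pub/mirror/fedora/linux/core/5/
)

echo "mirror_core5 is defined; call it during off-peak hours to start the download"
```

wget may still fetch an index page briefly in order to follow its links, and then discard it.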
The lftp command works like the wget command, and mirrors the content of an HTTP or FTP server. The wget command, however, does not delete old local files, while lftp can. Removing stale files is important for update repository mirrors to stay synchronized with upstream mirrors, where new files are created and old files are removed on a frequent basis.
The lftp command synchronizes files and directories from a remote host like rsync, but uses HTTP or FTP protocols. Use the following command to mirror the i386 branch of the Fedora Core 5 distribution into the current directory with lftp:
cd /var/www/mirror && \
lftp -c "open http://mirror.example.com/pub/mirror/linux/core/5/i386/ && \
mirror --delete --verbose"
The -c parameter executes a set of commands in an lftp process. Commands are separated with && to prevent the lftp command from executing if the cd command fails. The commands in the lftp command set work the same way. The command syntax A && B is shorthand for "if A returns success, run B." An explanation of the lftp commands follows:
open connects to the site and changes directory automatically.
mirror fetches all files and directories recursively in the current directory. The --delete option deletes all local files that are not in the remote directory. The --verbose option prints progress information on the screen and is optional.
The lftp command above maintains an exact copy of the directory for you. It downloads only new or changed files, and deletes only those that no longer exist on the upstream mirror.
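The A && B behaviour that guards the lftp call can be checked without touching the network. In this small sketch the directory name is a deliberately nonexistent, hypothetical path, so the cd fails and the second command never runs.

```shell
#!/bin/sh
# Sketch: a failed first command stops an A && B chain.
ran="no"
cd /nonexistent/mirror/dir 2>/dev/null && ran="yes"
echo "second command ran: $ran"
```

The same guard keeps a --delete mirror from running in the wrong directory when the cd fails.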
As with wget, it is possible you may download some unwanted files. The lftp command supports regular expressions for excluding files within a mirror command. The command below shows how to mirror a current Fedora Core distribution updates repository, excluding debug and repodata directories:
cd /var/www/mirror && \
lftp -c "set mirror:exclude-regex 'debug\/|repodata\/' && \
open http://mirror.example.com/pub/mirror/linux/core/updates/5/i386/ && \
mirror --delete --verbose"
Consult the lftp
man pages for more
details and usage options.
Using Proxy for HTTP or FTP retrieval | |
---|---|
If you are behind a proxy or firewall, you may need to use a
HTTP proxy to mirror files. To do this, export the
environment variables
|
Use the rsync
command to synchronize a set
of files and/or directories with a remote host. It operates in
much the same way as rcp
, but it is usually
faster. One reason for the speed is that
rsync
has a special protocol that evaluates
and skips files (or portions of files) that are already
downloaded.
Begin by identifying the modules available on the upstream
mirror site you have chosen. Note that the double colon "::"
is always used after the host name to separate it from the
rest of the rsync
path. The following
command generates a list of "modules" on the upstream mirror.
rsync mirror.example.org::
These modules are roughly equivalent to top-level directories,
and they follow the same rules. To list any subdirectory of
the upstream mirror, add the directory path to the command
above. For example, on many mirrors, the
fedora-linux-core
module is equivalent to
the fedora/linux/core
path found at the
Fedora Project main download server. To list the contents of the Fedora Core
5 distribution folder on the upstream server, issue the
following command. Do not forget the trailing slash "/".
Without it, you only receive a listing of a folder name that
matches the last component of the remote path.
rsync mirror.example.org::fedora-linux-core/5/
To download via rsync, add a destination path on your system to the end of the command line. The tree of files shown in the listing is downloaded to the local path you specify. Remember, if you leave off the trailing slash on the remote path, then the last component of that path is created as a folder inside the destination, and its contents are copied into that folder.
rsync filehouse.example.org::files/misc/ /var/www/misc/
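The trailing-slash rule is easy to try on local scratch directories before pointing rsync at a real mirror. This is a minimal sketch that assumes rsync is installed locally; the directory names are made up for the demonstration.

```shell
#!/bin/sh
# Sketch: compare a source path with and without a trailing slash.
work=$(mktemp -d)
mkdir -p "$work/src/misc"
echo "data" > "$work/src/misc/file.txt"

if command -v rsync >/dev/null 2>&1; then
    # Trailing slash: the contents of misc/ land directly in the destination
    rsync -a "$work/src/misc/" "$work/with-slash/"
    # No trailing slash: a misc/ folder is created inside the destination
    rsync -a "$work/src/misc" "$work/without-slash/"
    (cd "$work" && find with-slash without-slash -type f | sort)
else
    echo "rsync is not installed; skipping the demonstration" >&2
fi
```

With rsync available, the listing shows with-slash/file.txt and without-slash/misc/file.txt.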
When downloading using rsync
for mirror purposes,
use some of the command line switches to improve performance and
feedback. The switches -PHav
enable the following
rsync
features:
-P: recover partially-downloaded files, and show a progress meter
-H: preserve hard links
-a: recurse all directories, and preserve as much file information as possible, including timestamps, ownership, permissions, device files (if you are running as root), and soft links
-v: give verbose feedback to the screen
Remove the -v switch if you run this mirroring process as part of a script, or have no need to monitor progress. The following example mirrors the Fedora Core 5 distribution from an upstream site.
Example command downloads many gigabytes of files | |
---|---|
This command downloads many gigabytes of files, and is intended for use as an example only. Do not run this command if you do not understand the consequences. |
rsync -PHav mirror.example.org::fedora-linux-core/5/ /var/www/mirror/fedora/linux/core/5
The -n
switch performs a "dry run" using
the other given parameters. Use this switch to test any
rsync
command if you are unsure what files
you will receive. See also Possible data loss.
The -z
switch enables compression during the
rsync
process. The server compresses data before
transmission, and the client decompresses the data before writing it
to disk.
Compression using rsync | |
---|---|
The vast majority of the Fedora Core distribution consists of RPM files,
which are already compressed data. Therefore, additional compression
does not save time, and instead induces an unnecessary load on the
upstream mirror CPU. As a courtesy, do not use the
|
The next section features some additional switches that can be used to automatically trim branches from the tree of downloaded folders. With proper usage, they result in a mirror that is exactly as organized and full-featured as any high-volume public upstream site.
Possible data loss | |
---|---|
If you are not exceedingly careful in using these switches, it is possible to delete large portions of your mirrored data. Fixing this problem might require performing the copying steps outlined in Section 2.2, “Copying the Original Distribution” above. On the other hand, if you are also careless about your destination path, and you are running as root, you could put your entire system at risk. Know your environment before using these switches:
|
Use the --exclude
switch, along with a simple
pattern, to disallow download of certain files and/or folders. For
instance, --exclude "*.iso"
excludes the download
of any file whose name ends with the string ".iso".
Use the --delete switch to remove any file from the local system which does not have a match on the upstream mirror. This switch takes no pattern of its own. It prevents unwanted file debris from cropping up in your mirror. Files that match an --exclude pattern are left alone by --delete; add the --delete-excluded switch to retroactively trim branches of the tree which you no longer wish to maintain or download.
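Because deletion switches are risky, it helps to rehearse them on scratch directories first. The sketch below assumes a local rsync and uses made-up package names; it shows --delete removing a stale file while leaving an excluded branch alone, and --delete-excluded then trimming that branch.

```shell
#!/bin/sh
# Sketch: rehearse --delete and --delete-excluded on throwaway directories.
work=$(mktemp -d)
mkdir -p "$work/upstream/i386" "$work/local/i386" "$work/local/x86_64"
echo "new" > "$work/upstream/i386/pkg-2.rpm"
echo "old" > "$work/local/i386/pkg-1.rpm"      # no longer exists upstream
echo "big" > "$work/local/x86_64/pkg-1.rpm"    # branch we want to trim

kept_after_first="unknown"
if command -v rsync >/dev/null 2>&1; then
    # --delete removes the stale pkg-1.rpm, but the excluded branch is protected
    rsync -a --delete --exclude "x86_64/" "$work/upstream/" "$work/local/"
    if [ -d "$work/local/x86_64" ]; then kept_after_first="yes"; else kept_after_first="no"; fi

    # --delete-excluded also removes local files that match the exclusion
    rsync -a --delete --delete-excluded --exclude "x86_64/" \
        "$work/upstream/" "$work/local/"
    (cd "$work/local" && find . -type f | sort)
else
    echo "rsync is not installed; skipping the demonstration" >&2
fi
```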
Wildcards are permitted with rsync commands, including the asterisk *, question mark ?, and brackets [ ]. The question mark and brackets work as in the shell; the former matches any single character, while the brackets define a set of characters to be matched. Asterisks are especially powerful when combined with a portion of a file name. The double asterisk ** pattern matches any run of characters, including slashes; a single asterisk * matches any run of characters, but stops at a slash. Therefore, be judicious about using either. The double asterisk is very useful for mirroring a tree that includes multiple instances of directories and files that contain a pattern. A good example is mirroring several versions of Fedora Core, where certain folder names appear in every version.
Pattern matching wildcards | |
---|---|
Use double asterisks to trim out directories that repeat throughout
a mirrored tree. For example, when mirroring for a site that
only uses i386 architecture machines, you may trim all files and
folders marked for x86_64 architecture, using the switch
|
Process a long list of exclusions with the --exclude-from option. Follow the option with the name of a file that contains a list of patterns, one per line. There is no corresponding --delete-from option, because the --delete switch takes no patterns.
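A pattern file can be tried against a small local tree before it is used on a real mirror. In this sketch the file name exclude.txt, the directory layout, and the file names are all made up for the demonstration, and a local rsync is assumed.

```shell
#!/bin/sh
# Sketch: keep exclusion patterns in a file and pass it with --exclude-from.
work=$(mktemp -d)

# One pattern per line
cat > "$work/exclude.txt" <<'EOF'
**x86_64**
**debug**
*.iso
EOF

# A tiny stand-in for an upstream tree
mkdir -p "$work/upstream/5/i386/os" "$work/upstream/5/i386/iso" \
         "$work/upstream/5/x86_64/os"
echo "rpm" > "$work/upstream/5/i386/os/pkg.rpm"
echo "iso" > "$work/upstream/5/i386/iso/FC5-i386-disc1.iso"
echo "rpm" > "$work/upstream/5/x86_64/os/pkg.rpm"

if command -v rsync >/dev/null 2>&1; then
    rsync -a --exclude-from="$work/exclude.txt" "$work/upstream/" "$work/local/"
    # Only files that survived the exclusions are listed
    (cd "$work/local" && find . -type f | sort)
else
    echo "rsync is not installed; skipping the demonstration" >&2
fi
```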
These syntax hints only scratch the surface of
rsync
, but suffice to make your first mirror. Once
you have selected your site and formulated your excludes and deletes,
run your rsync
command with the
-n
option. Redirect output to a file so you can
examine the resulting list of files in the editor or pager of your
choice.
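The dry-run habit can be practised locally. This sketch assumes a local rsync and uses scratch directories and a made-up file name; it writes the would-be transfer list to a file and confirms that nothing was copied.

```shell
#!/bin/sh
# Sketch: capture the output of a dry run for later review.
work=$(mktemp -d)
mkdir -p "$work/upstream/i386" "$work/local"
echo "rpm" > "$work/upstream/i386/pkg.rpm"

if command -v rsync >/dev/null 2>&1; then
    # -n reports what would be transferred without writing anything
    rsync -avn "$work/upstream/" "$work/local/" > "$work/dry-run.txt"
    # Review the list with a pager or editor; grep stands in for that here
    grep 'pkg.rpm' "$work/dry-run.txt"
else
    echo "rsync is not installed; skipping the demonstration" >&2
fi
```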
The following example mirrors the entire Fedora Core 5 distribution,
with --exclude
options that avoid downloading:
Any information for x86_64 architecture;
Any yum
headers (see Section 3.4, “Configuring Repositories”);
Any debuginfo
packages; and,
CD or DVD images.
The -n
switch is included for testing purposes.
Backslashes at the ends of lines indicate this example is a single
command line.
rsync -Pan --delete --exclude "**x86_64**" --exclude "headers/" \
    --exclude "**debug**" --exclude "*.iso" \
    mirror.example.com::fedora-linux-core/5/ \
    /var/www/mirror/fedora/linux/core/5
Fedora mirrors are even more useful when they are more than just a snapshot of the distribution at release time. Most mirror administrators also choose to carry updates and errata packages. Repositories of updates or development trees change daily, and your mirror should reflect these changes.
rsync etiquette | |
---|---|
If you plan to do regular updates of your mirror that include large
amounts of data, you should ask permission from the administrator of
the upstream mirror. Downloading nightly package updates for the
official releases of Fedora Core 5 should not require notification, as
they are rarely more than a few megabytes. However, the
|
Once your rsync command is working as desired, you may want to place it in a nightly cron script. The cron system allows you to schedule regularly-occurring jobs on your system. The intervals are highly configurable, but a nightly run keeps your mirror synchronized with updates and errata. Make sure your nightly cron job follows some simple guidelines:
If your upstream mirror only synchronizes once or twice daily, run your job after the upstream mirror completes its update. This ensures your mirror not only gets the freshest material, but also does not interfere with the upstream server's bandwidth while it runs its job. If you do not know this time, it is usually safe to plan your downloads for pre-dawn hours.
Be sure you have sufficient disk space for additional packages. The
updates
tree in particular grows over time as
more errata packages are released.
Always test your script thoroughly before allowing it to run automatically. Use a -n or -v switch in the rsync command line for testing, and then remove it once you have completed testing. Remember that the results are e-mailed to your account on your system unless you specify differently. Read the crontab(5) man pages for additional information, with the command man 5 crontab.
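The guidelines above can be sketched as a small script. Everything named here is an assumption for illustration: the script path, the 04:30 schedule, the upstream module path, and the lock directory. The lock is an extra precaution, not something the guidelines require; it keeps a slow run from overlapping with the next night's job.

```shell
#!/bin/sh
# Sketch of a nightly sync script, e.g. saved as the hypothetical
# /usr/local/bin/fedora-mirror-sync and scheduled with a crontab line such as:
#   30 4 * * * /usr/local/bin/fedora-mirror-sync
UPSTREAM="${UPSTREAM:-mirror.example.org::fedora-linux-core/updates/5/}"
DEST="${DEST:-/var/www/mirror/fedora/linux/core/updates/5}"
LOCK_DIR="${LOCK_DIR:-/var/tmp/fedora-mirror-sync.lock}"

sync_mirror() {
    # mkdir either creates the lock or fails, so two runs cannot both proceed
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "previous sync still running; skipping this run" >&2
        return 0
    fi
    status=0
    # No -v or -P here: cron mails any output, so quiet runs send no mail
    rsync -a --delete "$UPSTREAM" "$DEST" || status=$?
    rmdir "$LOCK_DIR"
    return "$status"
}

# Run only when asked to, so that sourcing or testing this file is harmless
if [ "${RUN_SYNC:-no}" = "yes" ]; then
    sync_mirror
fi
```

For a real cron job, set RUN_SYNC=yes in the crontab entry or replace the final guard with a plain call to sync_mirror.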