Compiling a Kernel (201.2)

Candidates should be able to properly configure a kernel to include or disable specific features of the Linux kernel as necessary. This objective includes compiling and recompiling the Linux kernel as needed, updating and noting changes in a new kernel, creating an initrd image and installing new kernels.

Key files, terms and utilities include:

Getting the kernel sources

The kernel sources for the latest Linux kernel can be found at The Linux Kernel Archives.

The generic filename is always in the form linux-kernel-version.tar.gz or linux-kernel-version.tar.bz2. For example, linux-2.6.28.9.tar.gz is the kernel archive for version 2.6.28.9. Also see the paragraph on Kernel Versioning above.

Since the release of 2.6.0 in Dec 2003 the numbering scheme is in the form of A.B.C.D, where:

  • A denotes the kernel version. It's only changed when major changes in the code and concept occur.

  • B denotes the revision. Prior to linux 2.6 the even-odd numbering scheme was used here.

  • C is the minor revision of the kernel, incremented with each new release in the A.B series

  • D counts the bug and security fixes for the particular C release

This scheme is used (amongst others) to allow for backporting of patches and security fixes. If, for example, a fix was implemented in kernel 2.6.31.4 and this fix is then backported so it can be applied to kernel 2.6.23.7, the resulting kernel would not be numbered 2.6.31.5 or 2.6.32, for obvious reasons. Instead, the resulting kernel that includes that fix would be numbered 2.6.23.8.
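
The four fields can be pulled out of an archive name with plain shell parameter expansion. The following is a minimal sketch; the function name parse_kernel_version and the A/B/C/D output labels are illustrative choices and not part of any kernel tooling.

```shell
#!/bin/sh
# Split a kernel archive name such as linux-2.6.28.9.tar.gz into the
# A.B.C.D fields described above. parse_kernel_version is a made-up name.
parse_kernel_version() {
    version=${1#linux-}          # strip the leading "linux-"
    version=${version%.tar.*}    # strip the trailing ".tar.gz" or ".tar.bz2"
    old_ifs=$IFS
    IFS=.
    set -- $version              # split on dots into $1..$4
    IFS=$old_ifs
    echo "A=$1 B=$2 C=$3 D=${4:-none}"
}

parse_kernel_version linux-2.6.28.9.tar.gz
```

For the archive name used in this section the function prints A=2 B=6 C=28 D=9. A name without a fourth field reports D=none.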

A common location to store and unpack kernel sources is /usr/src.

If there is not enough free space on /usr/src to unpack the sources, it is also possible to unpack the source in a different directory. Creating a symbolic link after unpacking from /usr/src/linux to the linux subdirectory in that location ensures easy access to the source.

The source code for the kernel is available as a compressed tar archive, compressed using either gzip (.gz extension) or bzip2 (.bz2 extension). The archive can be decompressed using gunzip or bunzip2, followed by unpacking the resulting archive with tar, or directly with tar using the z (.gz) or j (.bz2) options. Examples:

# gunzip linux-2.6.28.9.tar.gz
# tar xf linux-2.6.28.9.tar

And in a single step:

# tar xzf linux-2.6.28.9.tar.gz

Or for a bzip2 compressed archive:

# bunzip2 linux-2.6.28.9.tar.bz2
# tar xf linux-2.6.28.9.tar

And once more in a single step:

# tar xjf linux-2.6.28.9.tar.bz2

See the manpages for tar, gzip and bzip2 for more information.
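
The single-step unpack and the symbolic link described earlier can be tried safely on a throwaway archive. This sketch builds a tiny stand-in for linux-2.6.28.9.tar.gz in a temporary directory; the layout under $work is an assumption made for the demonstration, with $work/src standing in for /usr/src.

```shell
#!/bin/sh
# Build a small stand-in archive, then unpack it the same way a real
# kernel archive would be unpacked. $work and its subdirectories are
# assumptions for this demonstration only.
work=$(mktemp -d)
mkdir -p "$work/pack/linux-2.6.28.9" "$work/src"
echo "# placeholder" > "$work/pack/linux-2.6.28.9/Makefile"
tar czf "$work/linux-2.6.28.9.tar.gz" -C "$work/pack" linux-2.6.28.9

# Unpack in a single step (z = gzip) into the target directory.
tar xzf "$work/linux-2.6.28.9.tar.gz" -C "$work/src"

# Create the conventional "linux" symbolic link next to the unpacked tree.
ln -s linux-2.6.28.9 "$work/src/linux"

ls -l "$work/src"
```

Using a relative link target keeps the link valid even if the parent directory is later moved.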

Cleaning the kernel

Before configuring or building the kernel it's often a good idea to make sure the kernel is clean. If a kernel has been compiled before with the source tree you will be using, object files and other files will have been created. The make utility will try to work as efficiently as possible and skip files that seem to be 'current'. In some cases this may cause problems with subsequent configuration and/or builds of the kernel. Cleaning is done on three levels:

make clean

Deletes most generated files, but leaves enough to build external modules.

make mrproper

Deletes the current configuration and all generated files.

make distclean

Removes everything make mrproper removes, plus editor backup files, patch leftover files and the like.

Running make mrproper before configuring and building a kernel is generally a good idea.

Note

Keep in mind that make mrproper deletes the current configuration as well!

Creating a .config file

The first step in compiling a kernel is setting up the kernel configuration. The configuration information is saved in the .config file. There are well over 500 options for the kernel, which refer to (among many others) filesystem, SCSI and networking support. Most of the options enable kernel features that will either be compiled directly into the kernel or compiled as a module. Some selections imply a group of other selections. For example, when you indicate that you wish to include SCSI support, additional options become available for specific SCSI drivers and features.

Some of the kernel support options must be compiled as a module, some can only be compiled into the kernel and for some options you will be able to select if you want them to be compiled as a module or directly into the kernel.

The results of all of these choices are stored in the kernel configuration file /usr/src/linux/.config. This file is plain text and lists all the options as shell variables.

Example 1.1. Sample .config content

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.28
# Sat Feb  6 18:16:23 2010
#
CONFIG_64BIT=y
# CONFIG_X86_32 is not set
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_FAST_CMPXCHG_LOCAL=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
...
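
Because .config is plain text, the state of a single option can be read with standard tools without starting a configuration front end. A minimal sketch follows; the helper name config_state and the temporary sample file are assumptions for illustration.

```shell
#!/bin/sh
# Report how an option is set in a kernel .config file.
# Usage: config_state FILE OPTION   (config_state is a hypothetical helper)
config_state() {
    if grep -q "^$2=" "$1"; then
        # Option is set: print everything after the first "=".
        grep "^$2=" "$1" | cut -d= -f2-
    elif grep -q "^# $2 is not set\$" "$1"; then
        echo "not set"
    else
        echo "unknown"
    fi
}

# Demonstrate on a few lines taken from the sample above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
CONFIG_64BIT=y
# CONFIG_X86_32 is not set
CONFIG_X86_64=y
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
EOF

config_state "$sample" CONFIG_64BIT
config_state "$sample" CONFIG_X86_32
config_state "$sample" CONFIG_ARCH_DEFCONFIG
```

The helper only reads the file, so it does not conflict with the advice against editing .config by hand.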

To begin with creating a .config, set the current working directory to the top of the source tree:

# cd /usr/src/linux

There are several ways to set up .config. Although it is possible, you should not edit the file manually (as stated explicitly at the top of the file). Instead, select one of the three interactive approaches. An additional option is available to construct a default configuration. Each approach is started using make.

make config

Running make config is the most rudimentary approach and has very clear advantages and disadvantages.

The advantage:

  • The single, largest advantage is that it does not depend on full-screen display capabilities of your terminal. This means it's usable on extremely slow links, or on systems with very limited display capabilities.

The disadvantages:

  • The system presents a sequence of questions concerning kernel options which can get tedious.
  • Every question must be answered before being able to save the .config file and exit.
  • Because you cannot move back and forward through the various questions, mistakes cannot be corrected on the fly. This will require restarting the whole procedure.

An example session looks like this:

# make config
  HOSTCC  scripts/basic/fixdep
  HOSTCC  scripts/basic/docproc
  HOSTCC  scripts/basic/hash
  HOSTCC  scripts/kconfig/conf.o
scripts/kconfig/conf.c: In function 'conf_askvalue':
scripts/kconfig/conf.c:104: warning: ignoring return value of 'fgets', \
    declared with attribute warn_unused_result
scripts/kconfig/conf.c: In function 'conf_choice':
scripts/kconfig/conf.c:306: warning: ignoring return value of 'fgets', \
    declared with attribute warn_unused_result
  HOSTCC  scripts/kconfig/kxgettext.o
  HOSTCC  scripts/kconfig/zconf.tab.o
In file included from scripts/kconfig/zconf.tab.c:2486:
scripts/kconfig/confdata.c: In function 'conf_write':
scripts/kconfig/confdata.c:501: warning: ignoring return value of 'fwrite', \
    declared with attribute warn_unused_result
scripts/kconfig/confdata.c: In function 'conf_write_autoconf':
scripts/kconfig/confdata.c:739: warning: ignoring return value of 'fwrite', \
    declared with attribute warn_unused_result
scripts/kconfig/confdata.c:740: warning: ignoring return value of 'fwrite', \
    declared with attribute warn_unused_result
In file included from scripts/kconfig/zconf.tab.c:2487:
scripts/kconfig/expr.c: In function 'expr_print_file_helper':
scripts/kconfig/expr.c:1090: warning: ignoring return value of 'fwrite', \
    declared with attribute warn_unused_result
  HOSTLD  scripts/kconfig/conf
scripts/kconfig/conf arch/x86/Kconfig
*
* Linux Kernel Configuration
*
*
* General setup
*
Prompt for development and/or incomplete code/drivers (EXPERIMENTAL) [Y/n/?]

make menuconfig

The make menuconfig method is more intuitive and can be used as an alternative to make config. It creates a text-mode-windowed environment based on the ncurses libraries. It allows you to use arrow and other keys to configure the various kernel options. The sections are laid out in a menu-like structure, which is easy to navigate. The make menuconfig command is illustrated below:

The make menuconfig menu display.

At the toplevel of this menu structure, you can use the arrow keys to select the Exit option at the bottom of the screen. If you made any changes to the configuration, you will be presented with the question if you would like to save the new configuration to the .config file. The last item in the menu at the toplevel of this menu structure allows you to save the configuration to an alternative location and/or filename which can be specified.

Note

If you select to save the configuration to an alternative file or location, you will need to move the file manually to the /usr/src/linux directory if you want it to be used when compiling the kernel!

make xconfig and gconfig

The make xconfig command presents a GUI menu to configure the kernel. It requires a working environment of the X Window System and the QT development libraries to display and will provide you with a menu which can be navigated using a mouse. As an alternative, with the same functionality, but the Gnome look-and-feel, make gconfig can be used. The Gnome alternative requires the GTK+ 2.x development libraries to be available on the system. The figure below illustrates the top-level make xconfig window.

The make xconfig top-level window.

The make gconfig looks slightly different but offers the same functionality:

The make gconfig top-level window.

make oldconfig

The make oldconfig command creates a new .config file, using the options set in the old .config and the options found in the source. It only requires user interaction for options that were not configured previously (new options), for instance after the addition of new functionality or an upgrade to a later kernel release.

When using the .config from a previous kernel release, first copy the .config from the previous kernel-version to the /usr/src/linux/ directory and then run make oldconfig. The .config will be moved to .config.old and a new .config is created. You will only be prompted for the answers to new questions; the questions for which no answer can be found in the .config.old file.

Note

Be sure to make a backup of .config before upgrading the kernel source, because the distribution might contain a default .config file, overwriting your old file.
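
The copy-then-oldconfig procedure, including the backup, can be wrapped in a small helper. This is a sketch under stated assumptions: the function name carry_over_config is invented here, and the MAKE variable exists only so that the helper can be exercised without a real kernel tree.

```shell
#!/bin/sh
# Copy the .config of a previous kernel tree into a new tree and run
# "make oldconfig" there. A backup of the old file is kept alongside it.
# carry_over_config and the MAKE override are illustrative assumptions.
carry_over_config() {
    old_tree=$1
    new_tree=$2
    if [ ! -f "$old_tree/.config" ]; then
        echo "no .config in $old_tree" >&2
        return 1
    fi
    cp "$old_tree/.config" "$old_tree/.config.backup" || return 1
    cp "$old_tree/.config" "$new_tree/.config" || return 1
    # Prompts only for options that are new in the source of $new_tree.
    ${MAKE:-make} -C "$new_tree" oldconfig
}
```

A call could look like carry_over_config /usr/src/linux-2.6.28.9 /usr/src/linux-2.6.31, where both directory names are hypothetical.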

Note

make xconfig, make gconfig and make menuconfig will start with reconstructing the .config file, using the current .config and (new default) options found in the source code. As a result .config will contain at least all the previously set and the new default options after writing the new .config file and exiting the session without any manual changes.

Compiling the kernel

The following sequence of make commands leads to the building of the kernel and to the building and installation of the modules.

  1. make clean
  2. make zImage/bzImage
  3. make modules
  4. make modules_install

make clean

The clean target removes old output files that may exist from previous kernel builds. These include core files, system map files and others.

make zImage/bzImage

The zImage and bzImage targets both effectively build the kernel. The difference between these two is explained in another paragraph.

After compilation the kernel image can be found in the /usr/src/linux/arch/i386/boot directory (on i386 systems).

make modules

The modules target builds the modules: the device drivers and other items that were configured as modules.

make modules_install

The modules_install target installs all previously built modules under /lib/modules/kernel-version. The kernel-version directory will be created if it does not exist.
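
The four steps can be chained so that a failure stops the sequence. The sketch below only prints the commands by default; the run wrapper, the DRY_RUN variable and the build_kernel name are assumptions made for this example, and bzImage is chosen over zImage arbitrarily.

```shell
#!/bin/sh
# Run the build steps in order, stopping at the first failure.
# With DRY_RUN=1 (the default here) the commands are only printed.
DRY_RUN=${DRY_RUN:-1}

run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

build_kernel() {
    tree=$1
    run make -C "$tree" clean &&
    run make -C "$tree" bzImage &&
    run make -C "$tree" modules &&
    run make -C "$tree" modules_install
}

build_kernel /usr/src/linux
```

Setting DRY_RUN=0 executes the commands for real; the modules_install step writes under /lib/modules and therefore normally requires root privileges.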

Installing the new kernel

After the new kernel has been compiled, the system can be configured to boot it.

The first step is to put a copy of the new bzImage on the boot partition. The name of the kernel file should preferably contain the kernel-version number, for example: vmlinuz-2.6.31:

# cp /usr/src/linux/arch/x86_64/boot/bzImage /boot/vmlinuz-2.6.31

The currently available kernel versions can be found in the directory /boot/ as shown below:

# ls -l /boot
total 7088
-rw-r--r-- 1 root root 1572895 Apr 20 20:46 System.map-genkernel-x86_64-2.6.31-gentoo-r10
lrwxrwxrwx 1 root root       1 Apr 20 20:26 boot -> .
drwxr-xr-x 2 root root    4096 Apr 21 12:12 grub
-rw-r--r-- 1 root root 2886832 Apr 20 20:54 initramfs-genkernel-x86_64-2.6.31-gentoo-r10
-rw-r--r-- 1 root root 2759232 Apr 20 20:46 kernel-genkernel-x86_64-2.6.31-gentoo-r10
drwx------ 2 root root   16384 Apr 20 20:23 lost+found

After moving the kernel file to the correct location, you will need to configure the bootmanager to contain the new kernel.

GRUB bootmanager

The GRUB bootloader is described in the section called “GRUB explained”. Both version-specific and fixed kernel image names can be used with GRUB.

An example kernel configuration entry in /boot/grub/menu.lst (or /boot/grub/grub.conf) is provided:

title GNU/Linux, kernel 2.6.8
root (hd0,0)
kernel /boot/vmlinuz-2.6.8 root=/dev/hda1 ro
initrd /boot/initrd.img-2.6.8
savedefault
boot

For more specific information on GRUB, please refer to the section called “GRUB explained”.
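
A stanza like the one above can be generated from the kernel version and the root device, which helps keep the kernel and initrd names consistent. This is a sketch assuming the GRUB syntax used in this section; grub_entry is a hypothetical helper, the (hd0,0) partition is hard-coded for brevity, and the initrd naming pattern is copied from the example.

```shell
#!/bin/sh
# Print a menu.lst stanza for a given kernel version.
# Usage: grub_entry VERSION ROOT_DEVICE   (grub_entry is a made-up name)
grub_entry() {
    cat <<EOF
title GNU/Linux, kernel $1
root (hd0,0)
kernel /boot/vmlinuz-$1 root=$2 ro
initrd /boot/initrd.img-$1
EOF
}

grub_entry 2.6.31 /dev/hda1
```

The output can be reviewed and then appended to the menu file by hand; the helper itself does not modify any file.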

The initial ram disk (initrd)

In the paragraphs above you have learned how to select modules for your kernel and compile the kernel. Now consider the following: you have a regular PC with a SATA drive attached that will be used to boot from. That SATA drive holds the bootloader (GRUB), the kernel and all the kernel modules. The bootloader correctly manages to find and start the kernel. However, in the configuration of the kernel you chose the hardware support modules for your SATA controller to be compiled as a loadable module. This means that, after loading the kernel, your system does not yet have any knowledge of how to access the hardware that the kernel module for your SATA drive is stored upon. One solution is to go back and build the kernel with the module compiled into it, but there are many reasons why this may not be desirable. To solve this problem, the initrd file was created. The name is short for initial ram disk.

When you look at the configuration option in the kernel:

config BLK_DEV_INITRD
bool "Initial RAM filesystem and RAM disk (initramfs/initrd) support"
depends on BROKEN || !FRV
help
  The initial RAM filesystem is a ramfs which is loaded by the
  boot loader (loadlin or lilo) and that is mounted as root
  before the normal boot procedure. It is typically used to
  load modules needed to mount the "real" root file system,
  etc. See <file:Documentation/initrd.txt> for details.

  If RAM disk support (BLK_DEV_RAM) is also included, this
  also enables initial RAM disk (initrd) support and adds
  15 Kbytes (more on some other architectures) to the kernel size.

  If unsure say Y.

The bootloader loads a ramdisk which can then be mounted as the root filesystem. Once it's mounted as the root volume, programs can be run from it and kernel modules loaded from it. After this step a new root filesystem can be mounted from a different device. The previous root (from initrd) is then either moved to the directory /initrd or it is unmounted.

The bootprocess

When using an initrd, the system goes through the following steps while booting:

  1. The boot loader loads the kernel and the initial RAM disk

  2. The kernel converts initrd into a normal RAM disk and frees the memory used by the initrd image.

  3. The initrd image is mounted read-write as root

  4. The linuxrc is executed (this can be any valid executable, including shell scripts; it is run with uid 0 and can do everything init can do)
  5. After linuxrc terminates, the real root filesystem is mounted

  6. If a directory /initrd exists, the initrd image is moved there, otherwise, initrd image is unmounted

  7. The usual boot sequence (e.g. invocation of the /sbin/init) is performed on the root filesystem

Moving the initrd from / to /initrd does not require unmounting it. This means it's possible to leave the processes that use files on that volume running and the filesystems mounted during the move. If /initrd doesn't exist, initrd will remain mounted if it is in use by any running processes. This also means it will stay in memory.

There is a caveat because of this behaviour: any filesystems mounted under initrd remain accessible, but the entries in /proc/mounts will not be updated. Another thing to keep in mind is that if /initrd doesn't exist, the initrd cannot be unmounted. It will disappear during the boot process, and any filesystems mounted in it will disappear with it, which prevents them from being re-mounted. It is strongly recommended to unmount all filesystems before switching from the initrd filesystem to the normal root filesystem. This includes, for instance, the /proc filesystem.

The memory used for the initrd image can be reclaimed. To do this, you have to use the command freeramdisk after unmounting /initrd.

Including initrd support in the kernel adds the following boot command line options:

initrd=
This option loads the file that's specified as the initial RAM disk.
noinitrd
This option causes the initrd data to be preserved, but it is not converted to a RAM disk and the normal root filesystem is mounted instead. The initrd data can be read from /dev/initrd. If read through /dev/initrd, the data can have any structure, so it is not necessarily a filesystem image. This option is used mainly for debugging purposes.

Note

The /dev/initrd is read-only and it can be used only once. As soon as the last process has closed it, all memory is freed and /dev/initrd can't be accessed any longer.
root=/dev/ram
The initrd is mounted as root and subsequently the /linuxrc is started. If there's no /linuxrc the normal boot procedure will be followed. Without using this parameter, the initrd would be moved or unloaded, however in this case, the root filesystem will continue to be the RAM disk. The advantage of this is that it allows the use of a compressed file system and it's slightly faster.
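
Options of the key=value kind can be picked out of a kernel command line with a few lines of shell. The sketch below parses a command line string; cmdline_value is a hypothetical helper and the sample string is invented for the demonstration.

```shell
#!/bin/sh
# Print the value of a key=value option from a kernel command line.
# Usage: cmdline_value KEY "COMMAND LINE"; returns 1 if KEY is absent.
cmdline_value() {
    key=$1
    for word in $2; do
        case $word in
            "$key"=*) echo "${word#*=}"; return 0 ;;
        esac
    done
    return 1
}

sample="root=/dev/ram initrd=/boot/initrd ro"
cmdline_value root "$sample"
cmdline_value initrd "$sample"
```

On a running system the string to inspect would come from /proc/cmdline, for example cmdline_value root "$(cat /proc/cmdline)".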

Manual initrd creation

To be able to start with creating the initrd manually, you will need to prepare the normal root file system. First, check if the /dev/initrd device node exists. If not, you can manually create it:

# mknod /dev/initrd b 1 250
# chmod 400 /dev/initrd

Next, the directory where the initrd image will be mounted needs to be created:

# mkdir /initrd

If the root filesystem was created during the boot procedure (for instance when you are creating an install floppy), the root filesystem creation procedure should perform all these actions.

Note

Neither /dev/initrd nor /initrd is strictly required for the correct operation of initrd at this stage, but it's more convenient to experiment with it if they are available. The /initrd mount point can also be used to pass data between the image and the real file system.

To be able to work with the initrd images, the kernel has to be compiled with support for the RAM disk and the initial RAM disk enabled. Also, all modules and components to execute the programs that may be stored on the initrd (for instance the executable format and the filesystem type that is used in the image) must be compiled into the kernel.

The next step is to actually create the RAM disk image. You create a filesystem on a block device and then copy the files to that filesystem as needed. Suitable block devices to be used for this purpose are:

  1. A floppy disk (very slow, and newer machines may lack a floppy drive)
  2. A RAM disk (fast, but allocates physical memory)
  3. A loopback device (the most elegant solution, which allocates disk space)

In the rest of this example we'll use the RAM disk method, so we will need to make sure a RAM disk device node is present (there may be more than one):

 
# ls -la /dev/ram*
brw-rw---- 1 root disk 1,  0 Feb 13 00:18 /dev/ram0
brw-rw---- 1 root disk 1,  1 Feb 13 00:18 /dev/ram1
brw-rw---- 1 root disk 1, 10 Feb 13 00:18 /dev/ram10
brw-rw---- 1 root disk 1, 11 Feb 13 00:18 /dev/ram11
brw-rw---- 1 root disk 1, 12 Feb 13 00:18 /dev/ram12
brw-rw---- 1 root disk 1, 13 Feb 13 00:18 /dev/ram13
brw-rw---- 1 root disk 1, 14 Feb 13 00:18 /dev/ram14
brw-rw---- 1 root disk 1, 15 Feb 13 00:18 /dev/ram15
brw-rw---- 1 root disk 1,  2 Feb 13 00:18 /dev/ram2
brw-rw---- 1 root disk 1,  3 Feb 13 00:18 /dev/ram3
brw-rw---- 1 root disk 1,  4 Feb 13 00:18 /dev/ram4
brw-rw---- 1 root disk 1,  5 Feb 13 00:18 /dev/ram5
brw-rw---- 1 root disk 1,  6 Feb 13 00:18 /dev/ram6
brw-rw---- 1 root disk 1,  7 Feb 13 00:18 /dev/ram7
brw-rw---- 1 root disk 1,  8 Feb 13 00:18 /dev/ram8
brw-rw---- 1 root disk 1,  9 Feb 13 00:18 /dev/ram9

On this system there are 16 RAM disk device nodes.

Note

The number of RAM disks that is available by default on a system is an option in the kernel configuration: CONFIG_BLK_DEV_RAM_COUNT.

Next an empty filesystem needs to be created of the appropriate size:

# mke2fs -m0 /dev/ram0 300

Note

If space is critical, you may want to use a filesystem which is more efficient with space, such as the Minix FS.

After having created the filesystem, you need to mount it on the appropriate directory:

# mount -t ext2 /dev/ram0 /mnt

Now the console device node needs to be created. There is obviously already a console device node on your system in /dev, but this will be the device node that will be used when the initrd is active.

# mkdir /mnt/dev
# mknod /mnt/dev/tty1 c 4 1

Copy all the needed files to the image. Do not forget the most important file of all: /linuxrc (which, while we set this up, will actually be in the /mnt directory). Make sure this file is given execute permissions. If you wish to experiment, you can make a symbolic link from /linuxrc to /bin/sh:

# ln -s /bin/sh /mnt/linuxrc

After you have completed copying the files and have made sure that the /linuxrc has the correct attributes, you can unmount the RAM disk image:

# umount /dev/ram0

The RAM disk image can then be copied to a file:

# dd if=/dev/ram0 bs=1k count=300 of=/boot/initrd

Finally, if you have no more use for the RAM disk and you wish to reclaim the memory, deallocate the RAM disk:

# freeramdisk /dev/ram0

To test the newly created initrd, add a new section to your GRUB menu file, which refers to the initrd image you've just created:

title=initrd test entry
root (hd0,0)
kernel /boot/vmlinuz-2.6.28
initrd /boot/initrd

If you have followed the steps above and have rebooted using this test entry from the bootloader menu, the system will continue to boot. After a few seconds you should find yourself at a command prompt, since /linuxrc refers to /bin/sh, a shell.
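
The same procedure can also be carried out with the loopback method from the list of suitable block devices, which builds the image in a regular file instead of a RAM disk. The following is a dry-run sketch that only prints the commands unless DRY_RUN is set to 0; the make_initrd name, the run wrapper and the /mnt mount point are assumptions for this example, and the real commands need root privileges. Copying the files needed by /linuxrc into /mnt is left out and would go between the mount and umount steps.

```shell
#!/bin/sh
# Create an ext2 initrd image in a regular file via a loopback mount.
# With DRY_RUN=1 (the default here) the commands are only printed.
DRY_RUN=${DRY_RUN:-1}

run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

# Usage: make_initrd IMAGE SIZE_IN_KB
make_initrd() {
    image=$1
    size_kb=$2
    run dd if=/dev/zero of="$image" bs=1k count="$size_kb" &&
    run mke2fs -F -m0 "$image" &&
    run mount -t ext2 -o loop "$image" /mnt &&
    run mkdir /mnt/dev &&
    run mknod /mnt/dev/tty1 c 4 1 &&
    run ln -s /bin/sh /mnt/linuxrc &&
    run umount /mnt
}

make_initrd /boot/initrd 300
```

With this approach the image file is ready to use after the umount, so the dd copy from /dev/ram0 and the freeramdisk step are not needed.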

Create initrd using mkinitrd

mkinitrd is a tool which is specific to RPM-based distributions (such as Red Hat, SuSE, etc.). It automates the process of creating an initrd file, thereby making sure that the relatively complex process is followed correctly.

In most of the larger Linux distributions the initrd contains virtually all kernel modules and very few will be compiled directly into the kernel. This enables the deployment of easy fixes and patches to the kernel and its modules through RPM packages: an update of a single module will not require a recompilation or replacement of the whole kernel, but just the single module, or in the worst case a few dependent modules. Because these modules are contained in the initrd file, this file needs to be regenerated every time the kernel is (manually) recompiled, or a kernel (module) patch is installed. Generating a new initrd image is very simple:

# mkinitrd initrd-image kernel-version

Useful options for mkinitrd include:

--version
This option displays the version number of the mkinitrd utility.
-f
By specifying this switch, the utility will overwrite any existing image file by the same name.
--builtin=
This causes mkinitrd to assume the module specified was compiled into the kernel. It will not look for the module and will not show any errors if it doesn't exist.
--omit-lvm-modules, --omit-raid-modules, --omit-scsi-modules
Using these options it is possible to prevent inclusion of, respectively, LVM, RAID or SCSI modules, even if they are present, or if the utility would normally include them based on the contents of /etc/fstab and/or /etc/raidtab.

Create initrd using mkinitramfs

Dissatisfied with the tool the RPM-based distributions use (mkinitrd), some Debian developers wrote another utility to generate an initrd file. This tool is called mkinitramfs. The mkinitramfs tool is a shell script which generates a gzipped cpio image. It was designed to be much simpler (in code as well as in usage) than mkinitrd. The script consists of around 380 lines of code.

Configuration of mkinitramfs is done through a configuration file: initramfs.conf. This file is usually located in /etc/initramfs-tools/initramfs.conf. The configuration file is sourced by the script, so all of its contents are in standard bash format. Comments are prefixed by a #. Variables are specified by:

variable=value

Options that can be used with mkinitramfs include:

-d confdir
This option sets an alternate configuration directory.
-k
Keep the temporary directory used for creating the image.
-o outfile
Write the resulting image to outfile.
-r root
Override the ROOT setting in the initramfs.conf file.

Note

In the Debian(-based) distributions, for 2.6 kernels you should always use mkinitramfs for creating an initrd image. The mkinitrd utility does not have proper support for various newer features of the 2.6 kernels and will break due to lack of drivers which are no longer available in that kernel but are expected nonetheless.
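
Which tool applies depends on the distribution, so a script can probe for the one that is installed. A minimal sketch that prints the command to run rather than running it; initrd_command is a hypothetical helper, and the image file names are assumptions chosen for the example rather than fixed rules.

```shell
#!/bin/sh
# Print a suitable initrd generation command for a kernel version,
# depending on which tool is found in PATH. Nothing is executed.
initrd_command() {
    version=$1
    if command -v mkinitramfs >/dev/null 2>&1; then
        echo "mkinitramfs -o /boot/initrd.img-$version $version"
    elif command -v mkinitrd >/dev/null 2>&1; then
        echo "mkinitrd -f /boot/initrd-$version.img $version"
    else
        echo "no initrd tool found" >&2
        return 1
    fi
}

initrd_command 2.6.31 || true
```

mkinitramfs is checked first so that Debian-based systems prefer it, in line with the note above.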

Setting the root device

The standard settings in the kernel are used by default. These defaults can come from or be overridden by the following sources:

  • the defaults as they were compiled in,
  • the defaults as set with the rdev command,
  • the value as passed to the kernel at boot time through the root=/dev/xyz, or
  • the value as specified on the kernel line in the GRUB configuration file

Besides using the more obvious blockdevices (harddisks, SAN storage, CDs or DVDs) as the root device, it is also possible to use initrd with an NFS-mounted root. This requires the use of the nfs_root_name and nfs_root_addrs boot options.

Aside from using the above mentioned options to set or change the root device, it is possible to change it from within the initrd environment. In order to do so, the /proc has to be mounted by the scripts in the initrd image. If this is the case, the following files are available:

/proc/sys/kernel/real-root-dev
/proc/sys/kernel/nfs-root-name
/proc/sys/kernel/nfs-root-addrs

The real-root-dev refers to the nodenumber of the root file system device. It can be easily changed by writing the new number to it:

# echo 0x301>/proc/sys/kernel/real-root-dev

This will change the real root to the filesystem on /dev/hda1. If you wish to use an NFS-mounted root, the files nfs-root-name and nfs-root-addrs have to be set using the appropriate values and the device number should be set to 0xff:

# echo /var/nfsroot >/proc/sys/kernel/nfs-root-name
# echo 193.8.232.2:193.8.232.7::255.255.255.0:idefix \
     >/proc/sys/kernel/nfs-root-addrs
# echo 255>/proc/sys/kernel/real-root-dev
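
The numbers written to real-root-dev combine the major and minor device numbers in the classic 16-bit layout, major * 256 + minor. As a worked check, /dev/hda1 has major 3 and minor 1, which gives 0x301, and 0xff is simply 255. The helper name root_devnum below is made up for this sketch.

```shell
#!/bin/sh
# Encode a major/minor pair the way real-root-dev expects it
# (old 16-bit device number: major * 256 + minor).
# Usage: root_devnum MAJOR MINOR   (root_devnum is a hypothetical helper)
root_devnum() {
    printf '0x%x\n' $(( $1 * 256 + $2 ))
}

root_devnum 3 1     # /dev/hda1
root_devnum 0 255   # the value used for an NFS-mounted root
```

The two calls print 0x301 and 0xff, matching the values used in the examples above.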

Note

If the root device is set to the RAM disk, the root filesystem is not moved to /initrd, but the boot procedure is simply continued by starting init on the initial RAM disk.
Copyright Snow B.V. The Netherlands