Fast boot has become an important product requirement. Standard Linux distributions are designed for general-purpose desktop use and are not optimized for a product that must bring up a single critical application quickly. However, thanks to the modular design and open source nature of Linux, its boot time can be optimized. This article describes techniques for reducing Linux boot time. Rather than going deep into the technical details of Linux, it identifies the areas where time can be saved. It also explains how these techniques were used to boot Linux on the eSOMiMX6 module within 1 second.

2. General Principles

The following principles should be fully understood before starting boot time optimization.
- Choose the correct method to measure the timings of each software module.
- Understand the boot sequence of the platform.
- Always address the software module that takes the longest time first. Starting with minor items can waste effort.
- A large part of boot time is spent loading images from storage into RAM. Reading less means loading faster.
- Start the critical application as early as possible. Before that point, initialize only the modules it requires.

3. Measuring Boot Time

There are several ways to measure the load time of the various stages of the OS boot. One option is to use a serial grabber, which timestamps the debug messages arriving on the serial console; the load time of each module can then be determined from the timestamps of successive messages. This is the easiest method because it requires no external instrument. Another option is to toggle a GPIO and measure the time with an oscilloscope (CRO). Although this requires an external oscilloscope and some source code changes to toggle the GPIO, it gives the most precise measurement. My recommendation is to use the GPIO and oscilloscope method to measure boot time.
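As a rough illustration of the serial-log approach, the sketch below parses timestamped console lines and ranks the gaps between consecutive messages, so that the slowest step surfaces first. It assumes each line starts with a kernel-style [seconds.microseconds] prefix; the sample log and the function names parse_log and slowest_steps are hypothetical, and a real serial grabber may use a different timestamp format.

```python
import re

# Matches a kernel-style prefix such as "[    1.234567] message".
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def parse_log(text):
    """Return (timestamp_seconds, message) pairs for every timestamped line."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            entries.append((float(match.group(1)), match.group(2)))
    return entries


def slowest_steps(entries, top=3):
    """Rank messages by the time elapsed until the next message appeared."""
    gaps = []
    for (start, message), (end, _next_message) in zip(entries, entries[1:]):
        gaps.append((round(end - start, 6), message))
    return sorted(gaps, reverse=True)[:top]


# Hypothetical sample log, not captured from real hardware.
SAMPLE_LOG = """\
[    0.000000] Booting Linux
[    0.120000] Calibrating delay loop
[    0.370000] Initializing MMC controller
[    0.450000] Mounting root file system
[    0.900000] Starting application
"""

if __name__ == "__main__":
    for gap, message in slowest_steps(parse_log(SAMPLE_LOG)):
        print(f"{gap:.3f} s after: {message}")
```

The gap attributed to a message is only an upper bound on the work it announces, since anything that runs silently before the next message is counted too.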

4. Boot Sequence

Before starting boot time optimization on any Linux platform, it is important to understand the boot sequence of the device. The general boot sequence of a Linux platform is as follows.
- ROM Code from Processor
- U-Boot (Bootloader)
  - DDR and Non-Volatile Memory Initialization
  - Initializing and configuring all the peripherals
  - Reading Kernel and Rootfs images from Non-Volatile Memory
- Loading Kernel
  - Initializing the platform hardware
  - Loading Built-in Driver Modules
- Loading the Rootfs
  - Mounting File System
  - Loading driver modules from File System

Note that the boot sequence varies across platforms because of the following factors.
- Processor's SRAM size - The ROM code loads the next boot stage into the processor's internal SRAM. If the SRAM is too small to hold all of U-Boot, a smaller first-stage loader called x-loader (also known as MLO) is loaded first and in turn loads U-Boot. OMAP3 processors, for example, use x-loader to load U-Boot.
- Boot device (MMC, NAND flash, etc.)

5. Optimization Techniques

5.1 Optimizing Bootloader
The following techniques help to reduce the boot time of the bootloader.
- The primary task is to remove or disable bootloader features that the product does not need.
- Keep the bootloader as minimal as possible. It should only initialize RAM and non-volatile memory, read the kernel and rootfs images from non-volatile memory, and load the kernel into RAM.
- Using a small bootstrap loader instead of full U-Boot also reduces boot time. Keep a separate full-featured U-Boot for testing and use the smaller bootstrap to load the kernel in production.
- Remove the boot delay (the bootdelay environment variable) that U-Boot normally applies before loading the kernel.
- Make sure the bootloader reads exactly the size of the kernel and rootfs images. Many loaders read up to a default maximum size regardless of the actual image sizes.
- Choosing the right compression for the kernel image is crucial. The right choice depends on the balance between the storage read speed and the CPU time needed to decompress the kernel, so measure several compression methods before settling on the one that gives the best trade-off between read time and decompression time.
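The compression trade-off can be estimated with a simple model: total time is the time to read the compressed image from storage plus the time to decompress it. The sketch below applies that model. The function names and every figure in the example (image sizes, storage throughput, decompression throughput) are hypothetical placeholders that should be replaced with values measured on the target.

```python
def load_time_s(compressed_mb, uncompressed_mb, read_mb_per_s,
                decompress_mb_per_s=None):
    """Estimate seconds to read an image and, if compressed, decompress it.

    decompress_mb_per_s is the rate at which uncompressed output is produced;
    pass None for an image that is stored uncompressed.
    """
    read_time = compressed_mb / read_mb_per_s
    if decompress_mb_per_s is None:
        return read_time
    return read_time + uncompressed_mb / decompress_mb_per_s


def best_option(options, read_mb_per_s):
    """Return (name, seconds) of the option with the lowest estimated time."""
    timed = [
        (name, load_time_s(comp, raw, read_mb_per_s, rate))
        for name, comp, raw, rate in options
    ]
    return min(timed, key=lambda item: item[1])


# Hypothetical figures: (name, compressed MB, uncompressed MB, decompress MB/s).
OPTIONS = [
    ("none", 8.0, 8.0, None),
    ("fast-codec", 4.0, 8.0, 80.0),
    ("dense-codec", 3.0, 8.0, 10.0),
]

if __name__ == "__main__":
    for storage_speed in (10.0, 200.0):  # slow flash vs fast storage, in MB/s
        name, seconds = best_option(OPTIONS, storage_speed)
        print(f"{storage_speed:.0f} MB/s storage: {name} ({seconds:.3f} s)")
```

With these made-up numbers, the lightly compressed image wins on slow storage and the uncompressed image wins on fast storage, which is why the choice has to be measured per platform rather than assumed.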

5.2 Optimizing Kernel
The following techniques help to reduce the kernel boot time.
- As with the bootloader, start by disabling features the product does not need. This not only reduces the kernel load time but also shrinks the kernel image, which in turn reduces the time taken to read the kernel and rootfs images from non-volatile memory.
- Build all drivers that are not needed during boot as loadable modules so that they can be loaded after the critical application has launched.
- Debug output on the console can also take a lot of time. Remove the debug messages and keep only error messages.
- Order the loading of driver modules so that the modules the critical application needs are loaded first. The remaining modules can be loaded in the background after the critical application has launched.
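One way to picture the module reordering is as a split of the module list into a critical group, loaded before the application starts, and a deferred group, loaded afterwards in the background. The sketch below only computes that split. The module names and the split_modules helper are hypothetical, and actually loading the modules (for example from an init script) is left out.

```python
def split_modules(load_order, critical):
    """Split modules into (load_first, load_later), preserving relative order.

    load_order: module names in their current load order.
    critical:   names the critical application cannot start without.
    """
    critical = set(critical)
    missing = critical - set(load_order)
    if missing:
        raise ValueError(f"critical modules not in load order: {sorted(missing)}")
    load_first = [name for name in load_order if name in critical]
    load_later = [name for name in load_order if name not in critical]
    return load_first, load_later


# Hypothetical module names, for illustration only.
CURRENT_ORDER = ["usb_storage", "camera_sensor", "wifi", "display", "bluetooth"]
CRITICAL = {"camera_sensor", "display"}

if __name__ == "__main__":
    first, later = split_modules(CURRENT_ORDER, CRITICAL)
    print("before application:", first)
    print("after application: ", later)
```

Preserving the original relative order inside each group is a deliberate choice here, on the assumption that the existing order already respects dependencies between modules.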

5.3 Optimizing Root File System and RAM File System Usage

5.3.1 Choosing the Right File System
Optimizing the file system is generally one of the most important parts of boot time optimization, because the type of file system directly affects boot time. File systems differ in initialization and mount time as a result of their different read, write and access performance. One should understand the available file system types and the purpose the file system serves in the final product. The right choice is again a trade-off between the performance of the file system and that purpose.

5.3.2 Using Temporary RAM File System
One important step in improving boot time is to run from a temporary RAM file system (RAMFS) until the critical application has been loaded, and to load the root file system in the background. We used this technique in one of our product developments and achieved a 1 second boot time!

6. Demo Applications

6.1 Booting to Linux console within 1 second on eSOMiMX6
The techniques described above were used to boot a standard Linux console in 1 second on one of our System on Modules, the eSOMiMX6. The following boot binaries were optimized to achieve this.
- U-Boot bootloader - Prepared by removing or commenting out all unwanted U-Boot commands and debug messages. DMA mode is used to copy the kernel and rootfs images from non-volatile memory to RAM.
- Linux kernel image - Prepared by removing all unwanted features from the kernel configuration and setting lpj (loops per jiffy) in bootargs, which lets the kernel skip its delay loop calibration. The CPU frequency is set to maximum so that the rootfs loads quickly.
- Rootfs - Contains only BusyBox.
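The lpj value for bootargs can be taken from a previous boot log of the same board, where the kernel prints the calibrated value in a line of the form "Calibrating delay loop... N BogoMIPS (lpj=N)". The sketch below extracts that value and appends it to a bootargs string, together with the kernel's quiet option, which suppresses most console messages. The sample log line, the base bootargs and the helper names are hypothetical, and the value is only valid for the board and CPU frequency it was measured on.

```python
import re

# The kernel prints the calibrated value as "(lpj=<number>)".
LPJ_RE = re.compile(r"\(lpj=(\d+)\)")


def extract_lpj(boot_log):
    """Return the calibrated loops-per-jiffy value from a boot log, or None."""
    match = LPJ_RE.search(boot_log)
    return int(match.group(1)) if match else None


def add_fast_boot_args(bootargs, lpj, quiet=True):
    """Append lpj=<value> (and optionally quiet) unless already present."""
    args = bootargs.split()
    if not any(arg.startswith("lpj=") for arg in args):
        args.append(f"lpj={lpj}")
    if quiet and "quiet" not in args:
        args.append("quiet")
    return " ".join(args)


# Hypothetical log line and bootargs, for illustration only.
SAMPLE_LOG = "[    0.001234] Calibrating delay loop... 1581.05 BogoMIPS (lpj=7905280)"
BASE_BOOTARGS = "console=ttymxc0,115200 root=/dev/ram0"

if __name__ == "__main__":
    lpj = extract_lpj(SAMPLE_LOG)
    print(add_fast_boot_args(BASE_BOOTARGS, lpj))
```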

6.2 Booting Qt Camera Preview within 3 seconds on eSOMiMX6
The techniques described above were used to boot a Qt camera preview within 3 seconds on one of our System on Modules, the eSOMiMX6. The following boot binaries were optimized to achieve this.
- U-Boot bootloader - Prepared by removing or commenting out all unwanted U-Boot commands and debug messages. DMA mode is used to copy the kernel and rootfs images from non-volatile memory to RAM.
- Linux kernel image - Prepared by removing all unwanted features from the kernel configuration and setting lpj (loops per jiffy) in bootargs, which lets the kernel skip its delay loop calibration. The CPU frequency is set to maximum so that the rootfs loads quickly.
- Rootfs - Contains only the Qt camera application and the libraries it depends on (such as the Qt libraries and libc).
- Camera driver kernel module and application - Prepared by configuring the camera sensor for the required 720p resolution and using the USERPTR memory allocation method to reduce the time spent copying frames between the camera driver and the display driver.