[OE-core] RFC: OE-Core image creation and deployment

Tue Jul 31 21:13:17 UTC 2012

Refactor of the OE-Core image creation
--------------------------------------

Preface:

The refactoring being discussed below attempts to cover the overall image 
creation through image deployment process that I'd like to see in OE-Core. 
Where existing functionality already exists, the refactoring work would consist 
simply of adapting the items to the overall set of requirements and system 
design.  I'm not proposing a "rewrite", but instead a process of cleanup, 
refactoring of code, and writing new code when necessary.... Evolution vs 
revolution!

A number of people have contributed to the set of requirements discussed below.
One of the primary inputs has been the discussion started by Darren Hart 
titled "RFC: Braindump on Bootloaders, Image Types, and Installers".

Comments, suggestions, offers of help are all appreciated!

Purpose:
--------

Refactoring the OE-Core image creation should include:

* A pluggable, configurable environment for creating, deploying and updating a 
filesystem
   * Allow for a machine to specify the overall steps required to deploy a build 
to a device
   * Allow for easy additions to the default OE-Core steps
   * Allow the steps to be started and stopped at any point.  (This is needed 
for on-target installations)

* Existing workflows need to be maintained; new additional workflows (cross or 
target based) will be added
   * The existing workflow, bitbake <image>, should be maintained with little to 
no difference in behavior
   * New workflows that start with package feeds should be created; these 
perform, more or less, the same steps as the existing workflows

* Creation of a filesystem from a package feed
   * Either an internal (oe-core build) or external package feed (exported from 
oe-core)
   * Ability to create a filesystem internally to the build system or externally 
(see Angstrom Narcissus)
   * Package Feeds: deb (apt-get), opkg, rpm (zypper or similar)
   * Output of the filesystem generation should be:
     * pseudo controlled directory
     * archive (tar/cpio) file

* Deployment of the filesystem
   * Input is the output of the filesystem generation
   * Support both pseudo based installs and "root" user installs
     * some deployment methods may not be available w/o root permissions
   * Ability to compress the filesystem image
     * lzma/gz/bz2/xz
   * Use the generated filesystem as input for a kernel+initramfs
     * This will likely require a custom kernel+image recipe/method
     * The output of this should be usable in other deployment images
     * Look at CONFIG_INITRAMFS_SOURCE option requirements
     * Look at mkelfimage for post build linking
   * support different deployment mechanisms
     * archive image deployment
       * tar/cpio/...
       * input to local NFS server
     * disk image deployment
       * Construct specific filesystem types:
          * ext2/ext3/ext4
          * other disk image types…
       * Construct a partitioned disk image
         * LVM
         * RAID
         * Primary/secondary partitions
         * Support multiple partition map styles
           * MSDOS, MSDOS-PROTECTIVE
           * GUID Partition Table (GPT)
       * Include bootloader
         * setup of bootloader config files
         * (ia32) setup of MBR
       * Include kernel
         * kernel copied to /boot
         * kernel deployed via other methods (raw partition)
       * Include filesystem on one or more partitions
       * use fstab to specify partition layout and formats
         * Support READ-ONLY mounts
         * be able to specify geometry and other configuration elements
     * ISO/HYBRID image deployment
       * ISO based READ-ONLY filesystem (CD/DVD/USB stick)
       * unionfs and/or other live cd techniques
       * USB stick (or other media), partial-RO/RW
     * FLASH based image deployment
       * Construct specific filesystem types:
         * jffs2
         * cramfs
         * btrfs
         * squashfs
         * ubifs
       * Similar to disk images, construct multiple images for different flash 
regions
         * Include bootloader, kernel, filesystem(s)
       * Ability to specify geometry and other flash specific requirements
   * Installer capable image
     * Installer(s) would be recipes that would be added to the image
     * Package feed and indexes would be copied to media(s)
   * Bootloader configuration
     * Select bootloader
     * Select configuration for a specific bootloader

* Update
   * Ability to use the package feeds to update an existing deployment
   * This should work both cross and on the target
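
One item in the deployment list above, using fstab to specify the partition 
layout and formats (including READ-ONLY mounts), can be sketched as follows. 
This is only an illustration: the function name parse_fstab, the dictionary 
keys and the example devices are assumptions made for this sketch, and geometry 
or size information would need some additional input that fstab does not carry.

```python
def parse_fstab(text):
    """Turn fstab-style text into a list of partition descriptions.

    Only the first four fstab fields are used: device, mount point,
    filesystem type and mount options.
    """
    layout = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            raise ValueError("short fstab line: " + line)
        device, mountpoint, fstype, options = fields[:4]
        layout.append({
            "device": device,
            "mountpoint": mountpoint,
            "fstype": fstype,
            # A partition is read-only when "ro" is one of its options.
            "read_only": "ro" in options.split(","),
        })
    return layout


# Hypothetical layout, for illustration only.
EXAMPLE_FSTAB = """
# device    mountpoint  type  options     dump pass
/dev/sda1   /boot       ext2  defaults    0    2
/dev/sda2   /           ext4  ro          0    1
/dev/sda3   /var        ext4  rw,noatime  0    2
"""

for partition in parse_fstab(EXAMPLE_FSTAB):
    print(partition)
```

A deployment step could walk the resulting list to decide which filesystem 
type to build for each partition and which ones to mark read-only.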

How OE-Core works today:
------------------------

The basic strategy of OE-Core today is to provide a series of specialty recipes, 
classes and configuration variables that together provide the necessary items to 
construct a filesystem image.  The following will specify a few key terms that 
can be used to describe the existing process.

The 'distribution' configuration defines a basic set of configuration data that 
a produced binary distribution will use to be consistent within itself.

A 'package' recipe provides the knowledge to build a single source package into 
one or more binary packages.

A 'task' recipe is used to group together packages as well as other tasks to 
provide some type of functional grouping.  This grouping often includes things 
such as everything required to boot a system.

An 'image' recipe specifies the tasks or packages required to construct a given 
image, from the available set of binary packages.

A 'package feed' is a set of binary packages that comprise the available set of 
software that an image may install from.

When a developer uses OE-Core to build an image, they start by configuring their 
local.conf file and specifically the appropriate image options, and distribution 
configuration.  They then proceed to build up their package feeds, either 
manually by running specific builds, building 'world' or building a specific 
image recipe.  To construct a filesystem, the image recipe is used.  This uses 
the image.bbclass file to construct a basic set of required packages that need 
to be on the image.  Using the classes to construct the rootfs for a given 
package type, this set of required packages is used to construct a solution 
based on the (local) package feed.  The result is a directory that contains a 
functional rootfs based on the parameters of the image recipe.  (Post install 
actions, such as prelinking may also occur at this time.)
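
To make the "construct a solution based on the package feed" step more 
concrete, here is a minimal sketch that computes the transitive closure of an 
image's required packages over a feed.  The feed is modelled as a plain dict of 
package name to runtime dependencies; that structure, the function name 
resolve_install_set and the package names are hypothetical.  The real rootfs 
classes hand this job to the package manager for the selected package type, so 
this only shows the idea.

```python
from collections import deque


def resolve_install_set(required, feed):
    """Return the sorted list of packages needed to satisfy 'required'.

    'feed' maps a package name to the list of packages it depends on.
    Raises KeyError if a needed package is missing from the feed.
    """
    selected = set()
    queue = deque(required)
    while queue:
        name = queue.popleft()
        if name in selected:
            continue  # already handled, also protects against cycles
        if name not in feed:
            raise KeyError("package not in feed: " + name)
        selected.add(name)
        queue.extend(feed[name])
    return sorted(selected)


# Hypothetical feed contents, for illustration only.
EXAMPLE_FEED = {
    "base-files": [],
    "libc": [],
    "busybox": ["libc"],
    "dropbear": ["libc", "base-files"],
    "strace": ["libc"],
}

# prints ['base-files', 'busybox', 'dropbear', 'libc']
print(resolve_install_set(["busybox", "dropbear"], EXAMPLE_FEED))
```

Note that strace is in the feed but not in the result: the feed defines what is 
available, the image recipe defines what is installed.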

The image class further uses a number of configuration resources to determine 
the output format and will construct a basic image output.  This may be in the 
format of a raw ext2/ext3/ext4 image, tar-ball, jffs2 image, etc.

There is no direct deployment built into OE-Core.  So it is up to the user to 
determine how to deploy any bootloaders, kernel, or images onto the target 
device.  (This is often done using end-user/machine specific scripting, or via a 
manual process.)

Each of the items above assumes a single machine configuration for the target 
configuration.

Vision of OE-Core image generation in the future:
-------------------------------------------------

Similar to today, the user will configure their local.conf file (distribution 
configuration and associated items).  They will then proceed to build their 
package feeds.  It's suggested that a new recipe type be created, the purpose of 
which is to enable someone to define the overall scope of the packages in their 
distribution, but not actually define what goes into a specific image 
configuration.  For the sake of this document I will call them 'package-feed' 
recipes, but that name is up for discussion.  The 'world' is one such automatic 
package-feed recipe, another may be a system similar to the existing image types 
-- the goal is to make it easy for someone to define the set of available 
packages without having to actually install these packages into an image.

Another way to look at the difference between a 'package-feed' recipe and a task 
recipe or an image recipe is the objective of each.  A task recipe's objective 
is to enable a group of functionality, but a single task recipe is often not 
enough to define a full distribution.  An image recipe is used to select a 
specific configuration of packages that are used to build a special purpose 
image.  A package-feed recipe would be used instead to specify the overall scope 
of binary packages available within a distribution.  This scope may include 
software that no image or task recipe ever references, but a package that an end 
user of the package feeds may add to their system.

The distribution creation may stop at this stage or may continue through the image 
and deployment process.  The image and deployment process may also be run at a 
separate time based on the package-feed.

The image recipes, similar to today, would include the basic set of 
functionality, via task recipes or package recipe dependencies on what must be 
installed into a functional image.  In addition, custom image features may be 
defined that enable read-only configurations, live-cd type configurations, 
self-hosted installers, and other similar features.  For instance, install media 
would automatically copy the package feeds to the media, as well as add various 
installer packages and setup files.

In addition to the image recipes, deployment control would be specified.  This 
would control the deployment mechanisms, partitioning, sizes, etc.  It could be 
defined, using the overrides, on a machine specific basis or on a generic 
build/project configuration level.  (Machine/BSP configuration files could also 
tailor this deployment when specific machine configurations are known ahead of 
time.)  The output of the deployment would be instructions, images and files 
ready to be directly deployed onto a target.  For example, a specific machine's 
deployment may simply be a set of disk images, including bootloader, kernel, 
initramfs, and various filesystems, along with instructions on how to install 
the disk images.
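
As a rough sketch of how deployment control could be layered with overrides, 
the example below resolves "KEY_<override>" entries on top of plain "KEY" 
entries.  The variable names (DEPLOY_TYPE, DEPLOY_PARTITION_MAP), the override 
names and the function apply_overrides are all made up for this illustration; 
BitBake's own override handling is considerably richer than this.

```python
def apply_overrides(settings, active_overrides):
    """Resolve 'KEY_<override>' entries on top of plain 'KEY' entries.

    'active_overrides' is ordered from lowest to highest priority, so a
    later override wins over an earlier one.  Entries for overrides that
    are not active are left untouched under their suffixed names.
    """
    resolved = dict(settings)
    for override in active_overrides:
        suffix = "_" + override
        for key, value in settings.items():
            if key.endswith(suffix) and len(key) > len(suffix):
                resolved[key[:-len(suffix)]] = value
    return resolved


# Hypothetical deployment settings: a generic default, a project level
# choice and two machine specific choices.
SETTINGS = {
    "DEPLOY_TYPE": "archive",
    "DEPLOY_TYPE_qemux86": "disk",
    "DEPLOY_TYPE_otherboard": "flash",
    "DEPLOY_PARTITION_MAP": "msdos",
    "DEPLOY_PARTITION_MAP_myproject": "gpt",
}

config = apply_overrides(SETTINGS, ["myproject", "qemux86"])
print(config["DEPLOY_TYPE"], config["DEPLOY_PARTITION_MAP"])
```

With the project and qemux86 overrides active, the machine specific value 
replaces the generic DEPLOY_TYPE while the project level value replaces the 
default partition map.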

General source to deployment steps a user would follow:

Initial setup:

1) git checkout oe-core/bitbake/layers...
2) configure distribution and build environment (local.conf and distro.conf files)
3) build distribution 'package-feed' recipe (or world), creating a local package 
feed
4) [optional] push package feed to an external repository
5) construct rootfs/image/deployment based on image [recipe] and machine 
configuration

Updates:

*) (distro.conf changes are not allowed, unless compatible)
1) Add or update recipes
2) Build new/updated recipes, updating the local package feed
3) [optional] push package feed to an external repository
4) construct rootfs/image/deployment based on image [recipe] and machine 
configuration
-or-
   update an existing rootfs/image/deployment based on revised package feeds

Note: for the normal usage scenario of today, steps #3 and #4 of the initial 
setup can be ignored, and the image recipe will do exactly the same as today.

There is some desire to support deployment for multiple machine types at 
the same time.  While this may be useful, I believe it is out of scope for the 
current work because it changes the workflow of OE.  The current workflow 
assumes you are building for one machine at a time, but you may build multiple 
machines in a single project/build directory.

General binary to deployment steps:
-----------------------------------

One group of users are not distribution experts and expect someone else 
to give them the necessary distribution for their purposes.  To their eyes this 
distribution is embedded specific, but is not a source based distribution. 
Another way to look at this is that, starting with the embedded Linux 
distribution created by OE, they want to be able to simply use the distribution 
(like Fedora or Debian) and create applications.

Initial setup:

1) Acquire a package feed
2) Construct rootfs/image/deployment based on image and machine configuration

Update:

0) Start with an SDK or deployed image
1) Build new software components, using traditional package tools
2) [optional] push the packages to the feed
3) construct rootfs/image/deployment based on image and machine configuration
-or-
   update an existing rootfs/image/deployment based on revised package feeds
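
The "update an existing rootfs/image/deployment" alternative boils down to 
comparing what is installed with what the revised feed offers.  A minimal 
sketch follows, assuming versions are already parsed into tuples; real deb, 
ipk and rpm version comparison (epochs, revisions and so on) is more involved 
and would be left to the package manager.  The name plan_update and the data 
below are hypothetical.

```python
def plan_update(installed, feed):
    """Compare installed package versions against a revised feed.

    Both arguments map a package name to a version tuple such as (1, 2, 0).
    Returns (upgrades, missing): 'upgrades' maps a name to (old, new) and
    'missing' lists installed packages the feed no longer provides.
    """
    upgrades = {}
    missing = []
    for name, old_version in sorted(installed.items()):
        if name not in feed:
            missing.append(name)
        elif feed[name] > old_version:
            upgrades[name] = (old_version, feed[name])
    return upgrades, missing


# Hypothetical state of a deployed image and of the revised feed.
INSTALLED = {"busybox": (1, 19, 4), "dropbear": (2012, 55), "oldtool": (0, 1)}
FEED = {"busybox": (1, 20, 2), "dropbear": (2012, 55), "strace": (4, 6)}

upgrades, missing = plan_update(INSTALLED, FEED)
print("upgrade:", upgrades)
print("no longer in feed:", missing)
```

The same comparison would apply whether the update runs cross (against a 
rootfs directory) or on the target.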

Tasks:
------

Functional requirements:

*) break up the filesystem generation, and deployment process into logical steps 
that can be run independently, with the appropriate input and configuration

*) ability to run the steps internally to an OE-Core build, or externally using 
just configuration and package feeds as the input.  (Likely python, and even 
bitbake may be runtime requirements for both internal and external execution of 
steps.)

*) pluggable component model that allows the steps being run to change, as well 
as new steps to be added via layers and machine specific configurations

General tasks:

*) Work with OE community and propose the 'package-feed' recipes (or something 
similar) to manage the creation of distribution feeds.
    Goal: determine if a 'package-feed' recipe type is reasonable to be added 
and can be defined in a way that it is useful
    Note: If 'package-feed' recipe is not an acceptable name,  I'm looking for a 
suggestion of an alternative name.
    Note: Similarly there have been suggestions that confusion over the "task" 
recipes should likely factor into this work as well.  One suggestion was to use 
the term "group".

*) Explore existing image recipes, and image classes and document existing 
control structures
    Goal: ensure that existing workflows are maintained

*) Cleanup of existing image recipes, image classes and related items
    Goal: Take the existing components and cleanup the work to clearly 
differentiate the steps performed in the creation of an image
   Note: this is a first step toward setting a defined, controllable list of 
tasks per machine deployment (the default set of tasks will be the existing 
behavior)

Filesystem generation tasks

*) Refactor deb, ipk, rpm ROOTFS generation
    *) deb - little to no additional changes expected
    *) ipk - little to no additional changes expected
    *) rpm - big changes, use zypper (or similar) for install vs RPM only approach
   Goal: ensure that the ROOTFS generation is standalone for each filesystem 
type, and create the images in an efficient manner.
   Note: Much of the work is already done here, except as noted the RPM type 
does not yet use a resolver framework such as Zypper.  (Introducing Zypper may 
cause significant additional requirements for native tooling.  This may prove to 
be undesirable, so an alternative to Zypper may be more appropriate.)

*) Build filesystem generation to work outside of the bitbake/OE environment
   Goal: allow a filesystem to be constructed from a package feed, outside 
of the "build" environment.
   Note: bitbake, python components and such may still be useful, but the 
components should be runnable on either a host or target system.

*) Build a handoff to the deployment tasks
   Goal: A standard interface that can be used to seed the deployment tasks
   Note: likely a pseudo controlled directory, or archive of some type
   Note: what we have today is likely enough, we just need to more clearly 
differentiate the steps
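
If the handoff ends up being an archive, the interface between the two halves 
could be as small as the sketch below, which packs a rootfs directory into a 
tarball and lists it back.  The function names are hypothetical.  One caveat: 
run outside of pseudo, the archive records the ownership of the invoking user, 
so a real implementation would need to run under pseudo (or as root) to keep 
the intended owners and permissions.

```python
import os
import tarfile
import tempfile


def archive_rootfs(rootfs_dir, archive_path):
    """Pack 'rootfs_dir' into a gzip compressed tarball rooted at '.'."""
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(rootfs_dir, arcname=".")
    return archive_path


def list_archive(archive_path):
    """Return the sorted member names of an archive, any compression."""
    with tarfile.open(archive_path, "r:*") as tar:
        return sorted(member.name for member in tar.getmembers())


if __name__ == "__main__":
    # Build a tiny throwaway rootfs so the example is self-contained.
    work = tempfile.mkdtemp()
    rootfs = os.path.join(work, "rootfs")
    os.makedirs(os.path.join(rootfs, "etc"))
    with open(os.path.join(rootfs, "etc", "hostname"), "w") as handle:
        handle.write("target\n")

    archive = archive_rootfs(rootfs, os.path.join(work, "rootfs.tar.gz"))
    for name in list_archive(archive):
        print(name)
```

A deployment step then only needs the archive path (or the directory) plus its 
own configuration as input.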

Deployment tasks

*) Implement kernel+initramfs (rootfs) creation
   Goal: Provide a kernel+initramfs that can be used as input to a different 
deployment task
   Note: investigate CONFIG_INITRAMFS_SOURCE, mkelfimage and potentially other 
techniques
   Note: the existing live cd/bootimg technique is syslinux based and may 
not be possible on non IA32 architectures
   Note: Build the (initrd) rootfs first, then the kernel in some 
configurations, with the result being a built-in initrd in a single step

*) Implement archive image deployment
   Goal: ability to export a functional tar/cpio or other archive that contains 
a fully configured filesystem
   Note: primary user is expected to be NFS, SDK, and other development users
   Note: likely not much work to do, as the existing system works fine

*) Implement disk image deployment
   Goal: ability to create a full disk image that includes multiple partitions, 
kernel and bootloader as specified by the user
   Note: this is a fairly complex set of tasks… see further tasks below for more 
detail

*) Implement disk image deployment: disk partitioning
   Goal: partition a disk image to match user configurations
   Note: may require root access and physical disks, would like this to work on 
a file
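
For the plain MSDOS partition map, working on a file without root access is 
feasible, because the table is just 64 bytes at a fixed offset in the first 
sector.  The sketch below writes such a table into a sparse image file.  It is 
a simplified illustration: the function names are made up, the CHS fields are 
left as zero (only the LBA fields are filled in, which some older tools or 
BIOSes may not accept), and no extended partitions, GPT, LVM or RAID are 
handled.

```python
import os
import struct
import tempfile

SECTOR_SIZE = 512
TABLE_OFFSET = 446   # start of the four 16-byte partition entries
LINUX_TYPE = 0x83    # MBR partition type for a Linux filesystem


def mbr_entry(bootable, part_type, start_lba, num_sectors):
    """Pack one 16-byte MBR partition entry (CHS fields left as zero)."""
    status = 0x80 if bootable else 0x00
    return struct.pack("<B3sB3sII", status, b"\x00" * 3, part_type,
                       b"\x00" * 3, start_lba, num_sectors)


def write_msdos_table(path, image_sectors, partitions):
    """Create a sparse image file and write an MSDOS partition table.

    'partitions' is a list of up to four tuples of
    (bootable, part_type, start_lba, num_sectors).
    """
    if len(partitions) > 4:
        raise ValueError("only four primary partitions fit in an MBR")
    mbr = bytearray(SECTOR_SIZE)
    for index, part in enumerate(partitions):
        offset = TABLE_OFFSET + 16 * index
        mbr[offset:offset + 16] = mbr_entry(*part)
    mbr[510:512] = b"\x55\xaa"  # boot signature
    with open(path, "wb") as image:
        image.truncate(image_sectors * SECTOR_SIZE)  # sparse, no data yet
        image.write(mbr)


if __name__ == "__main__":
    image_path = os.path.join(tempfile.mkdtemp(), "disk.img")
    # 100 MiB image: a 32 MiB boot partition, then root up to the end.
    write_msdos_table(image_path, 204800, [
        (True, LINUX_TYPE, 2048, 65536),
        (False, LINUX_TYPE, 67584, 137216),
    ])
    print(image_path, os.path.getsize(image_path))
```

The partition filesystems would then be written into the image at 
start_lba * 512 by a later step.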

*) Implement disk image deployment: disk partitioning: LVM/RAID/Partition 
configurations
   Goal: Add advanced disk partitioning and setup, this may require creation and 
configuration of filesystem components such as /etc config files.

*) Implement disk image deployment: construct partition filesystems
   Goal: Build a filesystem using ext2, ext3, ext4 or other disk based types
   Note: some types may require root as they lack virtual setup tools like 
genext2fs.  Loopback mounts may be one way to implement this, given appropriate 
permissions

*) Implement disk image deployment: bootloader setup
   Goal: Implement a way to configure and install a boot loader onto the disk image

*) Implement disk image deployment: kernel deployment
   Goal: Implement a method to deploy a bootable kernel into an image

*) Implement ISO/HYBRID deployment
   Goal: Implement an ISO/HYBRID bootable image for CD/DVD/USB Flash, etc.
   Note: "live cd" like, use unionfs and appropriate RW storage to ensure the 
system functions.
   Note: This was previously implemented but was disabled due to bugs in the 
kernel/userspace integration.

*) Implement flash deployment
   Goal: Ability to create one or more flash regions with appropriate deployment 
components
   Note: components may include bootloaders, kernel, and filesystems

*) Implement flash deployment: construct filesystems
   Goal: Build flash based filesystems such as jffs2, cramfs, btrfs, squashfs, ubifs

*) Add an installer framework that can be used for deployment
   Goal: the installer feature would include installer recipe(s), package feeds, 
indexes and other components necessary to boot and install a runnable system on 
the target device
   Note: this installer framework would be used with any of the deployment 
mechanisms (archive, disk, iso/hybrid, or flash) to build the installer image

*) bootloader configuration
   Goal: add a general framework to specify how to configure and install a 
bootloader