A live Linux system is indispensable for solving routine technical tasks in a hosting company: LiveCDs are a must-have for reducing the workload on engineers, keeping service delivery stable, and simplifying changes to the deployment process. Unfortunately, the universal images available on the Web did not suit our needs, so we created our own, called SpeedTest. At first we used it to measure the performance of machines before disassembly, but over time we expanded the system's functionality to solve other problems on a variety of hardware.
As our needs grew, the shortcomings of the existing system with its integrated static scripts became apparent. The main one was how cumbersome further development had become: we had no build system of our own, no way to add support for new (or old) hardware, and no way to change the behavior of the same image under different launch conditions.
Problems with the software content in the image
As CentOS was the main distribution in our infrastructure (at that time the seventh version), we organized regular image creation on its basis through Jenkins. Image building on RHEL/CentOS is beautifully automated with Anaconda Kickstart. The kickstart structure is described in detail in the RHEL documentation, so we won't go into it here, but we will focus on a couple of key points.
The header part of the KS file is standard, except for the description of the repositories for downloading the software from which the image will be compiled. This block contains the following directives:
repo --name=elrepo --baseurl=http://elrepo.reloumirrors.net/elrepo/el8/x86_64/
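Beyond the repositories, the header typically carries locale and security settings as well. A minimal sketch of such a header (the values below are illustrative, not our production settings):

```
lang en_US.UTF-8
keyboard us
timezone UTC --utc
rootpw --plaintext changeme
selinux --disabled
firewall --disabled
repo --name=base --baseurl=http://mirror.example.com/centos/os/x86_64/
repo --name=elrepo --baseurl=http://elrepo.reloumirrors.net/elrepo/el8/x86_64/
```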
We include the excludedocs option in the packages block, and to reduce the size of the image, we based it on @core and listed the packages to exclude:
%packages --excludedocs
@core
vim
-audit
-man-db
-plymouth
-selinux-policy-targeted
-tuned
-alsa-firmware
-iwl1000-firmware
-iwl105-firmware
-iwl100-firmware
-iwl135-firmware
-iwl2000-firmware
-iwl2030-firmware
-iwl5000-firmware
-iwl3160-firmware
-iwl5150-firmware
-iwl6000-firmware
-iwl6050-firmware
-iwl6000g2a-firmware
-iwl7260-firmware
The block above pulls in the @core group plus the vim package with its dependencies, while excluding a number of unnecessary packages. The configuration is then refined by scripts at the %post and %post --nochroot stages. The files that should end up in the image live next to the kickstart in the repository.
The assembly is carried out using the livecd-creator utility included in the standard CentOS repository. As a result, we get a squashfs image (we will cite part of the script executed in Jenkins):
echo -e "\\nSpeedtest release ver ${BUILD_NUMBER}\\n" >> motd
sudo livecd-creator --verbose -f speedtest-${BUILD_NUMBER} speedtest.ks
7z x speedtest-${BUILD_NUMBER}.iso -oisoroot
mv isoroot/LiveOS/squashfs.img ./speedtest-${BUILD_NUMBER}.squashfs
This step deserves special attention: be sure to number the images and insert the build number into the motd file (its addition to the image should be handled in the kickstart). This makes it clear which build you are working with and lets you track changes during debugging. We solve the issue of supporting hardware and additional software using our own RPM repository with packages that are not in the regular repositories or have been modified by our specialists.
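For instance, the motd prepared by the Jenkins job can be copied into the image in a %post --nochroot script; a sketch (livecd-tools exposes $INSTALL_ROOT there, and the source path is illustrative):

```
%post --nochroot
# Copy the motd assembled by the Jenkins job (with the build number
# appended) into the image being built.
cp motd $INSTALL_ROOT/etc/motd
%end
```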
Implicit system startup problems
- The kernel and its dependencies enter the system through the @core group, so each new build includes the latest available kernel version. Accordingly, we need that kernel and an initramfs built for it.
- Building an initramfs requires root privileges, and the system it runs on needs the same kernel build that will be in the squashfs.
Our advice: to avoid security issues and script errors, build in an isolated environment. We strongly advise against doing this on a Jenkins master server.
Here is the initramfs build step from the task, in Jenkins DSL format:
shell (
'''
set -x
echo -e '\\n'ENVIRONMENT INJECTION'\\n'
if [ -n "$KERNELVERSION" ]; then
    echo "KERNEL=$KERNELVERSION" >> $WORKSPACE/env
else
    echo "KERNEL=$(uname -r)" >> $WORKSPACE/env
fi
short_branch=$(echo $long_branch | awk -F/ '{print $3}')
cat <<EOF>> $WORKSPACE/env
WEBPATH=live-${short_branch}
BUILDSPATH=live-${short_branch}/builds/${JOB_BASE_NAME}
FTPSERVER=repo-app01a.infra.hostkey.ru
EOF
'''.stripIndent().trim()
)
environmentVariables { propertiesFile('env') }
shell (
'''
echo -e '\\n'STARTING INITRAMFS GENERATION'\\n'
yum install kernel-${KERNEL} -y
dracut --xz --add "livenet dmsquash-live bash network rescue kernel-modules ssh-client base" --omit plymouth --add-drivers "e1000 e1000e" --no-hostonly --verbose --install "lspci lsmod" --include /usr/share/hwdata/pci.ids /usr/share/hwdata/pci.ids -f initrd-${KERNEL}-${BUILD_NUMBER}.img $KERNEL
'''.stripIndent().trim()
)
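The short_branch derivation in the first shell step assumes the Jenkins branch variable looks like refs/heads/&lt;name&gt;; as a standalone illustration (the branch value here is hypothetical):

```shell
#!/bin/sh
# Hypothetical long branch name as Jenkins would report it.
long_branch="refs/heads/master"

# Fields split on "/": $1="refs", $2="heads", $3 is the branch itself.
short_branch=$(echo "$long_branch" | awk -F/ '{print $3}')

echo "$short_branch"   # prints "master"
```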
So, we have generated a squashfs image, initramfs and have the latest kernel. These pieces are enough to start the system through PXE.
Delivery and rotation of images
To deliver images, we built a system that is worth examining in a bit more detail. There is a central repository, an internal server on our private network, which serves content over several protocols (FTP, rsync, etc.) and over HTTPS through nginx.
The following directory structure was created on the server:
├── builds
│ ├── build_dracut_speedtest_el8.dsl
│ │ ├── initrd-${VERSION}.img
│ │ └── vmlinuz-${VERSION}
│ ├── build_iso_speedtest_el8.dsl
│ │ ├── speedtest-${BUILDNUMBER}.iso
│ │ └── speedtest-${BUILDNUMBER}.squashfs
├── initrd -> builds/build_dracut_speedtest_el8.dsl/initrd-${VERSION}.img
├── speedtest.iso -> builds/build_iso_speedtest_el8.dsl/speedtest-${BUILDNUMBER}.iso
├── speedtest.squashfs -> builds/build_iso_speedtest_el8.dsl/speedtest-${BUILDNUMBER}.squashfs
├── vmlinuz -> builds/build_dracut_speedtest_el8.dsl/vmlinuz-${VERSION}
In the builds directory, under subdirectories named after the build tasks, we keep the last three successful builds, and the root directory holds symbolic links to the latest build without a version in the name (these are what clients use). If we need to roll back a version, we can quickly change the links manually.
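The rollback itself amounts to repointing a symlink, which ln -sfn does in one step; a minimal sketch using a throwaway directory (build numbers and names are illustrative):

```shell
#!/bin/sh
# Work in a temporary tree so the sketch is self-contained.
root=$(mktemp -d)
mkdir -p "$root/builds/build_iso_speedtest_el8.dsl"

# Two successive builds land in the builds subdirectory...
touch "$root/builds/build_iso_speedtest_el8.dsl/speedtest-41.squashfs"
touch "$root/builds/build_iso_speedtest_el8.dsl/speedtest-42.squashfs"

# ...and the client-facing link in the root always names the latest one.
ln -sfn "builds/build_iso_speedtest_el8.dsl/speedtest-42.squashfs" "$root/speedtest.squashfs"

# Rolling back is just repointing the link at the previous build.
ln -sfn "builds/build_iso_speedtest_el8.dsl/speedtest-41.squashfs" "$root/speedtest.squashfs"

readlink "$root/speedtest.squashfs"   # prints the previous build's path
rm -rf "$root"
```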
Delivery to the server and link manipulation are part of the Jenkins build task: ncftp is used as the client and proftpd as the server (data is transferred via FTP). The server choice matters, because the scheme requires a client-server pair that can work with symlinks. Clients never talk to the central repository directly: they connect to mirrors tied to geographic locations. This reduces traffic and speeds up deployment.
Organizing distribution to mirrors is also quite interesting: a configuration with proxying and a proxy-store directive is used:
location ^~ /livecd {
    try_files $uri @central-repo;
}

location @central-repo {
    proxy_pass https://centralrepourl.infra.hostkey.ru;
    proxy_store /mnt/storage$uri;
}
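In practice the mirror location usually also needs a root pointing at the same storage and, optionally, permissions for the stored copies; a sketch extending the block above (the path and modes are illustrative):

```
location ^~ /livecd {
    root /mnt/storage;
    try_files $uri @central-repo;
}

location @central-repo {
    proxy_pass https://centralrepourl.infra.hostkey.ru;
    proxy_store /mnt/storage$uri;
    # Permissions for the files proxy_store writes out.
    proxy_store_access user:rw group:rw all:r;
}
```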
Thanks to this directive, copies of the images are cached on the mirrors after a client's first download. The system contains no extra scripting, the latest build of an image becomes available in all locations on update, and rolling back remains quick and easy.
Modifying Image Behavior via Foreman
System deployment is carried out through Foreman, which means we have an API and the ability to pass variables into the configuration files of PXE loaders. With this approach, it is easy to make one image solve a whole range of tasks:
- To boot on the hardware and investigate hardware problems;
- To install the OS (see our previous article);
- For automatic testing of hardware;
- To disassemble the equipment and completely wipe the hard disk after the client stops using the server.
Obviously, all of these tasks cannot be hard-wired into one image and performed simultaneously. So we took a different approach: we added systemd services and scripts that launch the necessary tasks to the build. The script and the service share the same name (as an example, let's look at the service that starts a Linux installation):
Lininstall.service
[Unit]
Description=Linux installation script
After=sshd.service
Requires=sshd.service
[Service]
Type=forking
RemainAfterExit=yes
ExecStartPre=/usr/bin/bash -c "if [[ $SPEEDISO == lininstall ]];then exit 0;else exit 1;fi"
ExecStart=/usr/bin/bash -c "/usr/local/bin/lininstall.sh 2>&1 | tee /dev/console"
TimeoutSec=900
[Install]
WantedBy=multi-user.target
The service runs the task only if the environment variable SPEEDISO exists with the value lininstall.
Now we need to pass this variable to the image, which is easy to do through the kernel command line in the bootloader. The example is given for PXE Linux, but the solution is not tied to the bootloader, since we only need the kernel command line:
LABEL <%= @host.operatingsystem.name %>
KERNEL <%= @kernel %>
MENU LABEL Default install Hostkey BV image <%= @host.operatingsystem.name %>
APPEND initrd=<%= @initrd %> <%= mainparams %> root=live:<%= host_param('livesystem_url') %>/<%= host_param('live_squash') %> systemd.setenv=SPEEDISO=<%= host_param('hk_speedtest_autotask') %> <%= modprobe %> noeject
IPAPPEND 2
The variable hk_speedtest_autotask should contain lininstall; in that case, when the system starts, the service of the same name is launched. If the variable does not exist or has any other value, the system simply boots from the image and can be reached via SSH (provided the service launch was configured in the kickstart when the image was built).
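The value the service keys on can also be recovered by hand from the kernel command line; a sketch with a hypothetical cmdline string standing in for /proc/cmdline:

```shell
#!/bin/sh
# Hypothetical kernel command line, as /proc/cmdline would report it.
cmdline="initrd=initrd.img root=live:http://mirror/speedtest.squashfs systemd.setenv=SPEEDISO=lininstall noeject"

# One parameter per line, then strip the systemd.setenv=SPEEDISO= prefix.
speediso=$(echo "$cmdline" | tr ' ' '\n' | sed -n 's/^systemd\.setenv=SPEEDISO=//p')

echo "$speediso"   # prints "lininstall"
```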
Conclusions
After spending some time on development, we ended up with a managed LiveCD build and delivery system that gracefully handles hardware support and software updates in the image. It lets us quickly roll back changes, alter the image's behavior through the Foreman API, reduce traffic, and keep the service highly autonomous across sites. The geographically distributed mirrors hold the latest successful builds of all the images and repositories in use. The system is convenient and reliable, and it has helped us countless times during its three years of use.