BOSH (software)
BOSH is an open-source software project that offers a toolchain for release engineering, software deployment and application lifecycle management of large-scale distributed services. The toolchain is made up of a server and a command line tool. BOSH is typically used to package, deploy and manage cloud software. While BOSH was initially developed by VMware in 2010 to deploy Cloud Foundry PaaS, it can be used to deploy other software. BOSH is designed to manage the whole lifecycle of large distributed systems.
Since March 2016, BOSH has been able to manage deployments on both Microsoft Windows and Linux servers.
A BOSH Director communicates with a single Infrastructure as a service (IaaS) provider, which supplies the underlying networking and virtual machines. Several IaaS providers are supported: Amazon Web Services EC2, Apache CloudStack, Google Compute Engine, Microsoft Azure, OpenStack, and VMware vSphere.
To help support more underlying infrastructures, BOSH uses the concept of a Cloud Provider Interface (CPI). There is an implementation of the CPI for each of the IaaS providers listed above. Typically the CPI is used to deploy VMs, but it can also be used to deploy containers.
Few CPIs exist for deploying containers with BOSH, and only one is actively supported. This CPI deploys Pivotal Software's Garden containers on a single virtual machine, run by VirtualBox or VMware Workstation. In theory, any other container engine could be supported if the necessary CPIs were developed.
Because BOSH supports deployments on VMs and containers alike, it uses the generic term “instances” for both. It is up to the CPI to choose whether a BOSH “instance” is actually a VM or a container.
Workflow
Once installed, a BOSH server accepts uploads of root filesystems and packages. When a BOSH server has the necessary bits for deploying a given software system, it can be told to proceed, as described by a YAML deployment manifest. BOSH then progressively deploys “instances”, using canaries to avoid deploying failing configurations.
Once a software system is deployed, BOSH monitors its instances continuously in order to detect failing instances and resurrect any missing ones.
When a BOSH deployment manifest is changed, BOSH rolls out the implied modifications progressively, instance by instance. This means that BOSH can upgrade live clusters, potentially with no downtime.
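The canary-first, instance-by-instance rollout described above can be illustrated with a minimal Python sketch. It is not BOSH's implementation: the function `rolling_update`, the exception `RolloutAborted`, and the `apply_update` callback are hypothetical names chosen for this example, and the sketch assumes that the first `canaries` instances in the list act as canaries.

```python
class RolloutAborted(Exception):
    """Raised when an instance fails to update and the rollout stops."""


def rolling_update(instances, apply_update, canaries=1):
    """Update instances one at a time and stop at the first failure.

    `apply_update` is a hypothetical callback that updates one instance
    and returns True when the instance is healthy afterwards.
    The first `canaries` instances are treated as canaries (an assumption
    of this sketch): if one of them fails, no other instance is touched.
    """
    updated = []
    for index, name in enumerate(instances):
        if not apply_update(name):
            phase = "canary" if index < canaries else "rollout"
            remaining = len(instances) - index - 1
            raise RolloutAborted(
                f"{phase} failed on {name}; {remaining} instance(s) left untouched"
            )
        updated.append(name)
    return updated
```

Because the loop stops at the first unhealthy instance, a bad configuration affects at most one instance, while the rest of the cluster keeps running the previous version.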
Concepts
Release
A BOSH release can be either an archive file or a git repository. In both cases, it describes a software system that can be deployed with BOSH. For this purpose, it packages up all related binary assets, source code, compilation scripts, configurable properties, startup scripts and templates for configuration files.
BOSH releases are made of “packages” and “jobs”. Roughly, BOSH packages provide something that can be run, and BOSH jobs describe how these things are configured and run.
A BOSH package details the necessary source code, binary assets, and compilation scripts for building a given software component. There are two ways to provide binary “blobs”. In a BOSH release that is provided as an archive file, blobs are directly included. But with BOSH releases that are provided as git repositories, doing the same tends to be problematic when blobs get big. That's why a BOSH release provides a concept of “blobstore”, from where referenced blobs can be fetched. Most BOSH releases use blobstores that are backed by public Amazon S3 buckets, but there are other ways to refer to a private or a local “blobstore” in a BOSH release.
BOSH packages are always subject to a compilation phase, even if this just extracts files from an archive and copies them to the proper target directory. To compile a given package, BOSH spawns an ephemeral compilation instance that only includes any required packages and blobs, as declared by the package specification. In this dedicated instance, BOSH runs the compilation script, and seals the compilation result in its database, so that it can be safely used for reproducible deployments.
BOSH jobs, on the other hand, provide configuration properties, templates for configuration files, and startup scripts. BOSH jobs refer to one or many packages as dependencies. Jobs are also sealed into the BOSH database, but the templates for configuration files are rendered at deploy time, when all configuration properties are resolved. These configuration properties are usually IP addresses, port numbers, user names, passwords, domain names, etc.
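The idea of rendering configuration templates at deploy time can be sketched as follows. BOSH job templates are written in ERB; this example uses Python's `string.Template` as a stand-in, so it shows the principle rather than the actual mechanism. The function `render_template` and the property names `address` and `port` are hypothetical.

```python
from string import Template


def render_template(template_text, defaults, overrides):
    """Resolve properties, then render a configuration file template.

    `defaults` stands for the property defaults declared by a job and
    `overrides` for the values set at deploy time (both hypothetical).
    Overrides win over defaults; a placeholder with no value raises
    KeyError instead of producing a half-rendered file.
    """
    properties = {**defaults, **overrides}
    return Template(template_text).substitute(properties)


rendered = render_template(
    "listen $address:$port\n",
    defaults={"address": "0.0.0.0", "port": 8080},
    overrides={"address": "10.0.0.5"},
)
```

Here `rendered` contains the line `listen 10.0.0.5:8080`: the address comes from the deploy-time override and the port from the job's default.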
Stemcell
A BOSH stemcell packages the basics for creating a new instance. Namely, a BOSH stemcell ships an Operating System image along with a BOSH agent and a copy of monit, which is used to manage the services that will be hosted by the instance. The BOSH agent helps BOSH communicate with the instance throughout its life cycle.
The stemcell concept in BOSH is similar to Virtual Machine Images like Amazon's AMIs, but BOSH stemcells are not meant to be specialized for any particular usage. Instead, BOSH only provides different stemcells for supporting different Operating Systems, or different underlying IaaS providers.
The name “stemcell” originated from the biological term “stem cells”, which refers to undifferentiated cells that are able to grow into diverse cell types later. Similarly, instances created from a BOSH stemcell are identical at the beginning.
After creation, instances are configured with different CPU, memory, storage, and network settings, and different software packages are installed on them. Hence, instances built from the same BOSH stemcell can behave differently.
BOSH Agent
The BOSH agent is a service that runs on every BOSH-deployed VM. It does the following:
- sets up the VM, e.g., configures local disks, configures and formats attached disks, configures networks
- accepts requests from the Director, e.g., pings, job management requests
- manages jobs: starting, stopping, and monitoring health
Deployment
In most cases, users don't work with a deployment manifest as one big YAML file. Instead, deployment manifests are split into smaller files that are easier to maintain. These separate files are merged by tools like spiff or spruce, right before they get uploaded to the BOSH server and deployed.
In a deployment manifest, all configuration properties, as declared by jobs from all referenced releases, can be customized. Different jobs can refer to configuration properties with the same name, in order to share common settings.
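The merging step can be pictured as a recursive merge of nested mappings. The sketch below is a simplified illustration and does not reproduce the actual merge rules of spiff or spruce; `deep_merge` and `merge_all` are hypothetical helpers, and the manifest fragments are represented as plain Python dictionaries rather than parsed YAML files.

```python
from functools import reduce


def deep_merge(base, override):
    """Return a new mapping where `override` is layered on top of `base`.

    Nested mappings are merged key by key; any other value in `override`
    replaces the one in `base`. Neither input is modified.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def merge_all(fragments):
    """Merge manifest fragments in order; later fragments take precedence."""
    return reduce(deep_merge, fragments, {})
```

With this approach, a base fragment can hold settings shared by all environments, while a smaller fragment overrides only the values that differ.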
Key principles
BOSH was purposefully constructed to address the four principles of modern release engineering in the following ways:
Identifiability
Being able to identify all of the source, tools, environment, and other components that make up a particular release. In its concept of “release”, BOSH packages up all related source code, binary assets, configurable properties, compilation scripts, and startup scripts. This allows users to easily track what is actually deployed, and how it is run. Additionally, BOSH provides a way to capture the root filesystems that will be the basis of deployed instances, as single images called “stemcells”. BOSH releases and BOSH stemcells are identified by UUIDs and sealed by SHA-1 checksums.
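To illustrate how a SHA-1 checksum can seal an artifact, the following sketch computes a digest and compares it with a recorded value. It uses Python's standard `hashlib` module; the helper names `sha1_hex` and `matches_checksum` are hypothetical, and the sketch does not describe how BOSH itself stores or verifies checksums.

```python
import hashlib
import hmac


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of `data` as a hexadecimal string."""
    return hashlib.sha1(data).hexdigest()


def matches_checksum(data: bytes, expected_sha1: str) -> bool:
    """Check that `data` still matches the checksum recorded for it."""
    return hmac.compare_digest(sha1_hex(data), expected_sha1.lower())
```

Any change to the artifact, even a single byte, yields a different digest, so a mismatch signals that the content is not the one that was originally recorded.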
Reproducibility
The ability to integrate source, third party components, data, and deployment externals of a software system in order to guarantee operational stability. The BOSH toolchain provides a centralized server for operating the deployed systems. This server holds software “releases”, Operating System images, persistent data, and system configuration. Therefore, a given deployment is guaranteed to reproduce an identical result.
Consistency
The mission to provide a stable framework for development, deployment, audit, and accountability for software components. BOSH achieves such consistency with its software “releases”, which provide a consistent framework for developing and deploying software systems. Moreover, audit and accountability are provided by the BOSH server, which allows users to see and track changes made to the deployed systems.
Agility
Ongoing research into the repercussions of modern software engineering practices, such as continuous integration, on productivity in the software cycle. The BOSH toolchain integrates well with current best practices of software engineering by providing ways to easily create software releases in an automated way and to update complex deployed systems with simple commands.
History
BOSH was designed to address shortcomings found in the tools available for managing Cloud Foundry. Chef was used originally, but it was limited in its ability to package software and to spin servers up and down, and in its monitoring and self-management capabilities. BOSH was originally developed for Cloud Foundry’s own needs, but the project has since grown to be completely generic and can be used to orchestrate other software such as Hadoop, RabbitMQ, MySQL and similar platform or application software.
Architecture
A BOSH installation is made of several separate components that can be split across different VMs or containers:
- A Director that is the “brain” of the server
- The director database, made of a PostgreSQL instance, a Redis instance and a Blobstore for storing compiled packages and jobs
- A Health Monitor that keeps track of instance status
- Many BOSH agents, one on each deployed instance
- A NATS message bus for connecting the Director, the Health Monitor, and all the deployed BOSH agents
- A CPI, which is an executable binary conforming to a specific API
Cloud / Platform / OS compatibility
BOSH connects to the underlying IaaS layer through an abstraction called the CPI. There are CPIs available for Amazon Web Services, certain OpenStack versions, vSphere, and vCloud. Some community-maintained CPIs exist for Google Compute Engine, Microsoft Azure and CloudStack.
Deployment
BOSH can be deployed as a BOSH release, which may create a “chicken or egg” surprise for newcomers.
A BOSH server is not the only software that can deploy BOSH releases. There is a BOSH provisioner project that can deploy BOSH in a VM, a Docker container, or a bare-metal server. This component is used by the BOSH packer provisioner, which creates a Vagrant box running BOSH-lite, which is what most users rely on when learning BOSH.