By Zachary Crownover – Technical Consultant

When you first start out at a company and build its infrastructure from the ground up, you will inevitably have to install the first systems in an inefficient manner. The first few systems will always end up being launched from ISOs, be it by way of CD, DVD, or USB, until you advance your technology and automation stack to the next level. When your initial bottleneck is one person with one piece of physical media at one server, you’re in a situation that doesn’t scale very well. A better target for scalability is one where a single person can spawn multiple new systems and manage multiple systems simultaneously.

In the earliest stages of advancement in your automation, you will invariably need a PXE or network booting system. The unattended installation that runs on top of it is called preseeding on Debian-based systems and kickstarting on Red Hat-based systems, but both ultimately rely on the same network boot technology under the hood.

Your ultimate goal should consist of the following concepts:

  • Moving away from one-off systems
  • Having a baseline of consistency
  • Reducing the amount of time spent bringing up any one system
  • Being able to manage more systems than you were able to before

As you progress beyond several systems and move into tens, hundreds, or even thousands, it becomes ever more important to ensure consistency beyond the initial deployment. It is vital that any change you need to apply to any singular system can be applied equally well to all that you manage. Once you start down this path of automation, removing old hindrances and bottlenecks, anything done manually is a regression.

For people in the Red Hat family, systems such as Spacewalk and Katello can fill in the gaps that configuration management systems leave behind. People on the Ubuntu side of the family will see the same benefits from Landscape. From a web interface, tools like these can help you manage the versions of software installed, for reasons such as patching against CVEs and seeing all systems affected by those CVEs. They can even assist in the provisioning process, often with Cobbler acting as the PXE point, in conjunction with your configuration management system. Any configuration management system you choose should be able to take a minimally provisioned system, with just enough software installed to load your configuration management software, and take care of all the remaining steps to match your desired configuration.

When I first started out in the *nix world, everything I was doing was hand configured and time-consuming. I was only working with a few small groups of systems at a time, and I wasn’t easily able to expand beyond that to manage anywhere near the number that I can and do now. Once you introduce a configuration management system to the mix, you enter a world where systems check in to a master, or a cluster of masters, to find out how they should be configured. A fix to an issue can propagate to thousands of servers simultaneously without a need to log in to each one, one by one. Additionally, a minimal system that you want provisioned for a specific role can simply fall into place by talking to the configuration management system and being set up with everything that it will need.

The advantages of configuration management don’t stop there. When done correctly, you gain deeper insight into what makes up any role of any system you have. Knowing this produces reliable, consistent systems. It also allows you to scale out and reduce downtime at the same time. If a system crashes, you can spin up another in its place just by applying the correct role to it.

Ultimately though, configuration management systems all boil down to the same few things: resources, providers, system information, and often some sort of external lookup system.

Resources are the abstracted aspects of the managed portions of the system. At the most basic level, they tend to be files (optionally templated), services, and packages. There are others, but those are by far the most abundant and most basic ones until you expand further. For an example list of resources, you can see all those available to Puppet at https://docs.puppetlabs.com/references/latest/type.html.
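To make the idea of a resource concrete, here is a minimal sketch in Python of a templated file resource. It is not how Puppet or Chef implement resources; the `FileResource` class, the `render` helper, and the `motd` example are hypothetical names chosen for illustration. The point it shows is that a resource describes a desired state and only makes a change when the system differs from it.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass
class FileResource:
    """Hypothetical file resource: a path plus the content it should hold."""
    path: str
    content: str

    def is_in_sync(self) -> bool:
        # In sync means the file exists and already holds the desired content.
        if not os.path.isfile(self.path):
            return False
        with open(self.path, encoding="utf-8") as handle:
            return handle.read() == self.content

    def apply(self) -> bool:
        """Converge the file; return True only if a change was made."""
        if self.is_in_sync():
            return False
        with open(self.path, "w", encoding="utf-8") as handle:
            handle.write(self.content)
        return True


def render(template: str, values: dict) -> str:
    # Stand-in for a real template engine, using str.format placeholders.
    return template.format(**values)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    motd = FileResource(
        path=os.path.join(workdir, "motd"),
        content=render("Welcome to {hostname}\n", {"hostname": "web01"}),
    )
    print(motd.apply())  # first run writes the file
    print(motd.apply())  # second run finds nothing to change
```

Running the same resource repeatedly is safe, which is what lets a configuration management run be applied to every system on every check-in.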

Providers are the lower-level components that service the resources and are typically abstracted away from you. They’re the parts that define which package managers are supported for a package resource, as well as which package manager is the default for your system. For any given resource, there is at least one provider, which is then selected and used to process the configuration you specify.

System information comes in a variety of forms; Chef users get system information in the form of Ohai attributes, while Puppet users get it in the form of Facter facts. Both are largely similar in what they do, though. They are scripts that run locally on each system, collecting things like hostnames, IP addresses, OS family, OS, and OS version, and that information is then used to choose providers. Given your information, the configuration management system determines which providers are available for any given resource and assigns the default. This is the core of the configuration management system that really powers the decision-making ability of everything else.
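The relationship between system information and providers can be sketched roughly as follows. This is an illustration only, not the selection logic of any real tool: the `DEFAULT_PACKAGE_PROVIDER` table, the fact names, and the helper functions are all assumptions made for the example, and the OS family is supplied by hand rather than detected.

```python
import platform
import socket

# Hypothetical mapping from OS family to the default package provider.
DEFAULT_PACKAGE_PROVIDER = {
    "debian": "apt",
    "redhat": "yum",
}


def collect_facts() -> dict:
    """Gather a few local facts, loosely in the spirit of fact collection."""
    return {
        "hostname": socket.gethostname(),
        "kernel": platform.system(),
        "kernel_release": platform.release(),
    }


def select_package_provider(facts: dict) -> str:
    """Pick the default package provider for the reported OS family."""
    family = facts.get("os_family", "").lower()
    try:
        return DEFAULT_PACKAGE_PROVIDER[family]
    except KeyError:
        raise LookupError(f"no package provider known for OS family {family!r}")


def install_command(package: str, facts: dict) -> list:
    # The resource only names the package; the provider supplies the command.
    provider = select_package_provider(facts)
    return [provider, "install", "-y", package]


if __name__ == "__main__":
    # os_family is added manually here; real tools detect it for you.
    facts = {**collect_facts(), "os_family": "Debian"}
    print(install_command("nginx", facts))
```

The same package declaration yields a different command on each OS family, which is why the configuration itself never needs to mention a package manager.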

Lastly, most also have a concept of data stored outside the actual configuration itself. In Puppet, this is done with exported resources, formerly stored via storeconfigs and now in PuppetDB. In Chef, it’s a built-in feature accessed using node search. With features like these, configuration data does not have to be defined explicitly; it can be defined and retrieved dynamically from other systems. For instance, every web app might have the role of ‘webapp’ in your configuration management system, and every proxy might have a role of ‘proxy’. The proxies could have their configuration in a template, rather than being hardcoded, with a small piece of code that queries the configuration management backend server for all systems that have the ‘webapp’ role attached to them. In this way, you can have a very dynamic configuration.
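The proxy example can be sketched as below. The in-memory `NODES` list is a hypothetical stand-in for whatever the backend (PuppetDB, Chef node search) would return, the hostnames and addresses are made up, and the nginx-style `upstream` output is just one assumed target format for a proxy configuration.

```python
# Hypothetical inventory, standing in for the result of a backend query.
NODES = [
    {"name": "web01", "ip": "10.0.0.11", "roles": ["webapp"]},
    {"name": "web02", "ip": "10.0.0.12", "roles": ["webapp"]},
    {"name": "lb01", "ip": "10.0.0.2", "roles": ["proxy"]},
]


def search_nodes(nodes, role):
    """Return every node that carries the given role, sorted by name."""
    return sorted((n for n in nodes if role in n["roles"]), key=lambda n: n["name"])


def render_upstream(nodes, port=8080):
    """Render an nginx-style upstream block listing all 'webapp' nodes."""
    lines = ["upstream webapp {"]
    for node in search_nodes(nodes, "webapp"):
        lines.append(f"    server {node['ip']}:{port};  # {node['name']}")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_upstream(NODES))
```

Adding a third web app node to the inventory changes the rendered proxy configuration on the next run, with no edit to the template.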



The capabilities are limitless. Nearly all the major configuration management projects out there are open source. If they don’t do what you need, you can add the feature yourself and submit it upstream. These skills are in high demand, and they’ll make your life in the world of systems administration much easier when you have an otherwise unmanageable number of systems to take care of at all times. Happy hacking.