In the days of yore
Infrastructure management was a chore
When some stuff hit the proverbial fan
You did ssh or a script you ran
You prayed it didn’t happen on your shift
And nobody caught your configuration drift
And I finally ran out of rhymes
But those were good times!
Sorry about that. The point I am trying to make (in bad poetry) is that it is quite common to fix small problems by logging on to production servers directly and then back-propagating the change to wherever you keep your configuration. This is such a common practice that there are a lot of tools available that will sync your configuration between production (or even staging, QA and Dev) and your source of truth for configuration. In the ITIL world, if you store your entire configuration in a CMDB, any product worth its salt offers a discovery tool that facilitates this kind of sync. Conversely, if you manage your configuration with software like Chef or Ansible, you might make the change at the source and push it out to the servers. Or you might make the change in both places at the same time to keep them in “sync”.
The problems with this approach are immediately clear. Configuration drift can happen between the servers and the source of configuration. Patches have to be reapplied every time a new server is built. Long-running servers can begin to rot and smell. Servers can become prima donnas that need to be handled in a certain way by certain people, or they refuse to work. In short, there are lots of avenues and opportunities for things to go wrong.
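To make configuration drift a little more concrete, here is a minimal Python sketch of how one might detect it by comparing each server's configuration against the source of truth. The function names, server names and configuration keys are all made up for illustration, and the configurations are assumed to be plain dictionaries that have already been fetched somehow. Real discovery tools do far more than this.

```python
import hashlib
import json


def config_fingerprint(config: dict) -> str:
    """Return a stable SHA-256 hash of a configuration dictionary."""
    # Sorting keys makes the hash independent of key order.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_drifted_servers(source_of_truth: dict, servers: dict) -> list:
    """Return the sorted names of servers whose config differs from the source of truth."""
    expected = config_fingerprint(source_of_truth)
    return sorted(
        name for name, cfg in servers.items()
        if config_fingerprint(cfg) != expected
    )


def diff_config(expected: dict, actual: dict) -> dict:
    """Map each differing key to an (expected, actual) pair; missing keys show as None."""
    keys = set(expected) | set(actual)
    return {
        key: (expected.get(key), actual.get(key))
        for key in sorted(keys)
        if expected.get(key) != actual.get(key)
    }


if __name__ == "__main__":
    # Hypothetical data: someone hand-edited web-2 during an incident.
    source = {"nginx_workers": 4, "tls": "1.2"}
    fleet = {
        "web-1": {"tls": "1.2", "nginx_workers": 4},
        "web-2": {"nginx_workers": 8, "tls": "1.2"},
    }
    for name in find_drifted_servers(source, fleet):
        print(name, diff_config(source, fleet[name]))
```

Hashing gives a cheap yes/no answer per server, and the key-by-key diff is only computed for the servers that actually drifted.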
The theory of immutable infrastructure states that infrastructure should be built from golden images and that, once built, no changes should be made to it. If a change is necessary, it should be made to the base image, and the infrastructure should be redeployed by destroying the old deployment. Each new image is versioned and checked into a version control system. Thus many images can be in use simultaneously, and older images can be gracefully phased out as old servers are destroyed in favour of the new image. Moreover, even if no change is necessary, services in production should be routinely destroyed and recreated to make sure that no configuration drift has occurred and that they adhere to the base image. Immutable infrastructure brings many advantages with it. It prevents configuration drift and divergence. It prevents the problems of long-running servers, like memory leaks and bloat. New versions of code, patches and updates can be released faster.
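The replace-instead-of-modify idea can be sketched in a few lines of Python. This is a toy model and not any real provisioning API: the Image and Server classes and the launch and redeploy helpers are hypothetical names invented for this illustration. Frozen dataclasses stand in for the rule that a server is never changed once it is built.

```python
from dataclasses import dataclass
from itertools import count
from typing import List, Set


@dataclass(frozen=True)
class Image:
    """A versioned golden image (hypothetical model)."""
    name: str
    version: int


@dataclass(frozen=True)
class Server:
    """A server built from exactly one image; frozen, so it cannot be edited in place."""
    server_id: int
    image: Image


_next_id = count(1)  # simple id generator for newly launched servers


def launch(image: Image) -> Server:
    """Build a brand-new server from the given image."""
    return Server(server_id=next(_next_id), image=image)


def redeploy(fleet: List[Server], image: Image) -> List[Server]:
    """Replace every server with a fresh one built from `image`.

    Passing the current image models the routine destroy-and-recreate
    cycle that is done even when nothing has changed.
    """
    return [launch(image) for _ in fleet]


def images_in_use(fleet: List[Server]) -> Set[Image]:
    """Return the set of image versions currently running in the fleet."""
    return {server.image for server in fleet}


if __name__ == "__main__":
    v1, v2 = Image("web", 1), Image("web", 2)
    fleet = [launch(v1), launch(v1)]
    fleet = redeploy(fleet, v2)  # the change went into the image, not the servers
    print(images_in_use(fleet))
```

Trying to assign to a field of a Server raises an error, which is the point: the only way to change what is running is to build a new image and launch new servers from it.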
However, immutable infrastructure remains a strict discipline and is very difficult to follow. Since each small fix causes a new image to be built, it can cause image sprawl. Immutable infrastructure also limits everyone to a single way of pushing changes to all environments, namely building images, with no option for a binary push or a configuration push. Depending on the change and the setup, this can be a time-consuming and costly way to deploy. There is also the question of what should cause an image to be built. Is a small change in production allowed, or does it defeat the purpose of immutable infrastructure? What is the business impact of such a decision? The answer may lie somewhere in between. Also, even if a server is designated immutable, the other artifacts that the server interacts with, like data, web pages and caches, are not immutable. This will either force you to restructure your application, or it will mean that only a very particular set of applications can leverage the power of immutable infrastructure. Until then, immutable infrastructure remains a holy grail that very few can attain in its purest form.
Do you have experience with immutable infrastructure? Think you can write a better poem than me? Please leave a comment below.