Operating and administering Internet infrastructure are not the same thing. To the uninitiated, ‘operators’ and ‘administrators’ (systems or network) may seem interchangeable, but there is one important distinction: the primary job of an operator is “risk aversion,” while that of an administrator is to “control, manage, or maintain.” Given that most operators also “control, manage, or maintain,” this is a nuanced distinction. I was recently reminded of it by the Overcast podcast (#4).
In Overcast #4, James Urquhart, Geva Perry, and Greg Ness explore why network engineers are holdouts when it comes to cloud computing. To me, the reason is apparent: network engineering is a mature discipline, and most of the folks engaged in it are, by their nature, network operators more than network administrators, which makes them extremely risk averse.
Of all the cloud computing bloggers, I’m one of the most hands-on and technical. I’ve maintained network, systems, storage, and security infrastructure for almost 20 years now in a variety of roles. This gives me a unique perspective as an actual administrator, and eventually operator, of large-scale infrastructure. That’s why I can say definitively that operators and administrators are not the same thing.
This is how I usually describe an operator:
An operator is a systems or network administrator who, in the midst of a new deployment or change, sees an opportunity to make a minor alteration to the upgrade or change plan that is low risk but high reward in terms of fixing or enhancing the plan or infrastructure, and then chooses not to do it.
Operators understand that even though the odds of something going wrong may be only 1 in 100, or 1 in 1,000, it’s unacceptable to make a change to a plan when that change has not been tested or vetted. This is also how the folks who run utilities or telcos think. Failures will happen, and your job as an operator is to reduce the risk and avoid making bad choices even in the face of temptation. The actual technical skills required to administer the infrastructure are just a byproduct of necessity, not your primary job.
This distinction, and the importance of operating highly available infrastructure 24x7, will only become more critical as cloud computing becomes the de facto standard for building next-generation infrastructure. In effect, every administrator needs to deliver rock-solid cloud services, internal or external, to their consumers.