Guidance on how to make your environment easier to onboard for Web Ops Engineers, SREs and DevOps Practitioners

So you want to Onboard a DevOps Practitioner

Author: Martin Jackson - @actionjack

At the moment everyone seems to be so concerned with recruiting DevOps Engineers, but I feel the process of onboarding them is still very hit and miss, especially in busy organisations.

Making it easy to get work done from day one

Reduce the time new engineers spend learning the peculiarities of complex environments instead of improving or iterating on them. The aim is to make every engineer effective in the shortest possible amount of time.

Here is some guidance on how to make your environment easier to onboard.

Culture

Aim to create a culture of empathy and psychological safety
  • Do not create a Blame and Train culture, where mistakes are handled by blaming and shaming the employee (and sometimes terminating their employment) and then training other employees using the incident as an example
  • Introduce the new engineer(s) to the relevant people within the organization
  • Remember that not everyone may be as smart as you are; they may be missing:
    • Context / Situational awareness (how did we get from here to there?)
    • Tribal Knowledge
    • Culture
  • What are the Preferred practices or "Design Principles"?
  • Listen to their point of view. Bringing in a new person is a prime opportunity to find out where the code needs improvement
  • Test your mentoring and onboarding process to flush out any shortfalls by getting the last person who joined to mentor the new joiner.
  • Make your documentation inclusive e.g. this document is parsed using alex in order to catch insensitive and inconsiderate writing.
  • Be wary of overloading new starts with too much information. There is often quite a lot to learn (even more than you think); instead, provide a set of useful links so people can research at their own pace.

Have up to date Documentation

Make it easy to understand and do the things

It's important to either have or do the following:

  • Regularly tidy your documentation: remove old documents, update outdated ones, and if you touch it then update it
  • A high-level logical architecture, ideally written in a Git-friendly format
  • An overview of the company’s infrastructure.
  • Systems integration points and their third party dependencies
  • An intranet/wiki or enterprise social network for learning about the different teams and their key members, with pictures. On day one, it is easy to get overwhelmed by lots of new names and faces.
  • Have documentation for your alerts. If something is important enough to page the on-call person about, it's important enough to have a runbook entry about it. If you alert because foo queue is too long, there should be a runbook entry describing how to fix it.
  • Create a Glossary of Terms [e.g. a Minipedia] for describing any organisation specific acronyms or terms
  • Write your documentation as if it's going to be open to public scrutiny someday.
  • Have a collection of shared resources that is easy to use and set up, e.g. a bookmark file of URL links or .ssh/config files
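
The rule that every paging alert needs a runbook entry can be checked automatically. The following is a minimal sketch, assuming alerts are described as plain dictionaries with hypothetical `name` and `runbook_url` fields; real alerting systems describe alerts differently, so treat the field names and the example URL as placeholders.

```python
from typing import Iterable, List, Mapping, Optional


def alerts_missing_runbooks(alerts: Iterable[Mapping[str, Optional[str]]]) -> List[str]:
    """Return the names of alerts that have no runbook link.

    The "name" and "runbook_url" keys are hypothetical; adapt them to
    however your alerting system describes its alerts.
    """
    missing = []
    for alert in alerts:
        # Treat a missing key, None, or a whitespace-only value as "no runbook".
        url = (alert.get("runbook_url") or "").strip()
        if not url:
            missing.append(alert.get("name") or "<unnamed alert>")
    return missing


if __name__ == "__main__":
    example_alerts = [
        {"name": "foo_queue_too_long",
         "runbook_url": "https://wiki.example.com/runbooks/foo-queue"},
        {"name": "disk_almost_full", "runbook_url": ""},
    ]
    for name in alerts_missing_runbooks(example_alerts):
        print(f"Alert '{name}' has no runbook entry")
```

A check like this could run in CI against your alert definitions, so an alert without a runbook fails the build instead of surprising the on-call person.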

Operations

Make it easy to get shit done

“Complexity exacts a staggering tax on your humans. Good Ops engineers attempt to pay down that tax.”

Charity Majors

  • Simplify and reuse as much of your architecture as possible
  • Have your work structured so people can see what needs to be done, e.g. a Kanban board, a backlog or to-do lists
  • Provide information regarding the applications that are maintained by the team and how to do the operations for those applications
  • Make it difficult to make mistakes
  • Ensure your naming conventions make sense: if something is called build_X and it actually deploys X, then change the name to deploys_X if possible to reduce confusion

"it's possible for good people, in perversely designed systems, to casually perpetrate acts of great harm on strangers, sometimes without ever realising it."

Ben Goldacre, Bad Pharma, p. xi

  • With the above in mind, nobody should be able to do something catastrophic to an environment unless they are determined to do so, i.e.
    • Make the right thing the easy thing to do by creating safety harnesses, using build or scripting tools to do the most common tasks safely without the worry of screwing up
    • Put safe conditionals in your configuration management so you can do test runs without the worry of screwing up, e.g. Ansible tasks:
- name: "Do something really dangerous"
  command: /sbin/something --could --be --dangerous --if --run --it --in --prod
  when: testmode == "Off"
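
The same idea as the Ansible conditional can be applied to a scripted safety harness. This is a minimal sketch and not a definitive implementation: the function name `run_guarded`, the `environment` labels and the `confirm` parameter are all hypothetical. It assumes that the safe default is a dry run, and that running against production should need an extra, deliberate step.

```python
import shlex
import subprocess
from typing import List, Optional


def run_guarded(command: List[str], environment: str,
                confirm: Optional[str] = None, dry_run: bool = True) -> str:
    """Run a command only when the caller has explicitly opted in.

    - dry_run defaults to True, so the default action is to describe the
      command rather than execute it.
    - For the (hypothetical) "prod" environment the caller must also pass
      confirm="prod" before anything is executed.
    Returns a short description of what happened.
    """
    printable = " ".join(shlex.quote(part) for part in command)
    if dry_run:
        return f"DRY RUN ({environment}): {printable}"
    if environment == "prod" and confirm != "prod":
        raise PermissionError("Refusing to run against prod without confirm='prod'")
    # check=True raises CalledProcessError if the command exits non-zero.
    subprocess.run(command, check=True)
    return f"EXECUTED ({environment}): {printable}"


if __name__ == "__main__":
    # Nothing is executed here because dry_run defaults to True.
    print(run_guarded(["/sbin/something", "--could", "--be", "--dangerous"], "prod"))
```

With this shape, a new starter who copies a command from the docs gets a description of what would happen, and has to be determined (two explicit arguments) to change production.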

Processes

How should we do stuff
  • Have shovel-ready work for new starters: create a backlog of work that can be easily done by a new starter:
    • Ideally work that:
      • is well defined,
      • is easily explained,
      • requires some research,
      • adds value and;
      • is not grunt work e.g. document X.
  • Assign your new start an onboarding buddy/mentor
  • Pair with the new start as soon and as often as possible
  • When [and if] you do a Retro, base it against a known good baseline, i.e.
    • If you are doing production deploys in the wee hours of the night and they go successfully, remember that this does not necessarily reflect a good deployment.
  • Put as much detail into tasks / stories as possible, including assumptions, reference information and existing implementations. Attempt to narrow down the acceptance criteria in order to prevent unnecessary research or rework.
  • Avoid onboarding during crunch times (important or critical planned releases)
  • Ideally have your accounts linked with some central or shared directory, e.g. GitHub/Google/LDAP, so your new starters don’t have to remember 101 user/password combinations
  • In your alerting system, put context-sensitive help that points to a helpful runbook
  • Provide configuration management test modes, e.g. testing_mode on
  • Add or invite the individual to any relevant Slack or IRC channels and mailing lists.
  • Provide information regarding relevant processes e.g.
    • Incident, problem and change management
    • Deploying changes / releases to the different environments
    • Ordering infrastructure / tools
    • Authorization for tools & applications
    • Use of test environments and creating and using test data
  • Have clean code. It really helps if your code is good and is sensibly organized and structured. If the code base is large, it should be broken apart into understandable segments
  • Create a Papercuts.md in your repos. This is a log of things that have hurt us in the current environment; they may not be actual technical debt, however they could be things for us to discuss and possibly fix in the future.
  • If you have adopted a particular coding style guideline on your project, then document or reference it so new joiners can easily find and adopt it
  • Story kickoffs can be extremely useful to new starters by helping them get into the mindset of the team, identifying areas that aren't immediately visible in the code base and generally reducing constant rework due to missing acceptance criteria.

Version control management

How do we safely change things
  • Document your coding standards and strategies in the open
  • Have up-to-date README documentation in all repos
  • If at all possible, make Pull Requests a first-class citizen; nothing is more demoralising than having a Pull Request sitting around without feedback or a chance of being merged.
  • Good Pull Requests can also be an excellent teaching tool for new starts and old hands alike. A good PR tells you what was implemented, why and how, so if you need to do something similar in the future it will make things a lot easier than relying on your memory or tribal knowledge.
  • If you use Slack or something similar, consider adding a notification bot for pull request and push activities, e.g. for Bitbucket or GitHub, to notify your colleagues that a Pull Request is ready for review.
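
A pull request notification does not need to be elaborate. Below is a minimal sketch, assuming a Slack-style incoming webhook that accepts a JSON body with a `text` field. The function names, the message wording and the webhook URL are hypothetical, and the hosted Bitbucket and GitHub integrations may well be a better fit than rolling your own.

```python
import json
import urllib.request


def build_pr_notification(repo: str, title: str, author: str, url: str) -> dict:
    """Build a {"text": ...} payload announcing a pull request.

    The message format is an illustration, not a standard.
    """
    text = f"Pull request ready for review in {repo}: {title} (by {author})\n{url}"
    return {"text": text}


def send_notification(webhook_url: str, payload: dict) -> int:
    """POST the payload as JSON to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    payload = build_pr_notification(
        repo="infrastructure",
        title="Add runbook link to foo queue alert",
        author="new-starter",
        url="https://git.example.com/infrastructure/pull/1",
    )
    # Printing instead of sending; call send_notification() with your own
    # webhook URL to actually post the message.
    print(json.dumps(payload, indent=2))
```

Keeping the payload builder separate from the sender means the message format can be tested without touching the network.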

Development environments

How do we safely change things

Useful links

Would you like to know more?

See a problem here

See a problem? Need something clarified? Raise an issue and I'll try to fix it.

Contributing

I'm open to well-structured Pull Requests:

  1. Fork it!
  2. Create your feature branch: git checkout -b my-new-feature
  3. Commit your changes: git commit -am 'Add some feature'
  4. Push to the branch: git push origin my-new-feature
  5. Submit a pull request :D

License

MIT © Martin Jackson
