Sunday, January 3, 2021

Ops Oriented Development

Opinions expressed are solely my own and do not express the views or opinions of my employer

TLDR; 
 User features are important, and delivering them fast is critical to success, but this doesn't need to be a nightmare for Operations. 


As a user ... 

A familiar story follows, describing what the user needs to complete their task. Every day, thousands of developers around the globe pick up one of these stories and deliver value for their business. Alongside these developers, most firms run a mighty set of L1 and L2 support teams to keep the production systems going. 

Far removed from arguments about GraphQL vs REST, React vs Angular, or Java vs Node, these folks are the real reviewers of our systems. I assume some of them have the following questions when they are asked to support a new application. 
  •  How far are users' expectations from what was built? 
  •  Is the system transparent enough to help answer the user's question? 
  •  How quickly can L1 or L2 answer a question before throwing their hands up and escalating the issue to L3? 
I am sure some teams out there discuss these questions alongside their stellar feature development. Logging, in particular, seems to be something most developers use to help themselves rather than the support teams. Do we have standards across the organization's applications for logging certain information in a certain format? Splunk, Datadog, and similar tools are excellent, but supporting every new application still requires experimenting with queries and application-specific attributes. Multiple teams, and even subgroups within a larger team, tend to handle this in their own special way.
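To make the idea of a shared log format concrete, here is a minimal sketch using Python's standard logging module. The field names (app, user_id, request_id, operation) and the SupportJsonFormatter class are my own illustrative assumptions, not an established standard; the point is only that every application emits the same keys in the same shape.

```python
import json
import logging
from datetime import datetime, timezone

# Hypothetical organization-wide fields; the names are illustrative assumptions.
STANDARD_FIELDS = ("user_id", "request_id", "operation")


class SupportJsonFormatter(logging.Formatter):
    """Render every log record as one JSON object with the same keys."""

    def __init__(self, app: str) -> None:
        super().__init__()
        self.app = app

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "app": self.app,
            "message": record.getMessage(),
        }
        # Always emit the standard keys, even when the caller omitted them,
        # so support queries never have to guess whether a field exists.
        for field in STANDARD_FIELDS:
            entry[field] = getattr(record, field, None)
        return json.dumps(entry)


def get_support_logger(app: str) -> logging.Logger:
    """Return a logger that writes the shared JSON format to stderr."""
    logger = logging.getLogger(app)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(SupportJsonFormatter(app))
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = get_support_logger("orders")
    # Values passed through `extra` become attributes on the log record.
    log.error(
        "Order lookup failed",
        extra={"user_id": "u-123", "request_id": "r-9", "operation": "order.lookup"},
    )
```

With something like this in place, an L1 or L2 engineer can filter on the same keys for every application, whichever log tool the organization uses, instead of learning a new set of attributes each time.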

Intuitive user experiences are highly regarded in today's web and mobile development. Can we offer a similar experience to support and operations teams? Maybe some of you already gather monitoring requirements up front instead of bolting monitoring on as an afterthought once development is done. 

On the deployment side, architectural patterns that abstract away language- and ecosystem-specific nuances have tamed much of the chaos, thanks to container abstractions and shift-left paradigms that bring developers closer to deployment engineering. 

A similar focus on 'running the show' is much needed ...