This posting has been closed
SRE (Site Reliability Engineer)
Must-have skills:
- ITIL/ITSM implementation experience for application and infrastructure operations
- Strong foundation in SLM and performance metrics
- Experienced in establishing Incident, Problem, MIM, and RCA processes for large operating environments across legacy and cloud
- Experienced in architecting ESM and ITSM tools
- Business process dashboarding and metric reporting
- Business process and IT Service mapping experience
- Lead cross-organizational engineering efforts to implement processes and training that quickly arm operations teams with deeper insight into application performance and service health issues, reducing their MTTR and MTTA for live site issues
- Continuously improve SRE processes by actively participating in Correction of Error reviews and collaborating with engineering leadership to identify and resolve visibility gaps
- Develop and maintain processes, tools, and documentation
Other soft skills/aptitude
- Proven track record of advising senior IT and business leaders
- Demonstrated ability to work with distributed cross-functional teams
- Comfortable navigating ambiguity and adversity
- User-centered, strategic mindset
- Research, analysis, and collaborative decision-making skills
- Ability to break a problem into solution requirements and design a solution that meets those requirements
- Strong presentation and documentation skills
- Business case development and organizational leadership experience
Detailed job responsibilities
- Establish Incident, Problem, MIM, and RCA processes for large operating environments across legacy and cloud
- Architect ESM and ITSM tools
- Lead cross-organizational engineering efforts to implement processes and training that quickly arm operations teams with deeper insight into application performance and service health issues, reducing their MTTR and MTTA for live site issues
- Develop and maintain processes, tools, and documentation