Professionals who specialize in cloud operations, with a focus on DevOps, a strong understanding of SRE practices including observability (logging, tracing, alerting), and a vision for the use of AI in operations. A consultant with a mix of knowledge and skills in software development and cloud platforms, and the experience to help clients analyze their challenges and to advise on, design, build, test, and deploy changes while maintaining a cloud operating model (e.g. DevOps and ITSM processes and tools).
- seasoned client-facing consultant/architect who has advised IT executives on cloud operating models (e.g. IT processes and tools) and can craft proposals after initial presales meetings
- experience working closely with production operations, application developers, system, network, middleware and database administrators to streamline development, operations and support processes
- experience in leading DevOps teams, establishing pipelines for cloud and application development, and managing the velocity, quality and performance of the cloud and the applications.
- adept at analysis and problem solving, preferably with a blend of platform, middleware, network and software development skills
- very nice to have: experience with consulting methodologies, knowledge management and service offering development (to assist in building cloud practice offerings from sales through delivery)
- apply consulting and engineering skills to solve operations problems by:
  - Defining and driving initiatives to increase the client's overall application development velocity, quality and availability
  - Building the tooling needed to improve DevOps and the observability of development and operations performance/efficiency
  - Enhancing monitoring and management tooling to better detect, diagnose, and correct problems
  - Identifying and resolving defects/problems in the cloud or application code for an incident, when applicable
  - Teaming with application developers to support pipelines for new features and incident response automation
  - Driving the transformation of delivery methods into operational teams such as network, database, system administration and incident management
  - Enabling an AIOps strategy and roadmap to drive more predictive and automated response
  - Performing root cause analysis (RCA) to find, and correct, the source of issues and outages
- Ideally a former developer who knows how to support development with DevOps and SRE automation, including troubleshooting application transactions end to end and identifying critical points of failure or bottlenecks
- DevOps/GitOps understanding with a vision for how to automate analysis, assignments, decisions and actions to support and operate a platform and application
- Cloud-native dashboarding and alerting (at minimum, familiarity with AWS, GCP and Azure, with depth in at least one)
- Experience with scalable cloud native architectures and performance tuning.
- Enjoy solving difficult engineering problems, approach troubleshooting systematically, and are comfortable getting hands-on to guide engineers and operators
- Great communication and planning experience, ideally with a large consultancy background
- Ability to own all or part of an assessment to develop recommendations and a roadmap