Senior Member Technical Staff (SMTS) - Cloud Ops
MetricStream is simplifying Governance, Risk, and Compliance (GRC) for modern and digital enterprises. Our market-leading GRC applications enable organizations to strengthen risk management, regulatory compliance, vendor governance, and quality management while driving business performance.
MetricStream is leading the way in enabling companies to deploy GRC applications on the cloud. Built on state-of-the-art virtualization and containerization technologies, the MetricStream GRC Cloud is a fast and easy way for customers to have GRC applications up and running with optimal reliability, security, and scalability (https://www.metricstream.com/technology/grc-cloud.htm).
The Cloud Operations team is responsible for the day-to-day operations and end-to-end delivery of GRC applications in the MetricStream Cloud.
- Strong understanding of n-tier web architecture.
- Experience with provisioning and managing compute infrastructure in public cloud (e.g. AWS, GCP, Azure)
- Hands-on experience with Docker, Kubernetes, and related containerization technologies
- Experience in writing scripts (e.g. Python, Shell, PowerShell, Perl) for automating tasks
- Debugging skills on Linux and Windows with Apache web servers and Tomcat servers
- Must have strong personal initiative and demonstrated capability to work with little management oversight
- Possesses the ability to work with diverse, integrated, deliverable-driven teams to accomplish the larger mission
- Have an outstanding attitude and a desire to ensure customer success
- Good understanding/knowledge of ITIL/ITSM processes
- Experience in managing and monitoring applications on the cloud on a 24x7 basis, and in Site Reliability Engineering
- Strong comprehension, problem-solving, troubleshooting, analytical, and consultative skills
- Strong written and oral communication skills.
- Knowledge of dependent services such as Directory services, certificate management services
Roles & Responsibilities:
- Provision GRC Application instances in MetricStream Cloud using MetricStream’s deployment methodologies that are built on containerization technologies
- Monitor & manage applications & services on MetricStream Cloud to meet uptime SLAs
- Work to quickly resolve incidents, perform root cause analysis and implement solutions to prevent recurrence of similar incidents in the future
- Build automation scripts to deploy, patch, and update software across all GRC application instances in MetricStream Cloud
- Perform diagnostic analysis and tune the environment (at the Web, Application, and Database layers)
- Analyze runtime logs and thread/heap dumps (Java heap and/or thread dumps, GC logs, etc.) as part of tuning and problem determination
- Work closely with the Professional Services team and support staff members as needed.