Senior Member Technical Staff (SMTS) - Cloud Ops

Cloud Platform Development [US], Technology Training, Palo Alto, California


Job Description

 

Designation: Senior Member Technical Staff (SMTS) - Cloud Ops

Reports to: Engineering Manager

Team: Cloud Operations

Experience: 3 years or more

Education: BS in Computer Science or equivalent

Certifications: AWS certification or equivalent preferred

 

MetricStream is simplifying Governance, Risk, and Compliance (GRC) for modern and digital enterprises. Our market-leading GRC applications enable organizations to strengthen risk management, regulatory compliance, vendor governance, and quality management while driving business performance.

MetricStream is leading the way in enabling companies to deploy GRC applications on the cloud. Built on state-of-the-art virtualization and containerization technologies, the MetricStream GRC Cloud is a fast and easy way for customers to have GRC applications up and running with optimal reliability, security, and scalability (https://www.metricstream.com/technology/grc-cloud.htm).

The Cloud Operations team is responsible for the day-to-day operations and end-to-end delivery of GRC applications in the MetricStream Cloud.

 

Mandatory Skills:

  • Strong understanding of n-tier web architecture
  • Experience provisioning and managing compute infrastructure in a public cloud (e.g., AWS, GCP, Azure)
  • Hands-on experience with Docker, Kubernetes, and related containerization technologies
  • Experience writing scripts (e.g., Python, Shell, PowerShell, Perl) to automate tasks
  • Debugging skills on Linux and Windows for Apache web servers, Tomcat servers, and databases
  • Strong personal initiative and a demonstrated ability to work with little management oversight
  • Ability to work with diverse, integrated, deliverable-driven teams to accomplish the larger mission

Preferred Skills:

  • Outstanding attitude and a desire to ensure customer success
  • Good understanding of ITIL/ITSM processes
  • Experience managing and monitoring cloud applications on a 24x7 basis, including Site Reliability Engineering (SRE) practices
  • Strong comprehension, problem-solving, troubleshooting, analytical, and consultative skills
  • Strong written and oral communication skills
  • Knowledge of dependent services such as directory services and certificate management services

 

 

Roles & Responsibilities:

  • Provision GRC Application instances in MetricStream Cloud using MetricStream’s deployment methodologies that are built on containerization technologies
  • Monitor and manage applications and services on MetricStream Cloud to meet uptime SLAs
  • Resolve incidents quickly, perform root cause analysis, and implement solutions to prevent recurrence
  • Build automation scripts to deploy, patch, and update software across all GRC application instances in MetricStream Cloud
  • Perform diagnostic analysis and tune the environment at the web, application, and database layers
  • Analyze runtime logs and JVM diagnostics (Java heap and thread dumps, GC logs, etc.) as part of tuning and problem determination
  • Work closely with the Professional Services team and support staff as needed