Cloud Ops Senior Problem Manager

Engineering San Jose, California

Splunk Cloud is looking for a Senior Problem Manager to provide day-to-day leadership to our cloud operations center (CNOC) problem management team. This position is responsible for designing the Problem Management process and driving the continuous service improvements needed to achieve the objectives of the business.



The Problem Manager will lead the problem management team through investigations from start to finish, facilitating root cause analysis and managing the implementation of corrective and preventive measures.

  • Work with Tier 1, 2, and 3 support teams for Cloud Infrastructure to review, agree on, and implement permanent solutions for “Problem” records.
  • Drive efforts to improve overall infrastructure stability and availability by ensuring problem resolution.
  • Develop and monitor metrics and drive continuous infrastructure improvement efforts across teams to achieve SLAs and KPIs for Problem Management.
  • Initiate actions to fix service interruptions caused by errors or faults in the IS infrastructure
  • Produce statistics and reports to demonstrate the performance of the Problem Management process
  • Work with process owners and stakeholders to re-engineer processes to be simple, nimble, repeatable, measurable, achievable and continuously improved
  • Reduce complexity in how Technology delivers IT services, while meeting service level agreements
  • Propose comprehensive, actionable metrics that promote positive behavioral changes; baseline, improve, and re-measure success. Work with the Metrics team to deliver these metrics and use them for continuous improvement
  • Manage relationships with other process management teams to provide a consistent delivery framework
  • Work with the requirements, documentation, and training teams so that processes are implemented in the tools and documented, and process users are trained in their use
  • Evangelize the virtues of problem management and create a collaborative environment
  • Proactively escalate problems and issues
  • Enhance knowledge of the field through participation in professional organizations and self-study



Who you are:


  • 2-4 years in a hands-on management position.
  • Deep understanding of Cloud (AWS, Azure, GCP).
  • ITIL Foundation certified with at least one intermediate certification. ITIL Expert certification is a bonus
  • Good track record of innovation and measurable process improvements
  • Strong technical writing, presentation, and communication skills across multiple levels of the organization, including senior management. Must be able to articulate messages to a variety of audiences and document detailed meeting minutes for the daily Major Problem Management meetings.
  • Self-driven and able to work independently
  • High degree of technical understanding and literacy
  • Collaborative with exceptional social and interpersonal skills.
  • Calm and collected in stressful situations, such as a major service outage.
  • Take-charge personality and the ability to drive a plan to completion.
  • Comfortable working in a dynamic environment with a highly technical team.
  • Demonstrated attention to detail, follow-through, and ability to prioritize quickly.


We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.


Thank you for your interest in Splunk!