Principal Site Reliability Engineer (Cloud Infrastructure)
At Palo Alto Networks®, everything starts and ends with our mission:
Being the cybersecurity partner of choice, protecting our digital way of life.
We have the vision of a world where each day is safer and more secure than the one before. These aren’t easy goals to accomplish – but we’re not here for easy. We’re here for better. We are a company built on the foundation of challenging and disrupting the way things are done, and we’re looking for innovators who are as committed to shaping the future of cybersecurity as we are.
Palo Alto Networks is moving rapidly toward a future in which cloud-based applications are increasingly common. As a Site Reliability Engineer, you will develop the frameworks and pathways to help move our internal applications to microservices. You will be a critical link between engineering and the Infrastructure Platform, building Infrastructure as Code and working in partnership with application developers to deploy applications in GCP, AWS, and data centers across the globe.
As a member of the SRE team, you will build mission-critical platforms, tools, and processes that ensure the highest levels of availability and reliability for all our applications. We need creative and innovative problem solvers who can partner with our application development teams to make their services more usable. Our SRE team has a standout opportunity to build tools, frameworks, and cloud platforms that will support our company's growth over the next decade. If you are a self-starter who jumps on new ideas to make the platform more stable, secure, and feature-rich, this is your new career.
- Write automation code for provisioning and operating infrastructure at massive scale
- Design, build and operate Cloud infrastructure to enable reliable and rapid deployment of microservices with effective monitoring and resilient operations
- Work with development teams to make sure applications are production-ready, scalable, and reliable from the ground up
- Identify and drive opportunities to improve automation for code deployment, management, and visibility of application services
- Develop tools and frameworks to automate operational tasks and the deployment of machines, services, and applications
- Establish end-to-end monitoring and alerting on all critical components of the application
- Participate in the on-call rotation supporting the platform and/or the production applications
- Direct root cause analysis of critical business and production issues
- Develop and mentor other SREs on standard methodologies, from infrastructure orchestration to troubleshooting application services in production
- Represent SRE in design reviews and work cross-functionally with Engineering teams on operational readiness
- Expertise in configuration management with a framework such as Salt, Ansible, or Terraform
- Experience in DevOps, Site Reliability, or infrastructure engineering
- Expertise in Google Cloud Platform (GCP) and its related services
- Strong experience with Linux
- Proficiency with a programming language like Python and shell scripting to automate tasks
- Familiarity with CI/CD pipelines and tools such as GitHub, Jenkins, and Artifactory
- Subject matter expertise in one or more technologies, e.g., Elasticsearch, Kafka, Hadoop, GCP, or databases
- Ability to diagnose and troubleshoot complex distributed systems handling high-volume transactions
- Strong fundamentals in HTTP including HTTP headers and web servers
- BS or MS in Computer Science, a related field, or equivalent professional experience
- Excellent problem solving, critical thinking, communication, and teamwork skills
- Excellent written and verbal communication, able to collaborate and rally support
- Self-disciplined, self-managed, and self-motivated, with a strong sense of ownership, urgency, and drive
- Passion for automation and monitoring instrumentation in the code
- Excellent interpersonal skills and the ability to work well in a team
- Passion for learning, understanding, and dissecting new technology stacks quickly and independently
We’re not your ordinary Information Security team. We’re a diverse group of security professionals who embrace challenging the status quo in order to protect Palo Alto Networks and our customers.
Driving innovation on the Information Security team of the fastest-growing high-tech cybersecurity company is a once-in-a-lifetime opportunity. You’ll be joined by the brightest minds in technology, and our global teams are on the front line of defense against cyberattacks.
We’re trailblazers that dream big, take risks, and challenge cybersecurity’s status quo. It’s simple: we can’t accomplish our mission without diverse teams innovating, together.
We are committed to providing reasonable accommodations for all qualified individuals with a disability. If you require assistance or accommodation due to a disability or special need, please contact us at firstname.lastname@example.org.
Palo Alto Networks is an equal opportunity employer. We celebrate diversity in our workplace, and all qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or other legally protected characteristics.