Senior Engineer - Site Reliability/DevOps

Engineering Kanata, Canada


Description

Position at Wind River

Wind River

In a world increasingly driven by software innovation, Wind River is pioneering technologies that accelerate our customers’ digital transformation with a new generation of Mission Critical AI Systems, built for an AI-first world and held to the most exacting standards for safety, security, performance, and reliability.

Imagine being part of a global team that is building the technology foundation for self-driving cars. Life-saving medical breakthroughs. Environmentally sustainable energy. 5G networks. Or safe and smooth landings on Mars. When someone asks, “What did you do today?” you’ll have an awesome answer.

ABOUT THE OPPORTUNITY

Wind River Systems is seeking an experienced, high-performing DevOps/Site Reliability Engineer to operate, maintain and grow a hybrid, multi-cloud platform relied on by our Engineering teams and customers around the globe.

The successful candidate will join a new, highly skilled DevOps/Site Reliability Engineering team responsible for supporting, maintaining and improving the critical infrastructure of our next-generation DevSecOps platform, which pioneers many new industry-leading capabilities.

The successful candidate will collaborate with development and customer-facing teams to proactively resolve issues and ensure a high level of availability and reliability, while also safeguarding the security of our platform.

Responsibilities

  • Manage, monitor, and maintain tools such as GitLab, Jenkins, MinIO, and many more
  • Develop and enhance the monitoring, alerting and reporting for the entire infrastructure
  • Automate everything: upgrades, updates, patches and new deployments should be non-events
  • Continuously document infrastructure and tools, as well as policies and best practices
  • Prioritize and resolve support requests from engineering teams and provide second-level support to customer-facing teams for escalated incidents
  • Provide 24/5 monitoring and on-call support, on a rotating basis; some shifts outside of core business hours may be required  

ABOUT YOU

The successful candidate must have experience in cloud-native software development and be a highly adaptable team player who can quickly ramp up on new technologies and accomplish goals in a fast-paced agile environment. A customer-focused mindset, coupled with strong technical and communication skills, is a must.

Qualifications

  • Experience with Docker and Kubernetes
  • Experience with cloud platforms such as AWS and/or Azure
  • Working knowledge of at least one Infrastructure as Code (IaC) framework and tooling such as Terraform (preferred), Ansible, Puppet or Chef
  • Demonstrated ability to coach junior team members to develop and adhere to best practices, and to lead by example
  • 2+ years of experience with at least one of Python, JavaScript/Node.js, Ruby, Java, C/C++
  • Experience working with Agile methodologies
  • Experience collaborating effectively across remote teams and time zones, and with senior technical leaders
  • 3-5+ years working in one or more of these roles: software development, DevOps/Site Reliability Engineering, or advanced (L2/L3) technical support
  • Experience with Git/GitLab, Jenkins, Jira, MinIO, Artifactory, and code review tools
  • Excellent communication skills, both written and verbal
  • BSEE/BSCS or equivalent experience

 

Wind River is an Equal Opportunity Employer with a commitment to diversity. We prohibit discrimination based on race, color, religion, gender, national origin, age, disability, veteran status, marital status, pregnancy, gender expression or identity, sexual orientation or any other legally protected status.