Site Reliability Engineer - 2463

Technology Calgary, Alberta Remote, Canada


Description

Site Reliability Engineer

Why YOU want this position
Enverus is the leading energy SaaS company delivering highly technical insights and predictive/prescriptive analytics that empower customers to make decisions that increase profit. Enverus’ innovative technologies drive production and investment strategies, enable best practices for energy and commodity trading and risk management, and reduce costs through automated processes across critical business functions. Enverus is a strategic partner to more than 6,000 customers in 50 countries. 
We are currently seeking a Site Reliability Engineer to join our Software Development team. This role offers the opportunity to join a rapidly growing company delivering industry-leading solutions to customers in the world’s most dynamic and fastest-growing sector.  
This role is designed for a junior team member who has exposure to systems like Kubernetes, Nomad, ECS, or Container Apps. The ideal candidate, while junior in experience, is highly self-driven to explore technologies and how an infrastructure team can use them, and considers deep-diving into the inner workings of a technology fun. While this is a remote-first position, preference is given to candidates local to Calgary to facilitate training. This team gets to solve cool problems at scale, with great people.
Day to Day
The day-to-day focus of this position involves monitoring and maintaining the existing Kubernetes clusters, analyzing performance and observability data in Grafana, interacting with development teams, and training.
Performance Objectives
  • Work on a team that manages our global AWS and Azure presence
  • The team you will be working with is responsible for keeping our Kubernetes infrastructure humming as new releases and maintenance updates are rolled out
  • You will help organize, secure, and automate existing infrastructure and deployments
  • You will work closely with developers to provide feedback and drive operational improvements within our products and operations infrastructure
  • You will be responsible for ensuring that our platform is stable and balanced
  • Maintain high site uptime, while embracing rapid change and growth
  • Scale infrastructure to meet increasing demand and evolving technology
  • Help the dev teams working on our codebases achieve zero-downtime deployments
  • Develop and improve operational practices and procedures
  • Implement, monitor, and maintain CI/CD frameworks
  • You will coordinate and participate in on-call rotations
Competitive Candidate Profile
  • 2+ years of professional Windows and Linux server administration
  • 1+ years of Azure administration
  • 2+ years of experience within a high-performance, 24x7 DevOps, SysOps, or Operations team
  • You have excellent communication and collaboration skills
  • You demonstrate the ability to succeed in a high-pressure environment with rapidly changing priorities
  • You are an excellent problem solver who is willing to roll up your sleeves and take on any issue thrown your way
  • You have a desire not just to resolve problems, but to fully understand them and prevent them in the future
  • You seek out opportunities to improve, fix bugs, and challenge assumptions
  • You have experience working with global teams (North America, Europe, Asia)
  • You have experience with, or are familiar with, the following technologies:
    • Docker
    • Container Orchestration (Nomad, Kubernetes, ECS, Container Apps)
    • Configuration Management tools (Chef, Puppet, Ansible)
    • Infrastructure as Code (Terraform, CloudFormation)
    • C#, Golang, or Python programming experience is a plus


This role is eligible for: Variable Compensation