Senior Site Reliability Engineer

Engineering | Austin, Texas | Remote - United States


The Cloud Platform team at Ziff Media Group is looking for team members who are excited and passionate about rethinking the way we deploy applications, maintain our infrastructure and monitor our applications. We define SLAs, incident response procedures and deployment processes while leading by example.

 As a member of our team, you’ll be exposed to technologies such as Kubernetes, Prometheus, Helm, Terraform, Go and much more. 

 As a Senior Site Reliability Engineer, you will have the opportunity to use Go to expand our infrastructure through the Kubernetes API, build tools that increase developer productivity, and create custom Terraform providers.

 If you understand Go and have experience with, or are looking to get into, Site Reliability Engineering, now is the time!

Who You Are

  • You have at least 4 years of Go programming experience.
  • You have expertise within the container and container orchestration space (Docker, Kubernetes, etc.).
  • You bring a deep understanding and application of computer science fundamentals: data structures, algorithms, and design patterns.
  • You understand networking protocols (TCP/IP, HTTP, DNS, etc.).
  • You have a track record of delivering successful solutions and collaborating with others.
  • You have strong interpersonal skills and are familiar with or interested in learning SRE concepts.

What You'll Do

  • You’ll use Go, the Kubernetes API, and Terraform to create custom Operators and Providers, building native Kubernetes and Terraform packages.
  • You'll be working closely with Docker and Kubernetes to containerize applications and provision infrastructure.
  • You will use and teach various Cloudflare services so engineering teams can meet high SLAs for their applications.
  • You will investigate new technologies and tools and recommend those that best fit the team and organization.
  • You will champion SRE methodologies around monitoring, distributed tracing, deployment strategies (e.g. canary, sandbox), and logging.
  • You will identify and execute on opportunities to optimize existing systems, improve infrastructure, and eliminate work through automation.
  • You will educate other engineering teams and advocate for scalable and maintainable architectural decisions.
  • You will participate in our on-call rotation for production services.

Who We Are

  • We have an open environment where engineers are given a lot of responsibility and the freedom to make a huge impact.
  • We have lots of smart people to work with and learn from.
  • We work on large scale challenges with a variety of technologies and believe in an ever-growing diversity of technology platforms.
  • We believe in giving prizes, bonuses, and recognition for doing what you enjoy.
  • We have a phenomenal open vacation policy.

This is a remote or office-based position that may be performed anywhere in the United States except the state of Colorado.