Cloud Site Reliability Engineer

Software Engineering | Bonifacio Global City, Philippines


Description

Responsibilities

  • Design and implement monitoring and observability for AWS EKS-based systems.
  • Build scalable, cost-efficient infrastructure and deployment models using Terraform.
  • Partner with product and support teams to improve reliability and customer experience through automation.
  • Drive incident response, root cause analysis, and continuous improvement initiatives.
  • Participate in on-call rotations to ensure high service availability.

Qualifications

  • 1-3 years of experience in SRE, DevOps, or Software Engineering roles supporting large-scale cloud systems.
  • Strong hands-on experience with AWS and Kubernetes (EKS).
  • Skilled with Datadog or similar monitoring tools (e.g., Grafana, Prometheus).
  • Proficient in Terraform for infrastructure as code.
  • Familiarity with CI/CD pipelines and DevOps best practices.