Site Reliability Engineer (US Citizen) Skills: Java, Linux, Kubernetes, Ansible, Terraform; Argo is a plus
Description
Engineering stack: Core Java, Kubernetes, Linux, Argo.
POSITION SUMMARY
• Monitor all customer-facing applications and infrastructure to ensure they are working optimally.
• Must be a fast learner who is resourceful and inquisitive.
• Solve support escalation cases by troubleshooting issues and finding opportunities to improve our application’s
• Must be willing to obtain an industry certification (e.g., AWS Solutions Architect, Red Hat RHCE, or HashiCorp Terraform Associate).
• Must quantify failures and availability in a prescriptive manner using SLIs, SLOs, and SLAs.
• Must work efficiently and have an agile mindset.
• Willing to embrace risk, accept failures, and perform post-mortem analysis of those failures.
• Must be self-driven and motivated to expand knowledge quickly.
• Familiarity with implementing gradual changes, phased deployments (such as canary deployments), and intermediate changes.
Qualifications:
• Minimum of 2 years of SRE experience.
• Associate degree or higher preferred, or equivalent related work experience.
Technical Skills
Must have: Core Java skills. Argo experience is a plus.
• Working knowledge of infrastructure-as-code, configuration management, and scripting tools such as Terraform, Ansible, Chef, Python, and CloudFormation.
• Must have a good understanding of VMware hyper-converged infrastructure and architecture, and of SaaS, PaaS, and FaaS solutions.
• Must have working knowledge of Kubernetes, system monitoring, OS-level patching, and overall system and application support.