Site Reliability Engineer (US Citizen). Skills: Java, Linux, Kubernetes, Ansible, Terraform; Argo is a plus

Information Technology United States


Description

Engineering stack: Core Java, Kubernetes, Linux, Argo.

POSITION SUMMARY

• Monitor all customer-facing applications and infrastructure to ensure they are working optimally.

• Must be a fast-learning individual who is resourceful and has an inquisitive mindset.

• Solve support escalation cases by troubleshooting issues and finding opportunities to improve our application’s

• Must be willing to obtain an industry certification (e.g., AWS Solutions Architect, Red Hat RHCE, or HashiCorp Terraform Associate).

• Must quantify failures and availability in a prescriptive manner using SLIs, SLOs, and SLAs.

• Must work efficiently and have an agile mindset.

• Must be willing to embrace risk, accept failures, and perform post-mortem analysis of those failures.

• Must be self-driven and motivated to expand knowledge quickly.

• Familiarity with implementing gradual changes, phased deployments (such as canary deployments), and intermediate changes.

Qualifications:

• Minimum of 2 years of SRE experience.

• Associate degree or higher preferred, or equivalent related work experience.

Technical Skills

Must have: Core Java skills. Argo experience is a plus.

• Working knowledge of scripting languages and infrastructure-as-code tools such as Terraform, Ansible, Chef, Python, and CloudFormation.

• Must have a good understanding of VMware hyper-converged infrastructure and architecture, as well as SaaS, PaaS, and FaaS solutions.

• Must have working knowledge of Kubernetes, system monitoring, OS-level patching, and overall system and application support.