Site Reliability Engineer

Computers/Software Bangalore, India ReqID:4493


Description

Role & Responsibilities: 

 

- Build and maintain EKS clusters and Lambda functions. This includes automatic failure remediation, application and systems deployment, and capacity planning.

- Develop and maintain Cloud Infrastructure (networking, load balancers, K8s clusters, EC2) through Infrastructure as Code and other automated approaches.

- Create automation tools for deployment and infrastructure monitoring.

- Test applications and infrastructure (functional, performance, reliability, and failover).

- Be a champion of best practices and help develop the skills of less experienced or differently skilled engineers within our Cloud Engineering team and in the broader SRE organization.

- Share operational burden with other teams, assisting in deployment efforts, troubleshooting, incident response, and proactive engineering efforts.

- Work within industry-standard team collaboration and development tools (e.g. issue tracking, source code management).

- Collaborate with supporting teams (cloud engineers, database administrators, network engineers, security office, and more) to develop production systems.

- Collaborate with Software Engineering teams to optimize the availability, reliability, and performance of production services.

 

Technical Skills

 

·       5-6 years of experience developing or managing services in a distributed, internet-scale production environment.

·       Deep knowledge and experience with containerization technologies (Docker, Kubernetes, Helm) and managed Kubernetes solutions (EKS, GKE, AKS).

·       AWS experience: VPC, EC2, Auto Scaling, ELB, SNS, RDS, EBS, S3, Route 53, CloudWatch, CloudFormation, IAM, and security.

              1) Sound knowledge of Kubernetes

              2) In-depth knowledge of DevOps implementation

              3) GitLab

·       Working knowledge of monitoring and observability (SLI/SLO/SLA methodology)

·       Strong knowledge of scripting languages (shell scripting, Python)

·       Strong debugging skills

·       Monitoring tools: Splunk, New Relic

·       Strong Linux/UNIX troubleshooting skills, with prior end-to-end working experience on Linux operating systems