Jr. Site Reliability Engineer

Site Reliability Mumbai, India


Description

Must-Have:
3 years of experience in a production environment
Working knowledge of Linux and networking
Track record of improving and maintaining monitoring tools (Icinga, Prometheus, Grafana, OpenTSDB)
Incident management skills - must be able to own, collaborate on, and resolve large-scale incidents under time pressure
Troubleshooting skills to hunt down the root causes of issues, and the persistence to prevent them from recurring
Experience managing large numbers of diverse systems with configuration management and infrastructure-as-code tools such as Puppet, Ansible, or Terraform
Knowledge of both self-hosted and cloud environments (preferably Google Cloud Platform)
Ability to work effectively in a globally distributed team structure
Good English skills (B2 or higher) to communicate effectively about technical matters