DevOps Engineer (SRE)

Engineering Pune, Maharashtra


Description

  • Technology Skills Required for SRE:
    AWS (must have): VPC, EC2, Load Balancers, Auto Scaling, EBS, Kinesis, S3, Lambda, CFT, CloudWatch, EKS, ECS
    OS: Primarily Linux and partly Windows Servers
    Scripting: PowerShell, Python, Bash or Ruby
    Monitoring: Should have experience with monitoring tools like DataDog, SumoLogic, Application Insights, OMS, Logic Monitor, etc.
    Patch Management: Should have experience with patch management
    Container Orchestration: Docker or Kubernetes knowledge

    Experience/Background: 2-4 years in a role of SRE Engineer or Support
    Expert for a SaaS solution with the above-mentioned technologies.

    Certifications: Should have certifications in 2 or more of the related technologies.
    Qualification: BE/BTech/MCA

    Other Important Requirements:

  • Other skills:
    Must have: Excellent communication skills.

    Ready to work in a 24x7 rotational shift model.

    Should also be available on call during weekly offs or on public holidays.



    The team this role belongs to is expected to be available 24x7 (on call
    and in the office), as it manages global production and customer-facing,
    highly critical systems. Hence the individual should be flexible in
    adapting to roster/shift arrangements as required. Additional
    compensation will be provided, as per company policy, for working shifts
    outside normal business hours in the IST timezone.

  • Role in Project:
    SRE Primary responsibilities include:
    ·   Live with a 24x7 availability mindset to ensure 99.97% uptime on customer platforms and applications.
    ·   Own, resolve and restore major technical issues to meet the uptime commitment. Expected to be available on-call any time (24x7) for
    ·   Develop, deploy and continually improve the telemetry, monitoring and automation (self-heal, self-help, self-service) of the SaaS platform and the applications
    ·   Ensure the Cloud Infrastructure, platform components and applications are secured and safeguarded via strong controls, monitoring and security incident management 
    ·   Own Root Cause Analysis of incidents end to end and demonstrate quantifiable technological, stability and process improvement of Customer Infrastructure, SaaS platforms and applications
    ·   Enable technology support teams, customers and business users by building and continually developing a knowledge base driven by analysis of practical usage, issues and related challenges.
    ·   Will be the highest-level technical escalation point and act as guide, coach and mentor for first- and second-level Application/Infrastructure support teams. Should be the bridge between Support and Product Engineering teams and proactively face customers and business users as and when required.
    ·   Own and drive the end-to-end technical resolution of critical incidents, which might need involvement from multiple parties, and ensure the right collaboration and communication are maintained so that the issue is resolved quickly through the shortest and most efficient path.

  • Project/Work Details:
    Building monitors and alerts
    Patch management
    Working on customer internal and production support requests
    Working on DevOps and AWS infrastructure-related technical project tasks