Principal Site Reliability Engineer

Engineering Pune, India


Do you thrive in a fast-paced and dynamic startup environment? Do you like being part of a rapidly growing organization and growing with it? Privacera is a leading provider of software products that Fortune 1000 organizations use to discover, organize, catalog, protect, and govern their cloud and on-premise data assets. At Privacera, you will collaborate with talented colleagues who are passionate about delivering value for enterprise customers. Privacera has recently started its operations in India.

We’re looking for:

As a Principal Site Reliability Engineer (SRE) at Privacera, you will be a foundational part of the team that ensures the reliability, availability, and security of our services and platforms for our customers. You must have a demonstrated extreme-ownership mentality. A successful Principal SRE at Privacera must be a strong coder in Java (preferred) and/or Python, as well as in bash. You will need to quickly become proficient in understanding how each design, component, configuration, and process is linked to form an end-to-end solution. You should have strong experience in deploying and managing first-tier monitoring, logging, and dashboarding platforms.

Your responsibilities (along with your colleagues) will include:  

  • Automating the creation, deployment, testing, securing, and overall management of our infrastructure and services.  This requires an ability to understand key details about our services, the majority of which are written in Java.
  • Developing quality assurance methodologies for your code, including creating and validating your own unit tests.
  • Creating and using modern Continuous Integration/Continuous Deployment (CI/CD) pipelines and tooling, specifically with Cloud-native technologies, and designing the pipelines so that typical engineers can use them at scale.
  • Taking responsibility for ensuring our offerings are secure and compliant with modern frameworks.
  • Resolving issues in our production environments, in most cases without needing to involve other teams.
  • Mentoring junior engineers.
  • Serving in an on-call rotation.
  • Creating root cause analysis (RCA) documentation, and hosting and participating in RCA meetings with multiple stakeholders.
  • Designing and implementing monitoring, logging, and dashboarding platforms across Cloud providers and regions.

Your experience, skills, and capabilities should include:  

  • 15+ years of experience as an SRE, Platform Engineer, full-stack Java developer, or in a similar role
  • 10+ years of experience managing mission-critical web applications at scale
  • 10+ years as a developer, preferably in Java (and/or possibly Python)
  • 8+ years of substantial experience with AWS
  • Very deep experience with various Cloud-native monitoring, logging, and dashboarding platforms (including vendor-specific platforms like CloudWatch and CloudTrail, and third-party platforms like New Relic, FireHydrant, Datadog, PagerDuty, Prometheus, etc.)
  • A strong ability to work solely within an infrastructure-as-code (IaC) framework; in our case, this means intimately knowing Terraform and/or CloudFormation.
  • Strong experience with GitLab pipelines, AWS CodeBuild/CodeDeploy/CodePipeline, etc.
  • Deep understanding of Kubernetes, including but not limited to vendor implementations (e.g., AWS EKS)
  • Excellent verbal and written communication in English. Explaining and documenting are key functions of this role.
  • Experience working in a fast-paced startup environment.
  • B.Tech./M.Tech. in Computer Science and Engineering, MCA, M.Sc. in Computer Science, or equivalent