Sr. Manager - SRE Data Services

Technical Operations | Redwood City, California; Portland, Oregon; Atlanta, Georgia; Boston, Massachusetts


Position at Smarsh

Why Smarsh?

From day one, Smarsh was built on a set of core values that have motivated and sustained us: People First, Inspire Confidence, and Embrace the Impossible. We ask that each of our employees, whether new or old, ingrain these values in their day-to-day decisions, call on them while serving our customers and our peers, and apply them when creating the best possible products we can imagine.

Together—as one team—we listen, collaborate and believe that anything and everything is possible.

Data Services Manager, Site Reliability Engineering 

Our Site Reliability Engineering team, part of the Global SaaS Operations organization, is looking for a seasoned and passionate Data Services Architect to help manage exabytes of data across various distributed storage technologies and to drive and enable best practices. The scope of this position is not limited to the Data Services technologies currently in use, such as Ceph, Elasticsearch, MongoDB, and Kafka; it also covers equivalent and emerging technologies that could replace them.

Responsibilities:

  • Manage and guide a group of Data Services engineers/architects to:
    • Provide 24x7 technology coverage to address critical incident escalations
    • Administer and manage large-volume production deployments and their lifecycle, with help from sub-groups within the SRE teams and the Deliver/Release teams
    • Prototype and develop best practices and deployment/operations guidelines for scale-in, scale-out, fault tolerance, and upgrades of these technologies
  • As an individual contributor, assist with:
    • Fresh deployments and upgrades carried out by the Deliver or Release team
    • Production performance issues related to degraded throughput or high latency
    • Gathering actionable insights from production services, such as service usage, resource utilization trends over time, and other useful metrics, to drive sizing, cost, billing, and pricing decisions
  • Establish and maintain a strong working relationship with all team members
  • Successfully manage time and technical responsibilities, set accurate expectations and meet deliverable deadlines while working in a collaborative team environment

Requirements:

  • 4-7 years of Linux systems administration experience, including at least 2-3 years managing large-scale production deployments of two or more technologies such as Elasticsearch (indexing service), MongoDB (NoSQL database), or Ceph (object storage)
  • 2-3 years of experience managing a team of 4-6 engineers
  • Experience in storage growth and data usage trending and analysis
  • Basic scripting experience (shell or Python) to automate the rollout of tunables and configuration and to analyze logs
  • High energy and passion for leveraging technology advances and industry trends
  • Proven experience collaborating with cross-functional teams
  • Excellent attention to detail and complex problem-solving capabilities
  • Excellent interpersonal skills with a demonstrated ability to work in a cross-functional team environment
  • Understanding of Agile methodologies
  • Excellent documentation skills
  • Strong customer focus