DevOps (SRE) Engineer

Product | Brno, Czechia


Description

Site Reliability Engineer

Brno, Czech Republic

 

The Opportunity: 

Anthology delivers education and technology solutions so that students can reach their full potential and learning institutions thrive. Our mission is to empower educators and institutions with meaningful innovation that’s simple and intelligent, inspiring student success and institutional growth.

 

The Power of Together is built on having a diverse and inclusive workforce. We are committed to making diversity, inclusion, and belonging a foundational part of our hiring practices and who we are as a company.

 

For more information about Anthology and our career opportunities, please visit www.anthology.com. 

 

As a member of the Site Reliability Engineering team, you will combine software and systems engineering to help build and run large-scale, distributed, and fault-tolerant systems. This is a driven, creative, and energetic team that works in a flexible and agile fashion to deliver world-class products to the education market. You will become a core contributing member of the Site Reliability Engineering team, delivering e-learning services to more than a thousand customers with almost 4 million users worldwide.

 

Specific responsibilities will include:

  • Engaging with development teams on the design, deployment, capacity needs and operations of microservices, and supporting them as they transition to production
  • Monitoring the availability, performance, and health of production systems to meet service level objectives
  • Using automation and tooling to continuously improve the reliability, scalability, and velocity of services deployed on AWS
  • Providing support for issues escalated by the Customer Engagement Support team and interfacing with development teams to hand off application issues
  • Participating in on-call rosters for emergency incident response, and practicing blameless postmortems that lead to improvements in resiliency and reductions in pager fatigue

 

The Candidate:

Required skills/qualifications:

  • 2-5 years of relevant professional experience
  • Experience in Computer Science, Software Engineering, or a related field
  • Expertise in analyzing and troubleshooting large-scale, multi-region deployments in a public cloud (e.g. AWS)
  • Experience with deployment and management tools (e.g. AWS CloudFormation)
  • Experience with monitoring and alerting tools (e.g. AWS CloudWatch, New Relic, PagerDuty)
  • Demonstrable scripting experience, preferably in Python 
  • Ability to solve complex problems, optimize code, and automate routine tasks
  • Fluency in written and spoken English

 

Preferred skills/qualifications:

  • Degree in Computer Science or related field, or equivalent work experience
  • Experience with containerization and infrastructure-as-code provisioning tools, including AWS CDK, and working experience with GitHub
  • Experience with relational databases (SQL), ideally Snowflake
  • Experience with network and/or application security
  • Prior experience within the education industry and/or with e-learning technologies

 

This job description is not designed to contain a comprehensive listing of activities, duties, or responsibilities that are required. Nothing in this job description restricts management's right to assign or reassign duties and responsibilities at any time.

 

Anthology is an equal employment opportunity/affirmative action employer and considers qualified applicants for employment without regard to race, gender, age, color, religion, national origin, marital status, disability, sexual orientation, gender identity/expression, protected military/veteran status, or any other legally protected factor.