Site Reliability Engineer III
Designs, develops and maintains underlying systems and processes required to support software platforms.
In concert with platform development teams, is responsible for the overall technical design and operations of platforms.
Develops and maintains internal software, implements tooling that improves efficiency for platform development teams, and automates manual operations, support and deployment tasks.
Ensures continuous, high-velocity delivery and automated deployment through the use of software provisioning, configuration management, source code management and/or team collaboration applications.
You will be accountable to:
- Assist in the design, development, deployment and maintenance of complex IT solutions to make systems more performant and cost-effective.
- Develop effective tooling, alerts, and automated responses to identify and address reliability risks; create and manage a roadmap to ensure environments are kept up to date.
- Develop the software and processes needed to maintain services.
- Create and deploy automation, alerting, self-healing and other technologies to make the environment more maintainable.
- Set up and configure data application servers according to vendor requirements and Moneris security policy.
- Provide input on developing solutions to assigned business requirements.
- Develop, standardize and maintain system and support documentation (including but not limited to operational guides, network diagrams, access management, infrastructure design, test scripts and ETL flows).
Your experience includes:
- Bachelor's degree or equivalent work experience required.
- Minimum 2 years of experience with enterprise-grade infrastructures, operations and/or systems engineering.
- Proficient in computer applications, server configurations, networking systems, database administration, data science tools and technologies, and large-scale distributed systems and processes.
- Strong knowledge of major operating systems, such as Linux, and their administration, as well as of networking, load balancing, protocols such as TCP/IP and services like DNS.
- Proficient in leading projects or project steps and communicating progress/approach with technical and non-technical peers/clients.
- Knowledge of Ansible and its uses for provisioning, configuration management, and application deployment.
- Strong knowledge of scripting languages such as Bash and Python.
- Proficiency in payment systems and the merchant acquiring business is an asset.
- Experience with Apache Kafka / Confluent Kafka.
- Availability to provide after-hours support for critical production platforms.
- Familiarity with DevOps engineering practices.
Note: We welcome and encourage applications from Indigenous people, people of colour, people with disabilities, and people of all genders, sexual orientations and intersectional identities.