Site Reliability Engineer, Windows Operations Automation
Diligent is the world’s largest GRC SaaS provider, serving nearly 1 million users from 25,000 organizations around the world. Our software enables holistic and informed conversations about governance, risk and compliance and ensures CEOs, CFOs and the board have an integrated view of audit, risk, information security, ethics and compliance from across the organization.
Our world-changing idea is to bring technology, insights and confidence to leaders so they can build more effective, equitable, and successful organizations – and create lasting, positive impact on the world. We seek to empower organizations to be better for their stakeholders and communities, for their customers and employees, for their bottom line.
Headquartered in New York, Diligent also has offices in Washington D.C., London, Galway, Budapest, Vancouver, Bengaluru, Munich, and Sydney.
Diligent is looking for a Site Reliability Engineer to join our Center for Global Product Innovation in Budapest. The Site Reliability Engineer will be responsible for supporting multiple global SaaS applications. The ideal candidate will be passionate about automation and have strong troubleshooting skills. We are looking for a team player with excellent communication skills who is committed to continuous improvement and delivering results.
- Maintenance and support of global applications which run on Microsoft Windows within Diligent’s datacenters and in the cloud.
- Support development teams and work with them to help operationalize their features and provide expertise on scalability, high availability, and performance.
- Define an operations playbook that integrates with ServiceNow and our incident management systems.
- Apply administration best practices to SaaS technologies, confidential data storage, and data encryption.
- Day-to-day activities include automation of deployments throughout the software development lifecycle and supporting infrastructure scalability.
- Work with teams to develop Service Level Objectives and Indicators, and improve application performance and availability through log and metric analysis.
- Advanced Windows administration experience, including PowerShell.
- In-depth knowledge of scripting is required; you will be managing infrastructure and applications in 10 different datacenters around the world.
- Cloud experience (AWS or Azure) is a plus.
- Experience with Kubernetes or related containerization software is highly desirable.
- Experience with GitHub.com.
- Experience with Jenkins, Puppet, and Ansible is desirable.
- Experience with SignalFx, the ELK stack, or other monitoring tools is a plus.