Site Reliability Engineer
Join Absolute’s Cloud Engineering & Hosting Operations teams and be part of our new core infrastructure and cloud initiatives. Our team is building the foundation for the next generation of the company's services on top of Kubernetes and AWS. This is a global, high-throughput set of services that processes data for hundreds of millions of devices per day. Be part of architecting the core application stack for throughput and resilience at Absolute.
Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, distributed, fault-tolerant systems. SRE ensures that all services - both our internally critical and our externally visible systems - have reliability and uptime appropriate to users' needs and a fast rate of improvement while keeping an ever-watchful eye on capacity and performance.
Do you have a passion for our core mission?
- Solve scalability and efficiency challenges in large-scale cloud environments.
- Dig to the root cause of production and scalability issues and ensure they are fixed for life.
- Automate all the things! Be passionate about automation, from infrastructure-as-code through to continuous integration and deployments.
- Build high-availability, high-resilience systems based on Linux and open source technologies.
- Apply security best practices through continuous monitoring, architecture, networking, and automation.
Accountabilities Will Include:
- Manage and develop our production observability infrastructure.
- Support the build-out and ongoing maintenance of core application components such as Kubernetes, Kafka, Elasticsearch, and other highly scalable systems.
- Support the core Linux and on-call teams with incident management, investigation, and remediation of production issues.
- Work with cloud engineering and product development teams to educate on sound operational practices.
- Work with and promote observability at all layers of the infrastructure, from hardware to network to containers to applications.
What You'll Need:
- Experience with the configuration management systems Ansible and Puppet.
- Hands-on technical experience in Kubernetes orchestration.
- Comfort with frequent, incremental code testing and deployment.
- Experience with current observability tools such as Prometheus, Thanos, Grafana, and Jaeger.
- Ability to use a wide variety of open source technologies and cloud services.
- Experience with AWS.
- Solid grasp of network fundamentals and of the building blocks of highly available cloud infrastructure, such as load balancers, network filtering and security, proxies, and service mesh architectures.
Why Work For Us:
Absolute is the new standard for endpoint visibility and control, delivering self-healing endpoint security, always-connected IT asset management, and continuous data visibility, both on and off the network. Unlike other endpoint security agent solutions that can be corrupted, compromised, or deleted, Absolute can heal itself and other critical applications through our patented Persistence technology, which is embedded in the firmware of over 1 billion endpoints. No other security company can make this claim.
Headquartered in Vancouver, Canada, with international offices in Austin, Texas; Reading, UK; and Ho Chi Minh City, Vietnam, we are a collaborative and innovative place to make your mark in the world of security. Our agile, high-energy culture rewards top performance and the contributions of those passionate about our collective growth and success. We celebrate our wins in our large common areas, where we hold engineering hackathons, end-of-quarter celebrations, and monthly socials. We believe in a good work-life balance, which is reflected in our annual employee retreat, where it’s all about friends and family. To learn more about Absolute, visit our website at www.absolute.com or visit our YouTube channel.
Absolute is an equal opportunity employer.