Site Reliability Engineer - Budapest
Ekata provides global identity verification via enterprise-grade APIs and a SaaS solution. Our product suite is powered by Ekata Identity Engine, the first and only cross-border identity engine of its kind. It uses complex machine learning algorithms across the five consumer attributes of email, phone, name, physical address, and IP to derive unique data links and features from billions of real-time transactions within our customer network and the globally sourced data of our graph. Businesses around the world including Alipay, Stripe, Airbnb, and Microsoft leverage our solutions to approve more good transactions, reduce friction, and find fraud.
The Site Reliability Engineer manages our production environment, providing a highly available and scalable platform for Ekata to serve our customers. The Infrastructure team serves as a resource for Engineering, helping to diagnose production issues and providing guidance on improving the availability and performance of our applications. This position also develops systems, automation and tools that make it easier for Engineering teams to deploy services in a fast, automated and reliable fashion.
In the Site Reliability Engineer role you will:
- Build, scale and support high availability Ubuntu Linux production and development systems in a public cloud environment
- Develop and deploy tools and automation to replace manual tasks and improve efficiency
- Improve security practices and procedures with the Infrastructure team
- Manage Kubernetes clusters for container orchestration and AWS automation
- Collaborate with Engineering to help them deploy systems that are highly available, secure and performant
- Ensure methods are well defined for backing up critical data. Work with Engineering teams to make sure backups are taking place
- Participate in the on-call rotation
- Manage load balancing platforms
- Manage security and availability monitoring for all services
- Maintain quality documentation for systems owned by the Infrastructure team
- Use monitoring tools to identify and resolve emerging issues before they cause failures
- Help other teams troubleshoot and solve failures and performance problems
- Ensure security policies and procedures are consistently implemented to secure production data
- Participate in code reviews with the Infrastructure team
- Ensure the development and maintenance of standards and procedures that result in an environment compliant with information security policy
Our ideal Site Reliability Engineer will have:
- At least some experience with Ansible or other configuration management tools
- Proven skills with Linux or UNIX systems and related protocols/software, with 3+ years’ experience
- A command of Linux systems including troubleshooting, memory management, tuning, I/O subsystem, RAID, and security
- Experience with Jenkins or other CI/CD tools
- Programming aptitude in Ruby, Python, Go, etc.
- Experience with monitoring solutions such as Nagios, Prometheus, or Zabbix
- Working knowledge of database systems such as MySQL or PostgreSQL
- A 4-year engineering degree or equivalent
- Experience with Docker
- Excellent written and spoken English skills
Unwavering in our pursuit of standardizing global identity data, we are approachable, real people who genuinely care about the success of those we partner with. With a commitment to service, innovation, and ownership, Ekata is a dynamic place to work for folks who want to make an impact on a global scale. We provide learning & development opportunities for each employee and promote work-life flexibility through self-managed time off. Headquartered in downtown Seattle, Ekata is growing internationally with offices in Budapest, Hungary, and Amsterdam, and soon in Singapore.
To learn more about the experience of working at Ekata, visit: https://ekata.com/careers/.
Ekata prides itself on celebrating diversity and inclusivity and on being an equal-opportunity employer.