Tech Ops-Site Reliability Engineer
Description
- Own Splunk Cloud in Microsoft Azure environments and Amazon AWS FedRAMP environments.
- Work across the organization to deliver quality products that delight Splunk's passionate users.
- Lead teams of tight-knit engineers who are building a state-of-the-art, cloud-based environment for massive-scale data processing.
- Mentor and help new engineers to achieve more than they thought possible. You enjoy making other teams successful and are fulfilled through the success of others.
- Must attain the Splunk Cloud Certified Architect certification within the first 12 months of hire date.
- You have experience or an interest in working with regulated computing environments such as FISMA and/or FedRAMP and are enthusiastic about doing it better.
- Experience working within an Azure environment
- Experience working in a fully remote position and team
- You are passionate about building and running distributed systems at scale in production. You understand the challenges and trade-offs to be made when building and deploying systems to production.
- You constantly consider "How can I automate this process?"
- Knowledge of best practices related to security, performance, and disaster recovery.
- Skilled in identifying performance bottlenecks, spotting anomalous system behavior, and determining the root cause of incidents.
- Experience monitoring cloud environments using tools like Splunk, VictorOps and Nagios
- You care about good documentation and appreciate how it allows a distributed team to function.
- Ability to tackle complex problems, resolve operational issues, and interact with vendors to find solutions.
- Comfortable working with critical, customer-facing issues and able to prioritize quickly when escalations happen.
- Deep understanding of Linux systems (network stack, file system, OS services) and networking (L2 vs. L3, network architecture, VLANs, etc.), or equivalent certification.
- Must have AZ-900 Azure Fundamentals or, preferably, AZ-104 Azure Administrator certification.
- You've demonstrated the skills to effectively work across teams and functions to influence design, operations and deployment of highly available software.
- You are interested in working hard to make the users of Splunk's products happier every day.
- Ability to work nights, weekends, and on-call shifts.
- Experience monitoring cloud environments with Splunk.
- Experience with at least one programming language, preferably Go (Golang) or Python. Knowledge of working with and automating Linux system tasks using this language is required, including working with configuration files and system services. Knowledge of common data structures and algorithms, as well as their performance characteristics, is required.
- Experience with large scale distributed cloud service development, infrastructure, traffic management and architecture.
- Experience with distributed architectures/systems with optimized and scalable software that operates on a large number of nodes.
- Familiarity with GitLab, Puppet, Jenkins, clustering, web apps, and YAML.
- Ability to support FedRAMP Moderate environments.
Splunk is an Equal Opportunity Employer: At Splunk, we believe creating a culture of belonging isn’t just the right thing to do; it’s also the smart thing. We prioritize diversity, equity, inclusion, and belonging to ensure our employees are supported to bring their best, most authentic selves to work where they can thrive. Qualified applicants receive consideration for employment without regard to race, religion, color, national origin, ancestry, sex, gender, gender identity, gender expression, sexual orientation, marital status, age, physical or mental disability or medical condition, genetic information, veteran status, or any other consideration made unlawful by federal, state, or local laws. We consider qualified applicants with criminal histories, consistent with legal requirements.
Thank you for your interest in Splunk!