Tech Lead Application Operations Engineer
- Acts as a service domain expert. Understands the behaviors and functions of all applications and DB services used in the application stack.
- Performs detailed analysis of issues within the production and pre-production environments, provides recommendations, and works with development teams on how best to deliver services for optimized operational support.
- Ensures Operational Supportability standards are adhered to during the development process; works with dev teams to fulfill those requirements.
- Assists the Technical Support and Operations (NOC) teams with customer escalations and operational issues, providing insight into advanced questions and solutions. Acts as a point of escalation for both teams for product-specific issues. Trains other teams on core support and operational tasks needed to provide operational support in a 24/7 environment.
- Creates bug reports and application optimization recommendations for the Engineering teams.
- Responds to infrastructure escalations impacting the production environment.
- Root-causes complex scaling and performance problems that involve multiple stakeholders and span network, hardware, and software.
- Ensures proper monitoring, alerting, capacity planning, and reporting in the production environment.
- Works closely with Engineering, QA, DevOps, and Network Operations teams.
- Automates manual processes and optimizes the service delivery of products offered to customers.
Skills required for an effective SRE:
- An ideal candidate will have a combined skill set of technical aptitude, superior troubleshooting, and excellent communication/collaboration skills.
- Knowledge and proficiency with a variety of Ops and Automation tools
- Strong troubleshooting skills with the ability to analyze data from multiple sources (logs, system, application)
- Great at writing scripts for automation
- Strong experience working in public cloud environments (AWS) and working knowledge of troubleshooting issues with AWS services
- Basic to intermediate understanding of networking concepts (subnets, IP routing, ACLs)
- Capable of writing SQL queries and performing data analysis against various data sources (SQL, DynamoDB)
- Experience using application monitoring systems (BMC, CloudWatch)
- Strong people/communication skills that allow the SRE to interact with in-person and remote teams.
- History of documenting complex systems (network diagrams, bug reports, application specifications)
NICE is committed to providing an environment based on equal opportunity for all qualified applicants and employees. It is the policy of NICE to afford equal employment opportunities to qualified individuals, regardless of age, race, color, creed, religion, citizenship, ancestry, national origin, sex, gender, pregnancy, mental or physical disability, marital status, veteran status, service in the Armed Forces, sexual or affectional orientation, atypical hereditary cellular or blood traits, genetic information, status as a victim of domestic or sexual violence, and/or any other status protected by any applicable federal, state and/or local statute or regulation.