You will work alongside the engineering, development, and platform teams and be primarily responsible for automating the administration of our hybrid cloud services. You will bridge the gap between development, operations, and infrastructure. This role supports the tasks needed to keep product delivery, availability, and performance optimal.
- BA/BS in CS or related field, or equivalent experience.
- 3+ years of industry experience in a cloud development and operations role.
- Strong coding/scripting skills; Python experience preferred.
- Demonstrated experience with server configuration management (infrastructure automation) and other tools (e.g., Ansible, Puppet, CloudFormation/Terraform, Packer, Istio, Prometheus).
- Knowledge of monitoring tools (e.g., Nagios, Datadog, New Relic).
- Experience developing software in Java, Python, Go, etc.
- Experience with AWS or Azure cloud environments (e.g., EC2, S3, IAM, RDS).
- Experience with developing and supporting containers and orchestration using Docker Swarm/ECS/Kubernetes.
- MySQL or PostgreSQL database administration.
- Experience with continuous monitoring and alerting systems.
- Experience with web server configuration, monitoring, trending, network design, and high availability.
- Solid understanding of fundamental technologies like TCP/IP, HTTP, and DNS.
- 5+ years of AWS experience, including EC2 setup, IAM integration, GuardDuty, CloudFormation, CloudWatch, and CloudTrail
- 5+ years of experience with network design, implementation, and optimization
- Experience with routing, firewalls, IP addressing, ports, DNS, certificates, CIDR, VPNs, security, troubleshooting, and problem resolution
- Understanding of service mesh and event mesh architectures
- Infrastructure as Code experience: automate everything, keeping configuration and code automated and repeatable, whether deploying resources to new environments or increasing the capacity of an existing system to cope with extra load
- Understanding of distributed/versioned configurations, service registration and discovery, routing, service-to-service calls, load balancing, FaaS, circuit breakers, global locks, and distributed messaging
- Familiarity with backup and recovery software and methodologies
- Familiarity with enterprise-grade data center and cloud platform operations and production support
- Must be able to communicate effectively and clearly in verbal and written form
- AWS, Azure, or Google Cloud certification is preferred
- Excellent executive-level verbal and written communication and presentation skills
- Professional demeanor
- Work closely with developers in supporting new features and services.
- Scale cloud infrastructure to meet demand.
- Build tools to monitor site stability and performance.
- Debug production issues across services and levels of the stack.
- Respond to incidents and drive changes that prevent the same issue from recurring.
- Document system design and procedures.