Senior DevOps Engineer
Description
Company Overview:
Lean Tech is a rapidly expanding organization situated in Medellín, Colombia. We pride ourselves on having one of the most influential networks within software development and IT services for the entertainment, financial, and logistics sectors. Our projected growth offers professionals many opportunities to advance their careers. Joining our team means engaging with expansive engineering teams across Latin America and the United States and contributing to cutting-edge developments in multiple industries.
We are seeking a highly skilled and experienced engineer to join our platform team and provide leadership in designing, scaling, and optimizing our SaaS platform infrastructure, ensuring reliability, performance, and seamless operations across the system.
Position Title: Senior Platform/DevOps Engineer
Location: Remote - LATAM
What you will be doing:
In this role, you will design, optimize, and scale the infrastructure that powers our AI-focused SaaS platform for enterprise customers. You will lead the implementation and maintenance of AWS infrastructure using Infrastructure as Code, manage containerized applications on AWS ECS and Kubernetes, and design deployment strategies with Helm charts, operators, and GitOps workflows to ensure high availability, scalability, and efficient resource utilization. You will implement and maintain API gateway integrations to support seamless metering and monetization of AI/API traffic, optimize CI/CD pipelines for rapid, reliable deployments, and establish monitoring, logging, and alerting strategies to ensure observability and rapid incident resolution. Additionally, you will enhance system performance across distributed architectures including Flink, Kafka, and Redis, implement security best practices, and develop automation tools to streamline operational workflows. Collaborating closely with cross-functional engineering teams, you will have significant ownership in evolving platform architecture, driving reliability, performance, and innovation at the intersection of AI and usage-based financial technology.
- Design, implement, and maintain AWS infrastructure using Infrastructure as Code (Terraform) to support our multi-tenant SaaS platform across development, staging, and production environments.
- Manage and optimize containerized applications on AWS ECS and design Kubernetes deployment strategies, including Helm charts, operators, and GitOps workflows.
- Ensure efficient resource utilization and auto-scaling for JVM-based Spring Boot applications and Node.js services, and assist with migrating and implementing Kubernetes clusters for core platform services.
- Implement and maintain API gateway integrations (Azure, AWS, MuleSoft, Kong, Gravitee, Envoy) to support metering and monetization of AI/API traffic, collaborating with engineering teams on plugin and policy deployment.
- Enhance and maintain CI/CD pipelines (CircleCI) to enable rapid, reliable deployments with zero downtime, including rolling updates, blue-green deployments, and automated rollback strategies.
- Implement comprehensive monitoring, logging, and alerting using CloudWatch, OpenTelemetry, and other tools to ensure rapid incident detection and resolution.
- Optimize system performance across distributed architectures, including Apache Flink, Kafka streaming (MSK), and Redis caching layers, ensuring high availability and disaster recovery.
- Implement and maintain security best practices, including network isolation, secret management, and runtime security scanning, while supporting compliance requirements for enterprise customers.
- Develop automation scripts and tools to streamline operational tasks, reduce manual effort, and improve overall team productivity.
- Collaborate with cross-functional teams to evolve distributed systems, API monetization frameworks, and platform architecture, taking ownership of high-impact technical initiatives.
Requirements & Qualifications
To excel in this role, you should possess:
- 5+ years of professional experience designing, implementing, and maintaining cloud infrastructure (preferably AWS) for SaaS platforms.
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field, or equivalent experience.
- Extensive experience
- Strong proficiency with Infrastructure as Code tools, such as Terraform, for multi-environment deployments.
- Deep experience with containerization and orchestration (Docker, Kubernetes), including deployment strategies and cluster management.
- Expertise in managing and scaling distributed systems, microservices architecture, and API-driven platforms.
- Proven experience designing and maintaining CI/CD pipelines with automated testing, deployment strategies, and rollback procedures.
- Strong knowledge of monitoring, logging, and observability tools (e.g., CloudWatch, OpenTelemetry) for high-availability systems.
- Experience optimizing system performance across complex architectures, including messaging systems (Kafka), streaming pipelines (Flink), and caching layers (Redis).
- Solid understanding of security best practices, including network isolation, secret management, runtime security, and enterprise compliance requirements.
- Proficiency in automation scripting to improve operational efficiency and reduce manual intervention.
Nice to Have
- Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD)
- Experience with GitOps tools (ArgoCD, Flux)
- Experience with service mesh technologies (Istio, Linkerd)
- Experience with Apache Flink for stream processing
- Familiarity with eBPF for system-level monitoring and security
- Knowledge of AI/ML infrastructure and GPU workload management
- Experience with vector databases (Pinecone, Weaviate, etc.)
- Understanding of FinOps principles and cost optimization strategies
- Experience with multi-tenant SaaS architectures
- Previous experience with usage-based billing or metering systems
- AWS certifications (Solutions Architect, DevOps Engineer)
- Experience with Gradle build systems
- Familiarity with Kotlin and/or Python
- Understanding of HATEOAS and REST API best practices
- Experience with payment processing systems (Stripe integration)
- Contributions to open-source infrastructure projects
Soft skills
- Good English communication skills (written and verbal) are a must.
- Transparent and proactive communicator, especially in reporting blockers or status.
- Self-sufficient and able to deliver tasks with minimal supervision.
- Effective Communication: Articulate complex technical concepts clearly and transparently, facilitating smooth collaboration within the team and with stakeholders.
- Problem Solving: Proactively identify challenges and implement solutions, demonstrating a strong sense of ownership and accountability for deliverables.
- Team Collaboration: Work harmoniously with team members, fostering a respectful and inclusive environment that values diverse perspectives.
- Adaptability: Thrive in a fast-paced, evolving environment, efficiently managing priorities and embracing new technologies and processes.
- Integrity: Uphold ethical principles and honesty in all interactions, aligning with company values and culture.
Why you will love Lean Tech:
- Join a powerful tech workforce and help us change the world through technology
- Professional development opportunities with international customers
- Collaborative work environment
- Career paths and mentorship programs that help you reach the next level in your career
Join Lean Tech and contribute to shaping the data landscape within a dynamic and growing organization. Your skills will be honed, and your contributions will be vital to our continued success. Lean Tech is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.