Mid to Senior Data Engineer
Description
Company Overview:
Lean Tech is a rapidly expanding organization situated in Medellín, Colombia. We pride ourselves on possessing one of the most influential networks within software development and IT services for the entertainment, financial, and logistics sectors. Our growth projections create many opportunities for professionals to elevate their careers and experience substantial growth. Joining our team means engaging with expansive engineering teams across Latin America and the United States, contributing to cutting-edge developments in multiple industries.
Position Title: Mid to Senior Data Engineer
Location: Remote - LATAM
What you will be doing:
In this pivotal role, you will be responsible for building and maintaining the core data
infrastructure that underpins our performance analytics ecosystem. Your primary focus
will be on designing, developing, and managing scalable ETL/ELT pipelines that
integrate data from multiple operational systems into our centralized Lakehouse, which
utilizes Microsoft Fabric and is transitioning to Snowflake. You will apply advanced skills
in Python, PySpark, and SQL to perform complex data transformations, cleanse data,
and build analytics-ready datasets within a Medallion architecture. A key aspect of this
position involves developing and managing sophisticated data models, including
star-schemas, to prepare optimized backend datasets for Power BI. Working in close
collaboration with BI analysts, developers, and performance managers, you will ensure
the integrity, governance, and reliability of our data assets. This position is central to our
organization's data-driven transformation, with your work directly supporting
enterprise-level dashboards and data pipelines that influence business operations for
thousands of employees and global clients.
● Design, develop, and maintain robust ETL/ELT pipelines to process daily batch
operational data from diverse sources into the centralized Lakehouse (Microsoft
Fabric / Snowflake), following the Medallion architecture (a brief sketch of this
flow follows this list).
● Utilize Microsoft Fabric components (Data Factory, Pipelines, Notebooks) and
Snowflake features (Tasks, Streams) for workflow orchestration and data
transformation (see the Streams/Tasks sketch after this list).
● Implement complex data transformations, cleansing routines, and API
integrations using strong SQL and Python/PySpark skills to build and manage
analytics-ready data models, schemas, and metadata.
● Develop and prepare backend data models for Power BI, focusing on
star-schema design, performance tuning, and configuring incremental refresh
strategies to support BI analysts.
● Enforce data quality and integrity by implementing data validation and monitoring
processes, while supporting governance practices using Microsoft Fabric’s
built-in data catalog and lineage tracking tools (a small validation-gate sketch
also follows this list).
● Optimize data pipelines for performance, scalability, and cost-efficiency, ensuring
reliable and timely data delivery for analytical use cases.
● Collaborate with BI analysts and performance management teams to prepare
clean, governed datasets for enterprise dashboards, predictive analytics, and
machine learning applications.
● Implement data security and privacy best practices in alignment with company
standards, utilizing version control (Git) and agile development methodologies.
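As a rough illustration of the first responsibility above, here is a minimal PySpark sketch of a Medallion-style flow, moving a daily batch from a bronze landing table through silver cleansing into gold star-schema tables. All paths, table names, and columns (bronze_orders, gold_dim_customer, and so on) are hypothetical placeholders, not our actual Lakehouse schema.

# Minimal Medallion-architecture sketch in PySpark (hypothetical names throughout).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("medallion-sketch").getOrCreate()

# Bronze: land the raw daily batch as-is, preserving source fidelity.
bronze = spark.read.json("Files/landing/orders/2024-01-01/")  # hypothetical path
bronze.write.mode("append").saveAsTable("bronze_orders")

# Silver: cleanse and conform -- deduplicate, enforce types, drop bad keys.
silver = (
    spark.table("bronze_orders")
    .dropDuplicates(["order_id"])
    .withColumn("order_date", F.to_date("order_date"))
    .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
    .filter(F.col("order_id").isNotNull())
)
silver.write.mode("overwrite").saveAsTable("silver_orders")

# Gold: shape analytics-ready star-schema tables for Power BI.
dim_customer = silver.select("customer_id", "customer_name", "region").dropDuplicates(["customer_id"])
fct_orders = silver.select("order_id", "customer_id", "order_date", "amount")
dim_customer.write.mode("overwrite").saveAsTable("gold_dim_customer")
fct_orders.write.mode("overwrite").saveAsTable("gold_fct_orders")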
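Likewise, the Snowflake Tasks and Streams orchestration pattern can be pictured with the sketch below, which issues the DDL through the Snowflake Python connector: a Stream captures change rows on a staging table, and a scheduled Task consumes them only when changes exist. Connection parameters, the warehouse, and all object names are placeholders.

# Sketch of Snowflake Streams + Tasks orchestration via the Python connector.
# Credentials, warehouse, and object names below are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="...", warehouse="ETL_WH",
    database="ANALYTICS", schema="SILVER",
)
cur = conn.cursor()

# Stream: track changes landing in the bronze staging table.
cur.execute("CREATE OR REPLACE STREAM orders_stream ON TABLE bronze_orders")

# Task: on a nightly schedule, load any captured changes into silver,
# running only when the stream actually has data.
cur.execute("""
    CREATE OR REPLACE TASK load_silver_orders
      WAREHOUSE = ETL_WH
      SCHEDULE = 'USING CRON 0 2 * * * UTC'
      WHEN SYSTEM$STREAM_HAS_DATA('ORDERS_STREAM')
    AS
      INSERT INTO silver_orders
      SELECT order_id, customer_id, order_date, amount
      FROM orders_stream
""")

# Tasks are created suspended; resume to start the schedule.
cur.execute("ALTER TASK load_silver_orders RESUME")
conn.close()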
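Finally, the data-quality responsibility often takes the shape of a validation gate that runs before a dataset is promoted. The following PySpark sketch, with hypothetical tables and rules, shows the general idea; real checks would be driven by our governance standards.

# Sketch of a pre-publish data-quality gate in PySpark (hypothetical tables/rules).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq-gate").getOrCreate()
df = spark.table("silver_orders")  # dataset about to be promoted to gold

checks = {
    "null_order_ids": df.filter(F.col("order_id").isNull()).count(),
    "duplicate_order_ids": df.count() - df.dropDuplicates(["order_id"]).count(),
    "negative_amounts": df.filter(F.col("amount") < 0).count(),
}

failures = {name: n for name, n in checks.items() if n > 0}
if failures:
    # Block promotion and surface the failure to monitoring/alerting.
    raise ValueError(f"Data quality gate failed: {failures}")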
Requirements & Qualifications
To excel in this role, you should possess:
● 3+ years of experience in data engineering within an enterprise environment.
● Bachelor’s degree in Data Engineering.
● Advanced proficiency in Snowflake, including proven experience with workflow
orchestration using Tasks and Streams.
● Advanced, proven experience with Microsoft Fabric, specifically utilizing Data
Factory, Dataflows Gen2, Pipelines, Notebooks, and the Lakehouse with Delta
tables for data transformation and orchestration.
● Advanced expertise in designing, developing, and maintaining robust ETL/ELT
pipelines for batch processing of large-scale operational data from multiple
sources.
● Strong, advanced-level skills in SQL and Python/PySpark for complex data
transformation, automation, API ingestion, data cleansing, and preparing
analytics-ready datasets.
● Advanced data modeling skills, with a strong understanding of data warehousing
concepts, dimensional modeling, and star-schema design for analytical use
cases.
● Experience supporting Power BI through backend dataset preparation, including
star-schema design, performance tuning, and configuring incremental refresh
strategies.
● Proficiency in data integration using APIs, including REST, JSON, and Bearer
token authentication (see the ingestion sketch after this list).
● Experience implementing data governance practices, with knowledge of data
cataloging, lineage tracking, and impact analysis, particularly within the Microsoft
Fabric ecosystem.
● Familiarity with version control systems such as Git and agile development
methodologies.
● Strong collaboration and communication skills for effective partnership with BI,
operations, and IT stakeholders.
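For candidates wondering what the API-integration requirement looks like in practice, here is a minimal sketch of paged REST ingestion with Bearer-token authentication using Python’s requests library. The endpoint, token handling, and pagination contract (the "next" link) are assumptions for illustration only.

# Sketch of REST ingestion with Bearer-token auth (endpoint and token are placeholders).
import requests

API_URL = "https://api.example.com/v1/orders"  # hypothetical endpoint
TOKEN = "..."  # fetched from a secrets store in practice, never hard-coded

def fetch_pages(url: str, token: str) -> list[dict]:
    """Page through a JSON API until the server stops returning a next link."""
    headers = {"Authorization": f"Bearer {token}", "Accept": "application/json"}
    records: list[dict] = []
    while url:
        resp = requests.get(url, headers=headers, timeout=30)
        resp.raise_for_status()  # fail fast on auth/availability errors
        payload = resp.json()
        records.extend(payload.get("data", []))
        url = payload.get("next")  # hypothetical pagination contract
    return records

rows = fetch_pages(API_URL, TOKEN)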
Nice to Have
● Experience with advanced Power BI development, including writing DAX queries
and performing semantic model optimization.
● Familiarity with machine learning workflows and preparing data for model
deployment.
● Knowledge of performance metrics within the Business Process Outsourcing
(BPO) or logistics industries.
● Experience with other major cloud data platforms, such as Google BigQuery,
Azure Synapse Analytics, or Azure Data Factory.
● Relevant industry certifications in data engineering or cloud platforms from
providers like Microsoft or Snowflake.
Soft skills:
● Effectively partner with BI analysts, developers, performance managers, and IT
stakeholders to ensure data solutions are aligned with business requirements
and to support the preparation of analytics-ready datasets.
● Apply strong analytical skills to design, develop, and optimize complex ETL/ELT
pipelines, addressing challenges related to performance, scalability, and data
quality.
● Demonstrate a meticulous approach to implementing data validation, cleansing,
and monitoring processes, ensuring the delivery of clean, governed, and reliable
data for enterprise use.
● Proactively stay current with new data engineering technologies and agile
development methodologies, bringing forward recommendations to improve our
data infrastructure and practices.
Why you will love Lean Tech:
● Join a powerful tech workforce and help us change the world through technology
● Professional development opportunities with international customers
● Collaborative work environment
● Career paths and mentorship programs that will take your career to the next level
Join Lean Tech and contribute to shaping the data landscape within a dynamic and
growing organization. Your skills will be honed, and your contributions will be vital to our
continued success. Lean Tech is an equal opportunity employer. We celebrate diversity
and are committed to creating an inclusive environment for all employees.