This role will suit an all-round Data Engineer experienced in every step of the data flow, from configuring data sources to integrating analytical tools. The data engineer is a technical person who will contribute to architecting, building, testing, and maintaining the data platform as a whole, delivering accurate data reliably within the required processing time, ready for consumption by analytics applications.
You will work across multiple business teams to implement new systems and improve existing ones for:
- Extraction: Extracting data from current sources.
- Storage: Storing and transferring all data gathered for analytical purposes.
- Transformation: Cleaning, structuring, and formatting the data sets to make data consumable for processing or analysis.
- Lead or contribute to designing the architecture of the data platform.
- Develop data systems tools; customize and manage integration tools, databases, warehouses, and analytical systems.
- Data pipeline maintenance/testing.
- Machine learning model deployment: deploy models designed by our data scientists into production environments, managing computing resources and setting up monitoring tools.
- Manage data and metadata storage, structured for efficiency, quality, and performance.
- Track pipeline stability: monitor the overall performance and stability of the systems.
- Keep track of related infrastructure costs and manage these as efficiently as possible, continuously finding the balance between performance and cost.
Personal attributes:
- Dynamic, proactive, and results-oriented, with a can-do attitude; enjoys getting things done.
- Self-starter with the ability to successfully plan, organize, and execute assigned initiatives with minimal guidance and direction.
- Strong listening, problem-solving and analytical skills.
- Ability to conduct systems analysis and prepare requirement specifications concerning data-related business processes and systems.
- Willing to take the initiative.
- Excellent written and verbal communication skills.
- Exhibits close attention to detail and champions accuracy.
- High level of integrity.
- Proven ability to meet deadlines and to multi-task effectively.
- Ability and willingness to take on a variety of tasks.
Preferred qualifications and experience:
- A Bachelor's degree in computer science or an engineering-related field.
- 5-10 years of relevant experience.
- Intermediate to advanced SQL optimization and experience developing ETL strategies.
- Intermediate to advanced knowledge of database and data warehousing principles (e.g. OLAP, data marts, star schemas, lambda/kappa architectures).
- Experience with streaming data pipeline frameworks or solutions (e.g. Apache Flink, Apache Beam, Google Cloud Dataflow, Databricks) is an advantage.
- Knowledge of cloud data platforms is an advantage.
- Experience within the Automotive or manufacturing industry is a plus.
- Experience with agile development methodologies such as Kanban or Scrum.
Preferred technical experience:
- Implementing data pipelines using cloud infrastructure and services.
- Implementing event-based systems using tools like Confluent Kafka or Kinesis.
- Knowledge of CDC pipelines (e.g. binary logs, Debezium, AWS Database Migration Service (DMS)).
- API integration and development using Python (FastAPI, Flask).
- DevOps: Docker (ECS), Kubernetes (EKS), Spark clusters, etc.
- Database analysis, design & administration.
- SQL query optimization and data architecture improvements.
- DB: PostgreSQL, MS-SQL
- ETL: Python, Bash
- Infrastructure: AWS (Kinesis, API Gateway, S3, DMS)
- Dev Tools: Git, Docker, Elastic Container Service (ECS), Elastic Kubernetes Service (EKS)
- OS: Ubuntu, Windows Server