Data Engineer

Engineering Porto, Portugal


The role

We are looking for a Software Engineer to focus on data pipelines, assembling large and complex data sets that meet functional and non-functional business requirements on a next-generation platform. You should be motivated to evolve our e-commerce platform by defining its future and building a flexible, high-performance tech platform for machine learning-based services.

You will work in a very friendly environment as part of a motivated, multicultural, talented and growing team of Software Engineers, QAs, Data Scientists and Data Analysts, helping to build and optimize our data-driven products through research and experimentation in a big data context.

If you love to learn and expand your competencies in a data-driven philosophy, if you are willing to share knowledge and would love to be part of the building process to reach the top, using the latest technology stack and having fun doing it, this is an opportunity you can’t miss.

What you’ll do

  • Design and build scalable and reliable data processing pipelines for our machine learning-based services using state-of-the-art technologies (Cassandra, Apache Beam, Apache Spark and the Hadoop ecosystem, Apache Kafka, Elasticsearch, MongoDB, etc.);
  • Constantly evolve data models & schema design of our online (interactive) and batch platforms;
  • Work with the team to set and maintain standards and development practices.

Who you are
  • A Python developer with a focus on data processing libraries (e.g., Pandas, Scikit-learn, Gensim, Keras, Airflow, TensorFlow) but open to eventually using other languages and platforms;
  • Experienced in designing and running batch/stream data pipelines; 
  • Someone who has worked with cloud-based data engineering platforms such as Google Cloud Platform (e.g., Cloud Dataflow, Cloud Dataproc, Cloud Pub/Sub) or Azure (e.g., Databricks, Data Factory, HDInsight, Stream Analytics, Data Lake Storage), among others (not mandatory);
  • Comfortable dealing with trade-offs involving latency, throughput, and transactions;
  • A person who stays on top of the best practices of modern software and data engineering;
  • A keen advocate of quality and continuous improvement;
  • Someone who is autonomous and able to make important technical decisions that will have a positive impact on our platform;
  • Someone interested in large-scale systems and passionate about solving complex problems;
  • Not afraid of failing: we challenge assumptions and push boundaries, and our culture of experimentation and learning from our mistakes is how we continuously improve!