Data Engineer - Big Data

Engineering Porto, Portugal


Description

THE ROLE

We are looking for someone to join our Business Intelligence & Analytics team. In this position, you will be in charge of developing high-performance distributed computing tasks using Big Data technologies such as Hadoop, NoSQL and other distributed-environment technologies, based on the needs of the organization. You will also be responsible for analyzing, designing, programming, debugging and modifying software enhancements and/or new products used in distributed, large-scale analytics solutions.


WHAT YOU'LL DO

  • Design and develop highly scalable, end-to-end processes to consume, integrate and analyze large-volume, complex data from sources such as Hive, Flume, Kafka or Storm;
  • Provide Data Engineering expertise to multiple teams across our organization. Provide guidance and support to software engineers on industry and internal data best practices;
  • Build fault-tolerant, adaptive and highly accurate data computational pipelines. Tune queries that run over billions of rows of data in a distributed query engine;
  • Research and implement new data technologies as needed;
  • Work with other teams to understand needs and provide solutions;
  • Find innovative solutions through a combination of creative thinking and deep understanding of the problem space;
  • Work with the Business Intelligence development team to migrate existing SQL Server-based ETLs to MapReduce and Hive (Cloud) technology and improve them to achieve scale and performance;
  • Help define and implement new processes on the data warehouse platform and work closely with Data Scientists to transform big data into model-ready forms to support analytic projects.

 

WHO YOU ARE

  • Experienced in working with large data sets (both structured and unstructured) using technologies such as MapReduce, Hadoop, HBase, Hive, Spark and NoSQL;
  • A strong programmer with a background in languages such as Java, C++ or Python;
  • Knowledgeable about distributed systems;
  • A professional with a background in cloud environments such as AWS, Rackspace or Azure;
  • Experience with real-time analysis of sensor and other data from Internet of Things (IoT) or other connected devices is a plus;
  • Excellent at grasping algorithmic concepts in computer science (e.g., sorting, data structures);
  • Experienced in the design, development and release of enterprise-scale applications;
  • Experienced with version control;
  • A team player with analytical and creative problem-solving abilities.