Data Engineer

Information Technology | Oklahoma City, Oklahoma

Job Summary

The Data Engineer is an IT professional whose primary focus is data. The ideal candidate enjoys working closely with the business to design data tools and solutions that enhance Continental’s analytics capabilities. This role works closely with Continental Data Scientists, business SMEs, and IT professionals to architect, model, expose, optimize, and expand data consumption across Continental’s big data infrastructure. As a Data Engineer, you will be expected to recommend, and sometimes implement, ways to improve data reliability, efficiency, and quality. This involves using a variety of languages and tools to integrate disparate systems and hunting down opportunities to acquire new data from other systems so that, for example, system-specific codes can become “information” for further processing by Data Scientists.

Duties/Responsibilities

  • Create, maintain, and optimize data pipelines (see the sketch following this list).
  • Develop and implement a data management strategy for the life of the asset.
  • Break down business problems into solvable components to recommend solutions.
  • Perform rapid, hands-on development of new data and analytics prototypes.
  • Create data tools to efficiently access data from multiple sources.
  • Optimize SQL queries used in analytics and machine learning models as they are deployed into production.
  • Perform root-cause analysis on internal and external data and systems to explain variances in analytics and failures of models, while recommending and implementing improvements to remediate future occurrences of common issues.
  • Coordinate closely with other IT professionals to manage architecture and platforms.
  • Research, experiment with, and apply leading big data technologies, such as Hadoop, Spark, Redshift, SAP HANA, and Amazon Web Services.
  • Translate advanced business analytics problems into technical approaches that yield actionable recommendations across multiple, diverse domains; communicate results and educate others through the design and delivery of insightful visualizations, reports, and presentations.
  • Collaborate with customers, application owners, vendors, and other team members as necessary on projects, tasks, and customer support.
  • Perform advanced troubleshooting and root-cause analysis to expedite incident resolution and, when necessary, architect new scalable data storage solutions.
  • Develop skills in business requirements capture and translation, hypothesis-driven consulting, work-stream and project management, and client relationship development.
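
For illustration only: the pipeline and data-integration duties above might, under assumed inputs, look like the following minimal PySpark sketch. It reads from two hypothetical sources, joins them, decodes a system-specific status code into readable information, and writes partitioned Parquet output. Every path, table, and column name here is an invented example, not an actual Continental system.

    # Minimal PySpark pipeline sketch; all paths and column names are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("well-readings-pipeline").getOrCreate()

    # Source 1: reference data exported from a system of record as CSV.
    wells = spark.read.csv("/data/raw/wells.csv", header=True, inferSchema=True)

    # Source 2: sensor readings already landed as Parquet files.
    readings = spark.read.parquet("/data/raw/readings/")

    # Decode a system-specific code into information usable by Data Scientists.
    status_labels = {"P": "producing", "S": "shut_in", "A": "abandoned"}
    label_status = F.udf(lambda code: status_labels.get(code, "unknown"))

    curated = (
        readings.join(wells, on="well_id", how="left")
                .withColumn("status", label_status(F.col("status_code")))
                .withColumn("read_date", F.to_date("read_ts"))
    )

    # Partitioning by date lets downstream queries prune partitions efficiently.
    curated.write.mode("overwrite").partitionBy("read_date").parquet(
        "/data/curated/well_readings"
    )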

Skills

  • Strong communication skills, both written and oral
  • Working understanding of Hadoop (preferably Cloudera) and its associated toolset (see the sketch following this list):
    • Hive (internal/external tables stored as Parquet), HBase, Kudu, Impala, Spark, Sqoop, Oozie, Solr, cluster design and management, Navigator, Sentry, and Cloudera Manager
  • Proven working experience as a data analyst or business data analyst
  • Technical expertise regarding data models, database design and development, data mining, and segmentation techniques
  • Strong knowledge of and experience with reporting packages (Business Objects, Power BI, etc.) and databases (SQL Server, Oracle, etc.)
  • Knowledge of statistics and experience using statistical packages for analyzing datasets (Excel, SPSS, SAS, R, Python, etc.)
  • Adept at writing queries, building reports, and presenting findings
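
As a point of reference for the Hive and Parquet items above, here is a minimal sketch, under assumed names, of defining and querying an external Parquet-formatted Hive table through Spark SQL. The database, table, and path (analytics.production_readings, /data/curated/well_readings) are hypothetical, and the snippet assumes a configured Hive metastore.

    # Sketch: an external, Parquet-backed Hive table queried via Spark SQL.
    # All object names below are hypothetical examples.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName("hive-parquet-example")
        .enableHiveSupport()  # requires a Hive metastore to be available
        .getOrCreate()
    )

    # External table: Hive tracks metadata only; the Parquet files stay in place.
    spark.sql("""
        CREATE EXTERNAL TABLE IF NOT EXISTS analytics.production_readings (
            well_id STRING,
            oil_bbl DOUBLE,
            gas_mcf DOUBLE
        )
        PARTITIONED BY (read_date DATE)
        STORED AS PARQUET
        LOCATION '/data/curated/well_readings'
    """)

    # Filtering on the partition column allows partition pruning, one of the
    # simpler query optimizations referenced in the duties above.
    monthly = spark.sql("""
        SELECT well_id, SUM(oil_bbl) AS oil_bbl
        FROM analytics.production_readings
        WHERE read_date >= DATE'2024-01-01' AND read_date < DATE'2024-02-01'
        GROUP BY well_id
    """)
    monthly.show()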

Qualifications

  • Bachelor’s degree from an accredited college/university in Computer Science, Computer Engineering, or a related field and a minimum of four years of big data experience with multiple programming languages and technologies; a Master’s degree with two years of relevant experience; or a PhD with one year of relevant experience
  • Fluency in several programming languages such as Python, Scala, or Java, with the ability to pick up new languages and technologies quickly; understanding of cloud and distributed systems principles, including load balancing, networks, scaling, in-memory vs. disk, etc.; and experience with large-scale, big data methods, such as MapReduce, Hadoop, Spark, Hive, Impala, or Storm
  • Ability to work efficiently in a Unix/Linux or .NET environment, with experience using source code management systems such as Git
  • Ability to work with team members and clients to assess needs, provide assistance, and resolve problems, drawing on strong problem-solving skills, verbal and written communication, and the ability to explain technical concepts to business audiences