Python Data Engineer

02 Data Management Tysons Corner, Virginia


Description

Our client’s Investments & Capital Markets Division is seeking a Senior Data Engineer consultant who enjoys working with data and building data storage platforms from the ground up. The ideal candidate has a passion for data analysis, technology, and helping people leverage technology to transform their business processes and analytics. As a Data Engineer, you will be part of a team responsible for supporting a wide range of internal customers. You will draw on all the skills in your toolkit to analyze, design, and develop data storage and data analytics solutions, using data lake patterns, that help our customers run more effective operations and make better business decisions.

Your work falls into two primary categories:

Strategy Development and Implementation 

  • Develop data filtering, transformation and loading requirements
  • Define and execute ETLs using Apache Spark on Hadoop, among other data technologies
  • Determine appropriate translations and validations between source data and target databases
  • Implement business logic to cleanse & transform data
  • Design and implement appropriate error handling procedures
  • Develop project, documentation and storage standards in conjunction with data architects
  • Monitor performance, troubleshoot and tune ETL processes as appropriate using tools such as those in the AWS ecosystem
  • Create and automate ETL mappings that move loan-level data from source applications to target applications
  • Execute the end-to-end implementation of the underlying data ingestion workflow

Operations and Technology 

  • Leverage and align work to appropriate resources across the team to ensure work is completed in the most efficient and impactful way
  • Understand the capabilities of, and current trends in, the data engineering domain

Qualifications

  • At least 5 years of experience developing in Python
  • At least 4 years of experience in developing Apache Spark applications
  • Bachelor’s degree with equivalent work experience in statistics, data science or a related field.
  • Experience working with different databases and an understanding of data concepts (including data warehousing, data lake patterns, structured and unstructured data)
  • 3+ years of experience with data storage/Hadoop platform implementation, including 3+ years of hands-on experience implementing and performance-tuning Hadoop/Spark deployments
  • Implementation and tuning experience specifically using Amazon Elastic MapReduce (EMR)
  • Experience implementing AWS services in a variety of distributed computing and enterprise environments
  • Experience writing automated unit, integration, regression, performance and acceptance tests
  • Solid understanding of software design principles 

Key to success in this role 

  • Strong consultation and communication skills
  • Ability to collaborate across the team, including where silos exist
  • Deep curiosity to learn about new trends and how to do things better
  • Ability to use data to help inform strategy and direction 

Top Personal Competencies to possess 

  • Seek and Embrace Change – Continuously improve work processes rather than accepting the status quo
  • Growth and Development – Know or learn what is needed to deliver results and successfully compete 

Preferred Skills

  • Understanding of Apache Hadoop and the Hadoop ecosystem. Experience with one or more relevant tools (Sqoop, Flume, Kafka, Oozie, Hue, Zookeeper, HCatalog, Solr, Avro).
  • Deep knowledge of Extract, Transform, Load (ETL) and distributed processing techniques such as MapReduce
  • Experience with columnar databases such as Snowflake and Redshift
  • Experience in building and deploying applications in AWS (EC2, S3, Hive, Glue, EMR, RDS, ELB, Lambda, etc.)
  • Experience with building production web services
  • Experience with cloud computing and storage services
  • Knowledge of the mortgage industry

About RiskSpan

RiskSpan is both a product company and a management consulting firm, and a leading source of analytics, modeling, data and risk management for the Consumer and Institutional Finance industries. We solve business problems for clients such as banks, mortgage-backed and asset-backed securities issuers, equity and fixed-income portfolio managers, servicers, and regulators that require our expertise in the market risk, credit risk, operational risk and information technology domains. Our focus is on fostering a high-performance culture with work-life balance, one that develops a top-notch talent pool with the skills and determination to deliver above and beyond.