Sr. EDL Hadoop Data Engineer or Sr. Developer

Data Engineering
Temporarily Remote, Ontario, Canada; Halifax, Nova Scotia


Description

Position at Ness Digital Engineering

Required Skills & Experience

Technical Expertise

  • Strong hands-on expertise in the Hadoop ecosystem (HDFS, Hive, Spark, Oozie, YARN, HBase, Kafka, ZooKeeper).
  • Deep understanding of data ingestion, transformation, and storage patterns in large-scale environments.
  • Experience with distributed computing, data partitioning, and parallel processing.
  • Proficiency in SQL, PySpark, Scala, or Java.
  • Familiarity with cloud-native data lakes on AWS (EMR, Glue, S3), Azure (HDInsight, ADLS, Synapse), or GCP (Dataproc, BigQuery).
  • Knowledge of data governance tools (Apache Atlas, Ranger, Collibra) and workflow orchestration tools (Airflow, Oozie).
  • Expertise in data warehousing and ETL processes, including design, development, support, implementation, and testing.
  • Hands-on experience in architecture and design, including requirements analysis, performance tuning, data conversion, loading, extraction, transformation, and job pipeline creation.
  • Strong knowledge of the retail domain and experience with the stages of data warehouse projects, including data extraction, cleansing, aggregation, validation, transformation, and loading.
  • Experience using DataStage components such as Sequential File, Join, Sort, Merge, Lookup, Transformer, Remove Duplicates, Copy, Filter, Funnel, Dataset, Change Data Capture, and Aggregator.
  • Strong command of database DDL and DML and of data warehousing implementation models.
  • Hands-on experience with the Hadoop ecosystem, including HDFS, Hive, Sqoop, NiFi, and YARN.
  • Familiar with Mainframe ESP for job scheduling.
  • Implementation experience with indexes, table partitioning, collections, analytic functions, and materialized views, including creating and managing tables, views, constraints, and indexes.
  • Experienced in CI/CD processes using Jenkins and SourceTree.
  • Proficient with ServiceNow, Confluence, Bitbucket, and JIRA.

Preferred Qualifications

  • Experience integrating EDL with modern lakehouse platforms (Databricks, Snowflake, Synapse, BigQuery).
  • Understanding of machine learning pipelines and real-time analytics use cases.
  • Exposure to data mesh or domain-driven data architectures.
  • Certifications in Hadoop, Cloudera, AWS, or Azure data services.