Data Engineer

Engineering Pune, Maharashtra


Company Description:

Infostretch is a digital-native professional services firm. By combining our in-depth experience with niche digital technologies, ready-made tools, frameworks, technologies, and partnerships, we help enterprises get digital right, the first time. Backed by leading private equity firms Goldman Sachs and Everstone Group, the company is trusted by leading Fortune 100 companies as well as emerging innovators to deliver solutions that work seamlessly across channels, leverage predictive analytics to optimize the software lifecycle, and support continuous innovation.

The company has been certified as a Great Place to Work for consecutive years, thanks to its employee-friendly culture, which has enabled consistent double-digit growth.

Technical Skills:

  • Strong programming skills. Must be proficient in one of the following languages: Python / Scala / Java
  • Must have working knowledge of PySpark, pandas DataFrames, Spark SQL, etc.
  • Working knowledge of messaging and data pipeline tools like Apache Kafka, Amazon Kinesis
  • Must have experience developing APIs using frameworks like Flask/Django etc.
  • Experience with stream-processing systems: Apache Spark Streaming, Apache Storm, etc.
  • Experience working with open table / in-memory table formats for huge analytics datasets: Iceberg, Parquet, Arrow, Avro, etc.
  • Experience writing and understanding complex SQL queries
  • Experience with AWS cloud services, including data pipeline and governance tools like AWS Glue.
  • Experience working with NoSQL databases like Apache Solr, DynamoDB, MongoDB.
  • Experience working with data warehouse tools like Amazon Redshift, Snowflake.
  • Must have worked with large structured, semi-structured, and unstructured data sets from real-time streaming and batch data feeds.