Tech Apprentice, Data Management

Job ID 2022-4596

Technology

Hybrid Remote, New York, New York


Position at WebMD

WebMD is the most recognized and trusted brand of health information and the leading provider of health information services, serving consumers, physicians, healthcare professionals, employers and health plans through our public and private online portals and WebMD the Magazine. The WebMD Health Network includes WebMD, Medscape, MedicineNet, eMedicine, RxList, and Medscape Education. Our consumer portals and mobile health applications provide engaging, relevant and credible health and wellness information, personalized health assessment tools and access to online communities.

WebMD is an Equal Opportunity/Affirmative Action employer and does not discriminate on the basis of race, ancestry, color, religion, sex, gender, age, marital status, sexual orientation, gender identity, national origin, medical condition, disability, veteran status, or any other basis protected by law.

Position Overview:

WebMD’s Data Management team is looking for an aspiring Data Engineer with a passion for data.  You don’t need to be a doctor to work with data at WebMD, but you do have to be able to diagnose and solve data problems.  Our data can predict and forecast the cold and flu season better and faster than the CDC.  We use collaborative filtering to recommend articles you might like and to personalize each newsletter.  If you’ve visited us, you’re probably in many of our look-alike models, and we’re waiting for you to revisit.

We support databases that can respond in milliseconds or aggregate billions of rows in seconds.  We process 300 GB of data every day and support data warehouses, data lakes, and data marts.  We manage the data as well as the infrastructure that supports it.

Congratulations, you’ve graduated or are about to graduate.  It’s not easy; I’ve been there too.  You probably coded all night in Java or Python.  You probably worked for days to get measurable results from your neural network models.  You probably ran the same SQL in Beeline and PySpark and got two different results that didn’t make any sense.  We go through the same problems!

Must have:

  • Completed, or in the final year of, a BS in Computer Science or a related degree
  • Based in the tri-state area (NY, NJ, CT); we are located between SoHo and the West Village
  • Experience with SQL.  If you want to do anything with data, this is a must.
  • Experience with Python, Scala or Java
  • Experience with the Hadoop ecosystem, such as Hive, Tez, Sqoop, and Ambari
  • Experience with Spark, especially PySpark

Nice to have (or look forward to learning):

  • Experience with MPP databases such as Vertica, Snowflake, and Redshift
  • Experience with streaming, such as Kafka or Spark Streaming
  • Experience with ETL tools such as Talend, Pentaho, and Glue
  • Understanding of web analytics
  • Understanding of the online advertising world
  • Love coffee!  We have a full barista in the office!

Compensation: $20/hr