Data Engineer

Software Engineering | New York, New York


About Your Role:

Dotdash is looking for a Data Engineer with strong Python coding skills and data warehousing experience to help lead our data infrastructure process. This person will work closely with several teams, including Growth & Content applications development, data operations, data science and content quality, to build new ETL pipelines and warehousing solutions while improving existing architectures. We are looking for someone who is passionate about working with data and can take ownership of building robust data pipelines. This person will often need to analyze different data entities to ensure that the data is clean and that data integrity persists from the time the data enters our ETL pipelines all the way through to the end user.

About Your Contributions:

  • You will drive the advancement of the Growth & Content team's complex data stores and ETL infrastructure by designing and implementing the underlying logic and structure for how data is set up, cleansed, and ultimately stored for organizational usage.
  • You will define, document and evangelize data usage and best practices across the team.
  • You will scope points of failure and create alerts to ensure continual data cleanliness and accuracy.
  • You will write code and automate builds to connect to data stores (both within the company and external) to pull and store data for organizational use.
  • As one of the points of contact between the Growth & Content engineering team and other teams, you will work closely with end users, application developers, the data operations team and the data science team.
  • You will write excellent code to efficiently crawl millions of pages belonging to numerous sites, parse the crawled pages and store the data in a clean and structured manner in our data warehouse.
  • You will research, analyze and implement new tech stacks and data pipelining architectures to optimize our data warehousing.

About You:

  • You have 2-3 years of experience working with back-end data systems to create clean data structures for use across the organization.
  • You have experience writing SQL queries to manipulate data. Window functions and nested subqueries are second nature to you.
  • You’re proficient at writing scripts with Python.
  • You love solving a good challenge. Taking a large problem, deconstructing it to make a plan of attack, and ultimately executing on that plan is something that excites you.
  • You have an interest in learning more about content quality and growth, big data technologies and data science projects.
  • You’re comfortable communicating and presenting complex topics and plans to people in the organization, regardless of their technical skill level.
  • You feel at home with the AWS infrastructure, including the use of S3, EC2, IAM and Redshift.

About Us:

For more than 20 years, Dotdash brands have been helping people find answers, solve problems, and get inspired. We are one of the 20 largest content publishers on the Internet according to comScore, a leading Internet measurement company, and reach more than 30% of the U.S. population every month. Our brands collectively have won more than 20 industry awards in the last year alone and, most recently, Dotdash was named Publisher of the Year by Digiday, a leading industry publication. Our brands include Verywell, The Spruce, The Balance, Investopedia, Lifewire, Byrdie, MyDomaine, TripSavvy and ThoughtCo.

Dotdash embraces inclusivity and values our diverse community. We are committed to building a team based on qualifications, merit and business need. We are proud to be an equal opportunity employer and do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.