Data Scientist

Research, Development & Cloud Operations | Minneapolis, Minnesota | Broomfield, Colorado

As a leading provider of global information security solutions, Code42 secures the ideas of more than 50,000 organizations worldwide, including the most recognized brands in business and education. Because Code42 collects and indexes every version of every file, the company offers security, legal and IT teams total visibility and recovery of data wherever it lives and moves. Founded in 2001, the company is headquartered in Minneapolis, Minnesota, with offices in London, Munich, San Francisco, Denver and Washington, D.C. We are proud to be funded by Accel Partners, JMI Equity, NEA and Split Rock Partners.

Code42 is committed to providing all employees with engaging and challenging work, opportunity for growth, an equal voice to drive innovation, and an environment that cultivates authenticity. In return, we look for people who are inquisitive, enjoy solving complex problems, collaborate effectively, think creatively and provide diverse insights to help us all think better and differently. Come join us and #BeCode42.


  • Producing new and creative analytic solutions that will become part of the Code42 SaaS products.
  • Developing machine learning models to drive new features.
  • Evaluating the predictive and operational performance of different machine learning approaches.
  • Creating stories to influence next-generation product features.
  • General data wrangling, analysis, and reporting for internal research purposes.
  • Generating and presenting reports and analysis of model quality.
  • Coordinating with company stakeholders to implement models and monitor outcomes.
  • Collaborating with teammates across the product development organization.


  • Typically has a bachelor's degree and 3 or more years of professional experience, or can convincingly demonstrate this level of skill.
  • Proficiency with statistical programming languages such as R or Python.
  • Experience with distributed data systems such as Apache Hadoop, Apache Spark or Apache Flink.
  • Expertise in analyzing and modeling large, high-dimensional datasets.
  • Professional experience with machine learning techniques and their application in real-world scenarios.
  • Ability to programmatically gather and manipulate datasets in multiple formats such as JSON, Parquet, or CSV.
  • Strong problem-solving skills with an emphasis on product development.
  • Good verbal and written communication skills to present findings.
  • Driven to learn new technologies, statistical methods and data manipulation techniques.


  • Experience with Amazon Web Services data storage services such as Relational Database Service (RDS), Redshift, S3, or Athena.
  • Familiarity with Amazon Web Services analytics tools such as Elastic MapReduce (EMR), Amazon QuickSight, or Kinesis Data Analytics.
  • Familiarity with Agile / Scrum development practices and "Software as a Service" solutions.