As a data engineer at Health at Scale, you will work with an exceptional team of engineers, scientists, and clinicians to design, launch, and maintain scalable, secure, high-availability, storage-efficient data ecosystems for real-world production use, supporting offline and real-time analytic pipelines for millions of users. You will play a core role in advancing the company’s data fabric to ensure high-quality data streams for enterprise use cases.

Responsibilities

  • Partner with scientific, engineering and clinical teams to understand, define and develop enterprise data workflows, pipelines and tools
  • Identify opportunities to advance data pipelines/processes and productionize data infrastructure for scalable, secure, high-availability, and storage-efficient use
  • Triage and correct data quality issues
  • Build and maintain data knowledge that documents data provenance and transformations (e.g., active metadata, knowledge graphs)
  • Build and enhance data pipelines and data warehouses for machine intelligence

Requirements

  • BS, MS or PhD in Computer Science or related technical field
  • 2+ years of experience with data engineering in industry or academia
  • Strong proficiency in Python (preferred), Java or C/C++
  • Experience in cloud computing, parallel/distributed computing, data warehousing, workflow management and ETL
  • Familiarity with tools and techniques for handling missing, erroneous or unusual data values
  • Excellent communication skills

Health at Scale is an equal opportunity employer and is committed to diversity in its hiring and business practices. To all recruitment agencies: Health at Scale does not accept agency resumes. Please do not forward resumes to this job alias, company employees or any organization location. Health at Scale is not responsible for any fees related to unsolicited resumes.