Description
Formation Bio is seeking a Senior Data Engineer to join the Scientific Data Intelligence (SDI) team. The successful candidate will help transform real-world data (RWD) into structured, analytics-ready assets.
Responsibilities:
- Model and transform raw EHR and claims data into clean, canonical, analytics-ready datasets using SQL, Python, and clinical standards such as OMOP.
- Build and manage scalable data pipelines using Dagster for orchestration, dbt for transformation, and Snowflake as the primary compute and storage engine.
- Conduct hands-on RWD analyses to answer scientific and strategic research questions.
- Partner with Data Scientists and clinical leads to design and execute observational studies.
- Implement frameworks for data validation, completeness checks, and observability.
- Apply generative AI techniques within transformation and analysis layers.
- Communicate findings clearly to both technical and non-technical stakeholders.
Requirements:
- 5+ years of experience in data engineering, ideally with at least 2 years working in healthcare or life sciences.
- Experience with ontologies and biomedical schemas (e.g., UMLS, LOINC, ICD-9/ICD-10, MeSH).
- Fluency in SQL and Python, and experience building and maintaining production-grade pipelines.
- Experience building longitudinal patient cohorts from EHR or claims data.
- Solid understanding of causal inference frameworks.
- Working familiarity with real-world evidence study design concepts.
- Hands-on expertise with modern data infrastructure, such as Snowflake, dbt, and Dagster.
Total Compensation Range: $204,500 - $267,000
To apply, use the employer's original posting:
https://job-boards.greenhouse.io/formationbio/jobs/7757932