Projects

PheWeb β-Matrix Builder

[ A parallel ETL pipeline that pulls PheWAS/GWAS association data from multiple PheWeb instances into a unified β-matrix. ]

Timeline: 2025 – Present
Company: UM Medical School
Role: Research Engineer
Stack: Python · pandas · scikit-learn

Overview

[ Headline summary. ]

[ Detailed overview — what was built, why. ]

Technologies

language: Python
data & ml: pandas · NumPy · scikit-learn
infrastructure: parallel ETL · requests
data sources: MGI-BioVU · UKB-TOPMed · MGI PheWeb

The Problem

[ Headline problem statement. ]

[ Detailed problem context. ]

💡
How might we
[ Framing question for this project. ]

Product Vision

[ The end state. ]

[ Detail on user value. ]

Core architecture

[ Architecture title. ]

[ Technical description: schema normalization, allele orientation alignment, sign consistency. ]
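As an illustration of allele orientation alignment and sign consistency, here is a minimal sketch in pandas. It assumes each source has already been normalized to a long table with the hypothetical columns `chrom`, `pos`, `ref`, `alt`, `phenocode`, and `beta`, and that a reference variant list defines the canonical orientation. These names and the reference table are assumptions for the example, not the project's actual schema. Strand flips and palindromic (A/T, C/G) variants are not handled here.

```python
import pandas as pd

# Canonical orientation per variant (hypothetical toy reference, not real data).
REFERENCE = pd.DataFrame({
    "chrom": ["1", "2"],
    "pos": [1000, 2000],
    "ref": ["A", "C"],
    "alt": ["G", "T"],
})


def align_to_reference(df: pd.DataFrame, reference: pd.DataFrame) -> pd.DataFrame:
    """Orient every beta to the reference's alt allele.

    Rows whose alleles match the reference are kept as-is, rows with
    ref/alt swapped get their beta negated, and everything else is dropped.
    """
    merged = df.merge(reference, on=["chrom", "pos"], suffixes=("", "_ref"))
    same = (merged["ref"] == merged["ref_ref"]) & (merged["alt"] == merged["alt_ref"])
    swapped = (merged["ref"] == merged["alt_ref"]) & (merged["alt"] == merged["ref_ref"])

    out = merged[same | swapped].copy()
    flip = swapped[same | swapped]
    out.loc[flip, "beta"] = -out.loc[flip, "beta"]  # effect now refers to reference alt
    out["ref"] = out["ref_ref"]
    out["alt"] = out["alt_ref"]
    return out[["chrom", "pos", "ref", "alt", "phenocode", "beta"]]


# Toy input: one matching row, one swapped row, one allele mismatch, one unknown variant.
raw = pd.DataFrame({
    "chrom": ["1", "2", "1", "3"],
    "pos": [1000, 2000, 1000, 3000],
    "ref": ["A", "T", "A", "G"],
    "alt": ["G", "C", "C", "A"],
    "phenocode": ["X", "X", "Y", "X"],
    "beta": [0.5, 0.2, 0.9, 0.1],
})
aligned = align_to_reference(raw, REFERENCE)
```

In this toy input the swapped row at position 2000 comes back with its β negated, and the two rows that cannot be reconciled with the reference are dropped rather than guessed at.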

variants
phenotypes
sources

My Contribution

[ Headline of your contribution. ]

[ Detail on what you built and owned. ]

What I worked on

  • [ Parallel crawling across PheWeb instances ]
  • [ Schema normalization + allele orientation alignment ]
  • [ β-matrix construction (sign-consistent) ]
  • [ Truncated SVD for latent space ]
  • [ Validation against known comorbidity pairs ]
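The steps listed above can be sketched end to end as follows. This is a rough outline under stated assumptions, not the project's implementation: `fetch_one` is a hypothetical callable standing in for whatever downloads one phenotype from one PheWeb instance, the column names `variant_id`, `phenocode`, and `beta` are illustrative, missing associations are filled with zero (an assumption), and the toy values are synthetic. Cosine similarity in the latent space is used here as one possible way to check a known pair.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import pandas as pd
from sklearn.decomposition import TruncatedSVD


def crawl(jobs, fetch_one, max_workers=8):
    """Call fetch_one(source, phenocode) for each job in parallel threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        frames = list(pool.map(lambda job: fetch_one(*job), jobs))
    return pd.concat(frames, ignore_index=True)


def build_beta_matrix(long_df):
    """Pivot long-format rows into a variants x phenotypes matrix of betas."""
    matrix = long_df.pivot_table(
        index="variant_id", columns="phenocode", values="beta", aggfunc="mean"
    )
    return matrix.fillna(0.0)  # assumption: no reported association means beta = 0


def embed_phenotypes(beta, n_components=2, seed=0):
    """Project each phenotype (a column of beta) into a low-rank latent space."""
    svd = TruncatedSVD(n_components=n_components, random_state=seed)
    coords = svd.fit_transform(beta.T.to_numpy())
    return pd.DataFrame(coords, index=beta.columns)


def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Synthetic stand-in for remote data: pheno_a and pheno_b share the same betas.
FAKE = {
    "pheno_a": [0.3, -0.1, 0.0, 0.2, 0.5],
    "pheno_b": [0.3, -0.1, 0.0, 0.2, 0.5],
    "pheno_c": [-0.2, 0.4, 0.1, 0.0, -0.3],
    "pheno_d": [0.1, 0.2, -0.4, 0.3, 0.0],
}


def fake_fetch(source, phenocode):
    """Hypothetical fetcher; a real one would issue an HTTP request instead."""
    betas = FAKE[phenocode]
    return pd.DataFrame({
        "variant_id": [f"v{i}" for i in range(len(betas))],
        "phenocode": phenocode,
        "beta": betas,
    })


jobs = [("demo-source", phenocode) for phenocode in FAKE]
long_df = crawl(jobs, fake_fetch)
beta = build_beta_matrix(long_df)            # 5 variants x 4 phenotypes
emb = embed_phenotypes(beta, n_components=2)
sim = cosine(emb.loc["pheno_a"], emb.loc["pheno_b"])
```

Threads suit this kind of crawl because the work is I/O-bound. Passing the fetcher in as an argument keeps the pipeline testable without network access.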

Key Achievements

[ Headline outcome. ]

[ Narrative on results. ]

metric one
metric two
metric three

Lessons Learned

[ The single biggest takeaway. ]

[ What worked, what didn't. ]

Where It's Going

[ Next steps. ]

[ Longer-term direction. ]
