PheWeb β-Matrix Builder
[ A parallel ETL pipeline that pulls PheWAS association data from multiple PheWeb instances into a unified β-matrix. ]
Overview
[ Headline summary. ]
[ Detailed overview — what was built, why. ]
Technologies
language
Python
data & ml
pandas
NumPy
scikit-learn
infrastructure
parallel ETL
requests
data sources
MGI-BioVU
UKB-TOPMed
MGI PheWeb
The Problem
[ Headline problem statement. ]
[ Detailed problem context. ]
💡
How might we
[ Framing question for this project. ]
Product Vision
[ The end state. ]
[ Detail on user value. ]
Core architecture
[ Architecture title. ]
[ Technical description: schema normalization, allele orientation alignment, sign consistency. ]
—
variants
—
phenotypes
—
sources
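As an illustration of the allele orientation alignment and sign consistency named above, here is a minimal sketch. It assumes each source has already been normalized to a long table with hypothetical columns `chrom`, `pos`, `ref`, `alt`, and `beta`; those names, and the function `align_to_reference`, are illustrative and not taken from the project. When a source reports a variant with ref and alt swapped relative to a chosen reference orientation, the effect size changes sign; variants that match in neither orientation are dropped. Strand flips (complemented alleles) are deliberately not handled here.

```python
import pandas as pd


def align_to_reference(df: pd.DataFrame, reference: pd.DataFrame) -> pd.DataFrame:
    """Orient the betas in `df` to the ref/alt orientation used by `reference`.

    Assumed (hypothetical) columns: chrom, pos, ref, alt in both frames,
    plus beta in `df`.
    """
    key = ["chrom", "pos"]
    merged = df.merge(
        reference[key + ["ref", "alt"]],
        on=key,
        suffixes=("", "_target"),
    )

    # Same orientation: alleles match as reported.
    same = (merged["ref"] == merged["ref_target"]) & (
        merged["alt"] == merged["alt_target"]
    )
    # Swapped orientation: the source's ref is the reference's alt and vice versa.
    swapped = (merged["ref"] == merged["alt_target"]) & (
        merged["alt"] == merged["ref_target"]
    )

    # Keep only variants that can be reconciled with the reference.
    merged = merged[same | swapped].copy()
    flip = swapped.loc[merged.index]

    # A swapped allele pair means the effect was measured on the other allele,
    # so the sign of beta is inverted.
    merged.loc[flip, "beta"] = -merged.loc[flip, "beta"]
    merged["ref"] = merged["ref_target"]
    merged["alt"] = merged["alt_target"]

    return merged.drop(columns=["ref_target", "alt_target"]).reset_index(drop=True)
```

Joining on position first and then filtering on allele match keeps multi-allelic sites from being collapsed, since a position with several alternate alleles only retains the rows whose allele pair agrees with the reference.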
My Contribution
[ Headline of your contribution. ]
[ Detail on what you built and owned. ]
What I worked on
- [ Parallel crawling across PheWeb instances ]
- [ Schema normalization + allele orientation alignment ]
- [ β-matrix construction (sign-consistent) ]
- [ Truncated SVD for latent space ]
- [ Validation against known comorbidity pairs ]
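The last three steps in the list above (β-matrix construction, truncated SVD, and a comorbidity-style check) can be sketched as follows. This is a sketch under stated assumptions, not the project's implementation: the column names `variant_id`, `phenocode`, and `beta` are hypothetical, missing variant-phenotype cells are filled with 0.0, duplicate records are averaged, and cosine similarity in the latent space stands in for whatever validation metric was actually used.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import TruncatedSVD


def build_beta_matrix(long_df: pd.DataFrame) -> pd.DataFrame:
    """Pivot long-format records into a variants x phenotypes beta matrix.

    Assumed (hypothetical) columns: variant_id, phenocode, beta.
    """
    matrix = long_df.pivot_table(
        index="variant_id",
        columns="phenocode",
        values="beta",
        aggfunc="mean",  # assumption: average duplicate records
    )
    # Assumption: an unobserved association is treated as zero effect.
    return matrix.fillna(0.0)


def latent_embedding(beta_matrix: pd.DataFrame, n_components: int = 2, seed: int = 0):
    """Embed each phenotype in a low-dimensional space with truncated SVD."""
    svd = TruncatedSVD(n_components=n_components, random_state=seed)
    # Transpose so that rows are phenotypes and columns are variants.
    coords = svd.fit_transform(beta_matrix.T.to_numpy())
    embedding = pd.DataFrame(coords, index=beta_matrix.columns)
    return embedding, svd.explained_variance_ratio_


def cosine_similarity(embedding: pd.DataFrame, pheno_a: str, pheno_b: str) -> float:
    """Cosine similarity between two phenotype embeddings."""
    a = embedding.loc[pheno_a].to_numpy()
    b = embedding.loc[pheno_b].to_numpy()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

A validation pass along these lines would compare the similarity of known comorbid phenotype pairs against that of unrelated pairs. Note that `TruncatedSVD` does not center the data, which suits a sparse matrix where most cells are zero-filled.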
Key Achievements
[ Headline outcome. ]
[ Narrative on results. ]
—
metric one
—
metric two
—
metric three
Lessons Learned
[ The single biggest takeaway. ]
[ What worked, what didn't. ]
Where It's Going
[ Next steps. ]
[ Longer-term direction. ]