California Dialysis Regulation
Data Engineering and Statistical Modeling
- Engineered a multi-stage ETL pipeline in Python to integrate and clean 1M+ semi-structured records.
- Developed interactive geospatial visualizations and animated time series graphs.
- Created a custom database solution enabling complex cross-table analysis and feature importance modeling.
- Applied feature importance analysis to identify key predictors and improve model prediction accuracy.
Diabetes Prediction Model
ML Development
- Developed an end-to-end diabetes prediction ML pipeline, evaluating multiple classifier architectures.
- Implemented systematic imputation, transformation, and feature engineering pipelines.
- Implemented ensemble models and cross-validation for improved model performance and robustness.