Writing on open-source data science.

Posts about data analysis, modeling, package development, causal inference, and more. To see my full archive of blog posts prior to 2023, check out this repository.

Data science as an atomic habit

The most important habits are small and consistent, and their benefits build slowly and compound over time.

Tidy causal DAGs with ggdag 0.2.0

Learn about new features and looks in the ggdag package for making and analyzing causal DAGs.

Introducing the partition package

Introducing partition, a fast and flexible data reduction framework that minimizes information loss and creates interpretable clusters.