An R Package for visualizing and analyzing causal directed acyclic graphs


A fast and flexible framework for agglomerative partitioning in R


Tidy and plot meta-analyses in R


An R Package to write YAML for R Markdown, bookdown, blogdown, and more

Functional programming with purrr

One of R’s most powerful features is that it’s a functional programming language. purrr is a consistent and efficient toolkit for programming with functions and working with lists. At its heart is `map()` and friends: functions for the common pattern …
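As a minimal sketch of the `map()` pattern described above, the snippet below applies a function to every element of a list. The `scores` list is hypothetical example data, not something from the original post.

```r
library(purrr)

# Hypothetical example data: a named list of numeric vectors
scores <- list(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9))

# map() always returns a list with one element per input element
means_list <- map(scores, mean)

# map_dbl() returns a double vector and errors if any result
# is not a single number
means_dbl <- map_dbl(scores, mean)

# Formula shorthand for an anonymous function; .x is the current element
ranges <- map_dbl(scores, ~ max(.x) - min(.x))
```

The typed variants such as `map_dbl()` and `map_chr()` are useful when the shape of the result should be guaranteed rather than assumed.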

Tidy causal DAGs with ggdag 0.2.0

I’m pleased to announce that ggdag 0.2.0 is now on CRAN! ggdag links the dagitty package, which contains powerful algorithms for analyzing causal DAGs, with the unlimited flexibility of ggplot2. ggdag converts dagitty objects to a tidy DAG data structure, which allows you to both analyze your DAG and plot it easily in ggplot2. Let’s look at an example for a causal diagram of the effect of smoking on cardiac arrest.
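A rough sketch of that workflow might look like the following. Only smoking and cardiac arrest come from the summary above; the other nodes (`cholesterol`, `weight`, `unhealthy`) and the arrows between them are assumptions made up for illustration, not necessarily the DAG used in the post.

```r
library(ggdag)
library(ggplot2)

# Node layout is randomized, so fix the seed for a reproducible plot
set.seed(1234)

# Illustrative DAG: every node other than smoking and cardiacarrest
# is an assumption for this sketch
smoking_dag <- dagify(
  cardiacarrest ~ cholesterol,
  cholesterol ~ smoking + weight,
  smoking ~ unhealthy,
  weight ~ unhealthy,
  exposure = "smoking",
  outcome = "cardiacarrest"
)

# Convert the dagitty object to a tidy DAG data structure
tidy_dag <- tidy_dagitty(smoking_dag)

# Plot with ggplot2; theme_dag() removes axes and grid lines
dag_plot <- ggdag(tidy_dag) + theme_dag()
```

Because `dag_plot` is an ordinary ggplot object, further layers and themes can be added with `+` as usual.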

#SER2019 In Review

The annual meeting of the Society for Epidemiologic Research (SER) took place June 18-21. The past two years, I’ve collected Twitter data (2018, 2019). The data were collected with the excellent rtweet package, and the data collection code was based on related code by Mike Kearney, the author of rtweet.

Setup

    # for everything else :)
    library(tidyverse)
    # for tidy eval
    library(rlang)
    # for labeling tweets in plots
    library(ggrepel)
    # for network graphs
    library(ggraph)
    library(tidygraph)
    # for text analysis
    library(tidytext)

Since the data were collected over several days, I’m going to read the saved data straight from GitHub.

Introducing the partition package

I’m pleased to announce the CRAN release of partition 0.1.0. partition is a fast and flexible data reduction framework that minimizes information loss and creates interpretable clusters. partition uses agglomerative clustering: it starts from the ground up, matching pairs of variables and assessing the amount of information that would be explained by their reduction. If the information is above a user-specified threshold, the data is reduced. This type of reduction is particularly useful in very redundant data, such as high-resolution genetic data.
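A minimal sketch of this workflow, assuming the package's `simulate_block_data()` helper is used to generate correlated test data. The block sizes, correlations, and the threshold of 0.6 are arbitrary choices for illustration.

```r
library(partition)

set.seed(1234)

# Simulate 12 variables in three correlated blocks (sizes 3, 4, and 5);
# all of these settings are arbitrary illustration values
df <- simulate_block_data(
  block_sizes = c(3, 4, 5),
  lower_corr = 0.4,
  upper_corr = 0.6,
  n = 100
)

# Reduce the data, keeping a reduction only if it retains at least
# the threshold amount of information
prt <- partition(df, threshold = 0.6)

# Extract the reduced data set
reduced <- partition_scores(prt)
```

Raising the threshold makes the reduction more conservative, so fewer variables are merged.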

Why should I use the here package when I'm already using projects?

TL;DR: Why should I use here? The here package bottles up several small best practices for referencing files in your project. You could manufacture most of these yourself using a combination of RStudio projects and clever file paths, but the here package is useful because it streamlines these practices without you needing to think about it. The main benefits: here works from the project up. That makes it easy to reference other sub-folders in your directory.
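As a small illustration of working "from the project up", the sketch below builds a path relative to the project root. The `data/raw/survey.csv` file is hypothetical, so the example only constructs the path and does not read it.

```r
library(here)

# here() with no arguments returns the project root that was detected
project_root <- here()

# Build a path to a hypothetical file from the project root,
# regardless of the current working directory
raw_path <- here("data", "raw", "survey.csv")
```

The same call resolves to the same file whether it runs from the project root, a sub-folder, or a rendered R Markdown document within the project.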

When interaction is not interaction: confounding and measurement error

Last week, I presented ggdag at JSM in Vancouver. As you can imagine, I had a lot of conversations with people about DAGs, confounding, colliders, and all the types of bias that can arise in research. One strange type of bias came up a couple of times that I don’t see discussed very often: measuring either the effect you are studying (x) or a variable along a confounding pathway (z) incorrectly can make it appear as if there is an interaction between x and z, even if there isn’t one.
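One way this can happen is sketched in the simulation below. It assumes a binary x, a continuous confounder z, and measurement error in z that is larger when x = 1. These choices are assumptions made for illustration and are not necessarily the scenario discussed in the post. The true model for y contains no interaction.

```r
set.seed(2018)
n <- 10000

# z confounds the x-y relationship; x is binary and depends on z
z <- rnorm(n)
x <- rbinom(n, 1, plogis(z))

# True outcome model: additive effects only, no x:z interaction
y <- x + z + rnorm(n)

# Assumption for this sketch: z is measured with more error when x = 1
error_sd <- ifelse(x == 1, 1, 0.2)
z_measured <- z + rnorm(n, sd = error_sd)

# With the true z, the interaction estimate should be near zero
fit_true <- lm(y ~ x * z)

# With the mismeasured z, the slope is attenuated more when x = 1,
# which shows up as an apparent interaction
fit_measured <- lm(y ~ x * z_measured)

interaction_true <- coef(fit_true)[["x:z"]]
interaction_measured <- coef(fit_measured)[["x:z_measured"]]
```

Comparing `interaction_true` with `interaction_measured` shows how an interaction term can appear purely because of how z was measured.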