# An Introduction to Directed Acyclic Graphs

## 2018/06/01

library(ggdag)
knitr::opts_chunk$set(cache = TRUE)

dagify(y ~ x) %>%
  ggdag()

You also sometimes see edges that look bi-directed, like this:

dagify(y ~~ x) %>%
  ggdag()

But this is actually shorthand for an unmeasured cause of the two variables (in other words, unmeasured confounding):

# canonicalize the DAG: Add the latent variable in to the graph
dagify(y ~~ x) %>%
  ggdag_canonical()

A DAG is also acyclic, which means that there are no feedback loops; a variable can’t be its own descendant. The above are all DAGs because they are acyclic, but this is not:

dagify(y ~ x,
       x ~ a,
       a ~ y) %>%
  ggdag()

## Structural Causal Graphs

ggdag is more specifically concerned with structural causal models (SCMs): DAGs that portray causal assumptions about a set of variables. Beyond being useful conceptions of the problem we’re working on (which they are), this also allows us to lean on the well-developed links between graphical causal paths and statistical associations. Causal DAGs are mathematically grounded, but they are also consistent and easy to understand. Thus, when we’re assessing the causal effect between an exposure and an outcome, drawing our assumptions in the form of a DAG can help us pick the right model without having to know much about the math behind it.

Another way to think about DAGs is as non-parametric structural equation models (SEM): we are explicitly laying out paths between variables, but in the case of a DAG, it doesn’t matter what form the relationship between two variables takes, only its direction. The rules underpinning DAGs are consistent whether the relationship is a simple, linear one, or a more complicated function.

### Relationships between variables

Let’s say we’re looking at the relationship between smoking and cardiac arrest.
We might assume that smoking causes changes in cholesterol, which causes cardiac arrest:

smoking_ca_dag <- dagify(cardiacarrest ~ cholesterol,
                         cholesterol ~ smoking + weight,
                         smoking ~ unhealthy,
                         weight ~ unhealthy,
                         labels = c("cardiacarrest" = "Cardiac\n Arrest",
                                    "smoking" = "Smoking",
                                    "cholesterol" = "Cholesterol",
                                    "unhealthy" = "Unhealthy\n Lifestyle",
                                    "weight" = "Weight"),
                         latent = "unhealthy",
                         exposure = "smoking",
                         outcome = "cardiacarrest")

ggdag(smoking_ca_dag, text = FALSE, use_labels = "label")

The path from smoking to cardiac arrest is directed: smoking causes cholesterol to rise, which then increases risk for cardiac arrest. Cholesterol is an intermediate variable between smoking and cardiac arrest. Directed paths are also chains, because each is causal on the next. Let’s say we also assume that weight causes cholesterol to rise and thus increases risk of cardiac arrest. Now there’s another chain in the DAG: from weight to cardiac arrest. However, this chain is indirect, at least as far as the relationship between smoking and cardiac arrest goes.

We also assume that a person who smokes is more likely to be someone who engages in other unhealthy behaviors, such as overeating. On the DAG, this is portrayed as a latent (unmeasured) node, called unhealthy lifestyle. Having a predilection towards unhealthy behaviors leads to both smoking and increased weight. Here, the relationship between smoking and weight is through a forked path (weight <- unhealthy lifestyle -> smoking) rather than a chain; because they have a mutual parent, smoking and weight are associated (in real life, there’s probably a more direct relationship between the two, but we’ll ignore that for simplicity).

Forks and chains are two of the three main types of paths:

1. Chains
2. Forks
3. Inverted forks (paths with colliders)

An inverted fork is when two arrowheads meet at a node, which we’ll discuss shortly.
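To make the three path types concrete, here is a minimal sketch that asks dagitty (the package ggdag builds on) whether each one transmits an association between its end nodes. The node names `x`, `m`, and `y` and the helper `check_path()` are placeholders made up for this illustration; `dseparated()` returns `TRUE` when no open path links the two nodes, a notion discussed in more detail later on.

```r
library(dagitty)

# Three elementary path types between x and y, each passing through m
chain    <- dagitty("dag { x -> m -> y }")
fork     <- dagitty("dag { x <- m -> y }")
collider <- dagitty("dag { x -> m <- y }")

# Hypothetical helper: is the path blocked (a) with nothing conditioned on,
# and (b) after conditioning on the middle node m?
check_path <- function(g) {
  c(blocked_unadjusted = dseparated(g, "x", "y", list()),
    blocked_given_m    = dseparated(g, "x", "y", "m"))
}

path_status <- sapply(list(chain = chain, fork = fork, collider = collider),
                      check_path)
path_status
```

Chains and forks should come back open until `m` is conditioned on, while the inverted fork should show the reverse pattern: blocked by default and opened by conditioning on `m`.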
There are also common ways of describing the relationships between nodes: parents, children, ancestors, descendants, and neighbors (there are a few others, as well, but they refer to less common relationships). Parents and children refer to direct relationships; ancestors and descendants can be anywhere along the path to or from a node, respectively. Here, smoking and weight are both parents of cholesterol, while smoking and weight are both children of an unhealthy lifestyle. Cardiac arrest is a descendant of an unhealthy lifestyle, which is in turn an ancestor of all other nodes in the graph.

So, in studying the causal effect of smoking on cardiac arrest, where does this DAG leave us? We only want to know the directed path from smoking to cardiac arrest, but there also exists an indirect, or back-door, path. This is confounding. Judea Pearl, who developed much of the theory of causal graphs, said that confounding is like water in a pipe: it flows freely in open pathways, and we need to block it somewhere along the way. We don’t necessarily need to block the water at multiple points along the same back-door path, although we may have to block more than one path. We often talk about confounders, but really we should talk about confounding, because it is about the pathway more than any particular node along the path.

Chains and forks are open pathways, so in a DAG where nothing is conditioned upon, any back-door paths must be one of the two. In addition to the directed pathway to cardiac arrest, there’s also an open back-door path through the forked path at unhealthy lifestyle and on from there through the chain to cardiac arrest:

ggdag_paths(smoking_ca_dag, text = FALSE, use_labels = "label")

We need to account for this back-door path in our analysis. There are many ways to go about that (stratification, including the variable in a regression model, matching, inverse probability weighting), all with pros and cons.
But each strategy must include a decision about which variables to account for. Many analysts take the strategy of putting in all possible confounders. This can be bad news, because adjusting for colliders and mediators can introduce bias, as we’ll discuss shortly. Instead, we’ll look at minimally sufficient adjustment sets: sets of covariates that, when adjusted for, block all back-door paths, but include no more and no less than necessary. That means there can be many minimally sufficient sets, and if you remove even one variable from a given set, a back-door path will open. Some DAGs, like the first one in this vignette (x -> y), have no back-door paths to close, so the minimally sufficient adjustment set is empty (sometimes written as “{}”). Others, like the cyclic graph above, or DAGs with important variables that are unmeasured, cannot produce any sets sufficient to close back-door paths.

For the smoking-cardiac arrest question, there is a single set with a single variable: {weight}. Accounting for weight will give us an unbiased estimate of the relationship between smoking and cardiac arrest, assuming our DAG is correct. We do not need to (or want to) control for cholesterol, however, because it’s an intermediate variable between smoking and cardiac arrest; controlling for it blocks the path between the two, which will then bias our estimate (see below for more on mediation).

ggdag_adjustment_set(smoking_ca_dag, text = FALSE, use_labels = "label")

More complicated DAGs will produce more complicated adjustment sets; assuming your DAG is correct, any given set will theoretically close the back-door path between the outcome and exposure. Still, one set may be better to use than the other, depending on your data. For instance, one set may contain a variable known to have a lot of measurement error or with a lot of missing observations. It may, then, be better to use a set that you think is going to be a better representation of the variables you need to include.
Including a variable that doesn’t actually represent the node well will lead to residual confounding.

What about controlling for multiple variables along the back-door path, or a variable that isn’t along any back-door path? Even if those variables are not colliders or mediators, they can still cause a problem, depending on your model. Some estimates, like risk ratios, work fine when non-confounders are included. This is because they are collapsible: risk ratios are constant across the strata of non-confounders. Some common estimates, though, like the odds ratio and hazard ratio, are non-collapsible: they are not necessarily constant across strata of non-confounders and thus can be biased by their inclusion. There are situations, like when the outcome is rare in the population (the so-called [rare disease assumption](https://www.wikiwand.com/en/Rare_disease_assumption)), or when using sophisticated sampling techniques, like [incident-density sampling](https://www.ctspedia.org/do/view/CTSpedia/SampleIncidence), when these estimates approximate the risk ratio. Otherwise, including extra variables may be problematic.

### Colliders and collider-stratification bias

In a path that is an inverted fork (x -> m <- y), the node where two or more arrowheads meet is called a collider (because the paths collide there). An inverted fork is not an open path; it is blocked at the collider. That is to say, we don’t need to account for m to assess the causal effect of x on y; the back-door path is already blocked by m.

Let’s consider an example. Influenza and chicken pox are independent; their causes (influenza viruses and the varicella-zoster virus, respectively) have nothing to do with each other. In real life, there may be some confounders that associate them, like having a depressed immune system, but for this example we’ll assume that they are unconfounded. However, both the flu and chicken pox cause fevers.
The DAG looks like this:

fever_dag <- collider_triangle(x = "Influenza",
                               y = "Chicken Pox",
                               m = "Fever")

ggdag(fever_dag, text = FALSE, use_labels = "label")

If we want to assess the causal effect of influenza on chicken pox, we do not need to account for anything. In the terminology used by Pearl, they are already d-separated (direction separated), because there is no effect on one by the other, nor are there any back-door paths:

ggdag_dseparated(fever_dag, text = FALSE, use_labels = "label")

However, if we control for fever, they become associated within strata of the collider, fever. We open a biasing pathway between the two, and they become d-connected:

ggdag_dseparated(fever_dag, controlling_for = "m", text = FALSE, use_labels = "label")

This can be counter-intuitive at first. Why does controlling for a confounder reduce bias but adjusting for a collider increase it? It’s because whether or not you have a fever tells me something about your disease. If you have a fever, but you don’t have the flu, I now have more evidence that you have chicken pox. Pearl presents it like algebra: I can’t solve y = 10 + m. But when I know that m = 1, I can solve for y.

Unfortunately, there’s a second, less obvious form of collider-stratification bias: adjusting on the descendant of a collider. That means that a variable downstream from the collider can also cause this form of bias. For example, with our flu-chicken pox-fever example, it may be that having a fever leads to people taking a fever reducer, like acetaminophen.
Because fever reducers are downstream from fever, controlling for them induces downstream collider-stratification bias:

dagify(fever ~ flu + pox,
       acetaminophen ~ fever,
       labels = c("flu" = "Influenza",
                  "pox" = "Chicken Pox",
                  "fever" = "Fever",
                  "acetaminophen" = "Acetaminophen")) %>%
  ggdag_dseparated(from = "flu", to = "pox", controlling_for = "acetaminophen",
                   text = FALSE, use_labels = "label")

Collider-stratification bias is responsible for many cases of bias, and it is often not dealt with appropriately. Selection bias, missing data, and publication bias can all be thought of as collider-stratification bias. It becomes trickier in more complicated DAGs; sometimes colliders are also confounders, and we need to either come up with a strategy to adjust for the resulting bias from adjusting the collider, or we need to pick the strategy that’s likely to result in the least amount of bias. See the vignette on [common structures of bias](https://cran.r-project.org/web/packages/ggdag/vignettes/bias-structures.html) for more.

### Mediation

Controlling for intermediate variables may also induce bias, because it decomposes the total effect of x on y into its parts. Depending on the research question, that may be exactly what you want, in which case you should use mediation analysis, e.g. via SEM, which can estimate direct, indirect, and total effects. Let’s return to the smoking example. Here, we only care about how smoking affects cardiac arrest, not the pathways through cholesterol it may take.
Moreover, since cholesterol (at least in our DAG) intercepts the only directed pathway from smoking to cardiac arrest, controlling for it will block that relationship; smoking and cardiac arrest will appear unassociated (note that I’m not including the paths opened by controlling for a collider in this plot for clarity):

ggdag_dseparated(smoking_ca_dag, controlling_for = c("weight", "cholesterol"),
                 text = FALSE, use_labels = "label", collider_lines = FALSE)

Now smoking and cardiac arrest are d-separated. Since our question is about the total effect of smoking on cardiac arrest, our result is now going to be biased.

## Common Structures of Bias

### Introduction

In addition to helping identify causal effects, the consistency of directed acyclic graphs (DAGs) is very useful in thinking about different forms of bias. Often, many seemingly unrelated types of bias take the same form in a DAG. Methodological issues in a study often reduce to a problem of 1) not adequately blocking a back-door path or 2) selecting on some variable that turns out to be a collider.

confounder_triangle(x = "Coffee", y = "Lung Cancer", z = "Smoking") %>%
  ggdag_dconnected(text = FALSE, use_labels = "label")

But if you think about it, this DAG is incorrect. Smoking doesn’t cause coffee drinking; something else, say a predilection towards addictive substances, causes a person to both smoke and drink coffee. Let’s say that we had a psychometric tool that accurately measures addictive behavior, so we can control for it. Now our DAG looks like this:

coffee_dag <- dagify(cancer ~ smoking,
                     smoking ~ addictive,
                     coffee ~ addictive,
                     exposure = "coffee",
                     outcome = "cancer",
                     labels = c("coffee" = "Coffee",
                                "cancer" = "Lung Cancer",
                                "smoking" = "Smoking",
                                "addictive" = "Addictive \nBehavior")) %>%
  tidy_dagitty(layout = "tree")

ggdag(coffee_dag, text = FALSE, use_labels = "label")

Which one is the confounder, addictive behavior or smoking?
Only one or the other needs to be controlled for to block the path:

ggdag_adjustment_set(coffee_dag, text = FALSE, use_labels = "label")

Focusing on individual confounders, rather than confounding pathways, is a common error in thinking about adjusting our estimates. We should be more concerned with blocking all confounding pathways than with including any particular variable along those pathways. (It’s worth noting, though, that when I say you only need one variable to block a path, that assumes you are measuring and modeling the variable correctly. If, for example, there is measurement error, controlling for that variable may still leave residual confounding, necessitating a more subtle level of control with other variables on the pathway; see the section on measurement error below. Moreover, it’s important to know that you expect to block the path on average; normal sampling issues, e.g. sample size, still come into play.)

### Colliders, M-bias, and butterfly bias

Stratifying on a collider is a major culprit in systematic bias. Controlling for a collider (a node where two or more arrowheads meet) induces an association between its parents, through which confounding can flow:

collider_triangle() %>%
  ggdag_dseparated(controlling_for = "m")

Here, the example is fairly simple and easy to deal with: m is a child of x and y and just shouldn’t be adjusted for. It can get more complicated, though, when m is something that seems like it is a confounder or when it represents a variable that contributes to whether or not a person enters the study. Often this takes the form of M-shaped bias, or M-bias. Let’s consider an example from Modern Epidemiology: the association between education and diabetes. Let’s assume that lack of education isn’t a direct cause of diabetes. When we are putting together our analysis, we ask: should we adjust for the participant’s mother’s history of diabetes?
It’s linked to education via income and to participant’s diabetes status via genetic risk, so it looks like this:

m_bias(x = "Education", y = "Diabetes", a = "Income during Childhood",
       b = "Genetic Risk \nfor Diabetes", m = "Mother's Diabetes") %>%
  ggdag(use_labels = "label")

From a classical confounding perspective, it seems like the mother’s diabetes status might be a confounder: it’s associated with both the exposure and outcome, and it’s not a descendant of either. However, the association with the outcome and exposure is not direct for either; it’s due to confounding by genetic risk and childhood income, respectively. Drawing it as a DAG makes it clear that the mother’s diabetes status is a collider, and adjusting for it will induce an association between genetic risk and childhood income, thus opening a back-door path from education to diabetes status:

m_bias(x = "Education", y = "Diabetes", a = "Income during \nChildhood",
       b = "Genetic Risk \nfor Diabetes", m = "Mother's Diabetes") %>%
  ggdag_dseparated(controlling_for = "m", use_labels = "label")

Again, this is relatively easy: just don’t control for the collider. But what about when your study design already stratifies on m? Let’s consider an example where m represents having surgery. Let’s say we are conducting a study involving participants who had surgery due to back pain. We suspect that people who are ready to commit to physical therapy and changes in their daily life will have a greater decrease in pain after a year than those who aren’t. We’ve developed a psychometric tool that measures readiness for surgery, and we are going to look at the association between readiness and change in pain. The readiness scale depends on an underlying latent variable that is true readiness, which we are measuring with some error. Change in pain depends on baseline pain. Both underlying readiness and baseline pain are also predictive of whether or not someone has surgery.
By definition, we are selecting on surgical status: we feel it’s unethical to withhold it and irresponsible to force all participants to have surgery. So, we can’t know how people who didn’t have surgery would react. Surgical status is inherently stratified for, because we’re only looking at people who have had it (note that from here on out, I’m not going to show the paths opened by adjusting for a collider, the default in ggdag, for clarity):

coords <- dagitty::coordinates(m_bias()) %>%
  coords2df()
coords$name <- c("readiness", "pain", "surgery", "ready_tool", "pain_change")

outcome = "pain_change",
pain_change = "Change \nin Pain",
pain = "Baseline \nPain",
surgery = "Surgical \nStatus"),
coords = coords2list(coords)) %>%
control_for("surgery")
ggdag_adjust(surgical_dag, text = FALSE, use_labels = "label", collider_lines = FALSE)

We can’t unstratify by surgical status, so our best bet is to block the back-door path opened by stratifying on it. We can’t adjust for the latent variable (and, if we’re measuring it really well, the latent variable and the tool are one and the same), so all we can do is control for baseline pain. Note that, technically, surgical status is part of the adjustment set:

ggdag_adjustment_set(surgical_dag, text = FALSE, use_labels = "label")
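A small simulation can show why controlling for baseline pain helps here. This is only a sketch under assumptions that are not in the example above: all relationships are linear with made-up coefficients, and measured readiness is given no effect at all on change in pain, so any association that shows up among surgical patients is pure selection bias. The variable names mirror the node names in `coords`.

```r
set.seed(2018)
n <- 200000

readiness   <- rnorm(n)                            # latent, never observed
ready_tool  <- readiness + rnorm(n)                # readiness measured with error
pain        <- rnorm(n)                            # baseline pain
surgery     <- (readiness + pain + rnorm(n)) > 0   # both predict having surgery
pain_change <- -pain + rnorm(n)                    # assumed: no effect of readiness

# The study only ever sees people who had surgery
had_surgery <- data.frame(ready_tool, pain, pain_change)[surgery, ]

biased   <- coef(lm(pain_change ~ ready_tool, data = had_surgery))[["ready_tool"]]
adjusted <- coef(lm(pain_change ~ ready_tool + pain, data = had_surgery))[["ready_tool"]]
round(c(biased = biased, adjusted = adjusted), 3)
```

Within the surgical group the unadjusted coefficient should drift away from zero, while adding baseline pain to the model should pull it back to roughly zero, because baseline pain sits on the path that selection opened.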

What about if a variable is both a collider and a cause of the exposure and outcome (a classical confounder)? This is an extension of M-bias sometimes called butterfly bias or bow-tie bias. The variable is a part of two paths: one that it’s blocking as a collider and a back-door path that is confounding the relationship between x and y. What to do?

ggdag_butterfly_bias(edge_type = "diagonal")

The strategy is basically the same as above; we need to control for m to block the back-door path, but that opens up a relationship between a and b, so we need to block that path, too.

ggdag_adjustment_set(butterfly_bias())
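The same question can be put to dagitty directly. The sketch below writes the butterfly structure out by hand in dagitty syntax rather than calling `butterfly_bias()`; the `x -> y` edge is included as the effect under study, which is an assumption of this illustration.

```r
library(dagitty)

# Butterfly (bow-tie) structure: m is a collider on a -> m <- b
# and also a common cause of x and y
butterfly <- dagitty("dag {
  a -> m
  b -> m
  a -> x
  b -> y
  m -> x
  m -> y
  x -> y
}")

# Minimal sufficient adjustment sets for the total effect of x on y
sets <- adjustmentSets(butterfly, exposure = "x", outcome = "y")
sets
```

Every set should contain m, paired with either a or b: m closes the confounding path, and the second variable closes the path that conditioning on m opens.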

### Measurement error

Measurement error is the degree to which we mismeasure a variable, which can lead to bias in a number of ways. If the error depends on the exposure or the outcome (e.g. we measure the exposure with less accuracy for a group without a disease than for a group with it), it is called differential measurement error. If the error has nothing to do with the exposure or outcome, it’s called non-differential measurement error. Under most conditions, non-differential error will bias the estimate of effect towards the null. In this way, it’s at least predictable, but for small effects or inadequately powered studies, it can make a true effect disappear. If there is error in both the exposure and outcome, the errors themselves can also be associated, opening a back-door path between the exposure and outcome.
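Bias towards the null is easy to see in a simulation. The sketch below assumes a continuous exposure, a linear effect with a made-up slope of 2, and classical (additive, independent) error with the same variance as the exposure itself; none of these numbers come from a real study.

```r
set.seed(2018)
n <- 100000

exposure <- rnorm(n)                       # true exposure, variance 1
outcome  <- 2 * exposure + rnorm(n)        # assumed true slope of 2
exposure_observed <- exposure + rnorm(n)   # non-differential error, variance 1

slope_true     <- coef(lm(outcome ~ exposure))[["exposure"]]
slope_observed <- coef(lm(outcome ~ exposure_observed))[["exposure_observed"]]
round(c(true = slope_true, observed = slope_observed), 2)
```

With equal variances for the exposure and its error, the observed slope should land near half the true one: in this linear setting, the attenuation factor is the share of the observed variance that is real signal.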

Let’s consider an example with non-differential error for the outcome and differential error for the exposure. One common situation where this can occur is recall bias. Let’s say we want to know if taking multivitamins in childhood helps protect against bladder cancer later in life.

# set coordinates
coords <- tibble::tribble(
~name,            ~x,  ~y,
"vitamins",          0,   0,
"diagnosed_bc",      1,   1,
"recalled_vits",     0,   1,
"bc_error",          1,   2,
"vits_error",        0,   2
)

recalled_vits ~ vitamins + vits_error,
vitamins = "Childhood Vitamin \nIntake",
recalled_vits = "Memory of \nTaking Vitamins",
bc_error = "Measurement Error, \nDiagnosis",
vits_error = "Measurement Error, \nVitamins"),
coords = coords)
ggdag(bladder_dag, text = FALSE, use_labels = "label")

For the outcome, the bias only depends on how well diagnosis of bladder cancer represents actually having bladder cancer. For the exposure, however, it also depends on whether you have cancer or not: people who are sick tend to spend more time reflecting on what could have caused the illness. Thus, they will remember the vitamins they took as children, on average, a little better than controls. If there is no effect of vitamins on bladder cancer, this dependency will make it seem as if vitamins are a risk for bladder cancer. If it is, in fact, protective, recall bias can reduce or even reverse the association.
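Here is a rough simulation of that recall mechanism. Everything numeric is invented for illustration: vitamins have no effect on cancer, cases recall 90% of the vitamins they truly took and controls only 60%, and diagnosis is treated as error-free to keep the sketch short. `odds_ratio()` is a small helper defined here, not a package function.

```r
set.seed(2018)
n <- 200000

vitamins <- rbinom(n, 1, 0.3)   # true childhood intake
cancer   <- rbinom(n, 1, 0.1)   # independent of vitamins by construction

# Differential error: recall of true intake depends on disease status
recall_prob   <- ifelse(cancer == 1, 0.9, 0.6)
recalled_vits <- vitamins * rbinom(n, 1, recall_prob)

# Hypothetical helper: odds ratio from two 0/1 vectors
odds_ratio <- function(exposure, outcome) {
  tab <- table(exposure, outcome)
  (tab["1", "1"] * tab["0", "0"]) / (tab["1", "0"] * tab["0", "1"])
}

or_true     <- odds_ratio(vitamins, cancer)
or_recalled <- odds_ratio(recalled_vits, cancer)
round(c(true = or_true, recalled = or_recalled), 2)
```

The odds ratio based on true intake should sit near 1, while the one based on recalled intake should be clearly above 1, making a harmless exposure look like a risk factor.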

Measurement error can also cause bias in longitudinal settings. Let’s consider another example from Modern Epidemiology, which uses the Center for Epidemiologic Studies Depression (CES-D) scale, a common tool to assess depression that is known to have measurement error. In this example, the question is whether graduating from college with an honors degree affects how depression changes after graduation. Assuming it doesn’t, adjusting for baseline CES-D score (a common practice) can induce bias via measurement error:

# set coordinates
coords <- tibble::tribble(
~name,              ~x,  ~y,
"honors",            1,   3,
"depression",        2,   3,
"cesd",              2,   2,
"baseline_error",    2,   1,
"depression_change", 3,   3,
"cesd_change",       3,   2,
"followup_error",    3,   1
)

cesd_dag <- dagify(depression ~ honors,
cesd ~ depression + baseline_error,
cesd_change ~ depression_change + followup_error + baseline_error,
labels = c(honors = "Honors Degree",
depression = "Depression",
cesd = "CES-D",
cesd_change = "Change \nin CES-D",
depression_change = "Change in \nDepression",
baseline_error = "Measurement Error, \nBaseline",
followup_error = "Measurement Error, \nFollow-up"),
coords = coords)
cesd_dag %>%
ggdag_dconnected(from = "honors", to = "cesd_change", controlling_for = "cesd",
text = FALSE, use_labels = "label", collider_lines = FALSE)
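The bias can also be reproduced numerically. In this sketch the structure follows the DAG above, but the functional forms are assumptions: everything is linear, honors lowers baseline depression by one unit, and the baseline error enters the change score with a negative sign because the change is follow-up minus baseline.

```r
set.seed(2018)
n <- 200000

honors            <- rbinom(n, 1, 0.5)
depression        <- -1 * honors + rnorm(n)   # assumed effect on baseline depression
baseline_error    <- rnorm(n)
followup_error    <- rnorm(n)
depression_change <- rnorm(n)                 # unrelated to honors by construction

cesd        <- depression + baseline_error
cesd_change <- depression_change + followup_error - baseline_error

unadjusted <- coef(lm(cesd_change ~ honors))[["honors"]]
adjusted   <- coef(lm(cesd_change ~ honors + cesd))[["honors"]]
round(c(unadjusted = unadjusted, adjusted = adjusted), 2)
```

Without adjustment the honors coefficient should be close to zero, as it should be. Once baseline CES-D is in the model, a spurious effect appears, because conditioning on the baseline score links honors to the baseline measurement error.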

### Selection bias

We have already seen a few examples of selection bias, but let’s consider a couple more that are potential pitfalls in common design types. Let’s say we’re doing a case-control study and want to assess the effect of smoking on glioma, a type of brain cancer. We have a group of glioma patients at a hospital and want to compare them to a group of controls, so we pick people in the hospital with a broken bone, since that seems to have nothing to do with brain cancer. However, perhaps there is some unknown confounding between smoking and being in the hospital with a broken bone, like being prone to reckless behavior. In the normal population, there is no causal effect of smoking on glioma, but in our case, we’re selecting on people who have been hospitalized, which opens up a back-door path:

coords <- tibble::tribble(
~name,           ~x,  ~y,
"glioma",         1,   2,
"hospitalized",   2,   3,
"broken_bone",    3,   2,
"reckless",       4,   1,
"smoking",        5,   2
)
dagify(hospitalized ~ broken_bone + glioma,
broken_bone ~ reckless,
smoking ~ reckless,
labels = c(hospitalized = "Hospitalization",
broken_bone = "Broken Bone",
glioma = "Glioma",
reckless = "Reckless \nBehavior",
smoking = "Smoking"),
coords = coords) %>%
ggdag_dconnected("glioma", "smoking", controlling_for = "hospitalized",
text = FALSE, use_labels = "label", collider_lines = FALSE)

Even though smoking doesn’t actually cause glioma, it will appear as if there is an association. Actually, in this case, it may make smoking appear to be protective against glioma, since controls are more likely to be smokers.
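A quick simulation shows the direction of the bias. The probabilities below are invented for illustration, glioma is generated independently of everything else, and hospitalization is simplified to "has glioma or a broken bone". `odds_ratio()` is a small helper defined here, not a package function.

```r
set.seed(2018)
n <- 1000000

reckless     <- rbinom(n, 1, 0.3)
smoking      <- rbinom(n, 1, 0.2 + 0.4 * reckless)
broken_bone  <- rbinom(n, 1, 0.02 + 0.10 * reckless)
glioma       <- rbinom(n, 1, 0.01)    # independent of smoking by construction
hospitalized <- glioma == 1 | broken_bone == 1

# Hypothetical helper: odds ratio from two 0/1 vectors
odds_ratio <- function(exposure, outcome) {
  tab <- table(exposure, outcome)
  (tab["1", "1"] * tab["0", "0"]) / (tab["1", "0"] * tab["0", "1"])
}

or_population   <- odds_ratio(smoking, glioma)
or_hospitalized <- odds_ratio(smoking[hospitalized], glioma[hospitalized])
round(c(population = or_population, hospitalized = or_hospitalized), 2)
```

In the full population the odds ratio should be about 1. Restricted to hospitalized people, it should fall well below 1, since the broken-bone controls are disproportionately reckless and therefore disproportionately smokers.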

Let’s also consider how bias arises in loss-to-follow-up. In a randomized clinical trial or cohort study, the main threat of selection bias is not through who enters the study (although that may affect generalizability) but who leaves it. If loss-to-follow-up is associated with the exposure or outcome, the relationship between the two may be biased. Let’s consider a trial where we are testing a new HIV drug and its effect on CD4 white blood cell count. If the treatment causes symptoms, participants may leave the trial. Similarly, there may be those whose HIV is getting worse and thus more symptomatic, which also may cause people to leave the trial. If we only have information on people who stay in the study, we are stratifying by follow-up status:

dagify(follow_up ~ symptoms,
symptoms ~ new_rx + dx_severity,
cd4 ~ dx_severity,
labels = c(
follow_up = "Follow-Up",
symptoms = "Symptoms",
new_rx = "New HIV Drug",
dx_severity = "Underlying \nHIV Severity",
cd4 = "CD4 Count"
)) %>%
ggdag_adjust("follow_up", layout = "mds", text = FALSE,
use_labels = "label", collider_lines = FALSE)

But follow-up is downstream from a collider, symptoms. Controlling for a descendant of a collider induces bias and, because we only have information on people who remain in the study, we are inadvertently stratifying on follow-up status (see the [vignette introducing DAGs](https://cran.r-project.org/web/packages/ggdag/vignettes/intro-to-dags.html) for more on downstream colliders). Thus, the effect estimate between the HIV drug and CD4 count will be biased.
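To see the size such a bias can take, here is a sketch with invented, linear relationships. The drug is given no effect at all on CD4 count, so the only way treatment and CD4 can become associated is through who stays in the trial.

```r
set.seed(2018)
n <- 200000

new_rx      <- rbinom(n, 1, 0.5)             # randomized treatment
dx_severity <- rnorm(n)                      # underlying HIV severity
symptoms    <- new_rx + dx_severity + rnorm(n)
cd4         <- -dx_severity + rnorm(n)       # drug has no effect by construction
follow_up   <- (symptoms + rnorm(n)) < 1     # more symptomatic people drop out

everyone   <- coef(lm(cd4 ~ new_rx))[["new_rx"]]
completers <- coef(lm(cd4 ~ new_rx, subset = follow_up))[["new_rx"]]
round(c(everyone = everyone, completers = completers), 2)
```

Among everyone randomized, the estimate should be near zero. Among completers, the drug should appear to raise CD4 counts: treated participants who stayed despite drug-related symptoms tend to have milder underlying disease.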

sessionInfo()
## R version 3.4.4 (2018-03-15)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 17134)
##
## Matrix products: default
##
## locale:
##
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base
##
## other attached packages:
## [1] bindrcpp_0.2.2 ggdag_0.1.0    ggplot2_2.2.1
##
## loaded via a namespace (and not attached):
##  [1] tidyselect_0.2.4   xfun_0.1           purrr_0.2.5
##  [4] V8_1.5             colorspace_1.3-2   htmltools_0.3.6
##  [7] viridisLite_0.3.0  yaml_2.1.19        rlang_0.2.1
## [10] pillar_1.2.3       glue_1.2.0         tweenr_0.1.5
## [13] RColorBrewer_1.1-2 bindr_0.1.1        plyr_1.8.4
## [16] stringr_1.3.1      munsell_0.4.3      blogdown_0.6
## [19] gtable_0.2.0       codetools_0.2-15   evaluate_0.10.1
## [22] labeling_0.3       knitr_1.20         forcats_0.3.0
## [25] curl_3.2           dagitty_0.2-2      tidygraph_1.1.0
## [28] Rcpp_0.12.17       udunits2_0.13      scales_0.5.0
## [31] backports_1.1.2    jsonlite_1.5       gridExtra_2.3
## [34] ggforce_0.1.2      digest_0.6.15      stringi_1.1.7
## [37] bookdown_0.7       dplyr_0.7.5        ggrepel_0.8.0
## [40] grid_3.4.4         rprojroot_1.3-2    tools_3.4.4
## [43] magrittr_1.5       lazyeval_0.2.1     tibble_1.4.2
## [46] ggraph_1.0.1       tidyr_0.8.1        pkgconfig_2.0.1
## [49] MASS_7.3-50        assertthat_0.2.0   rmarkdown_1.9
## [52] viridis_0.5.1      R6_2.2.2           boot_1.3-20
## [55] units_0.5-1        igraph_1.2.1       compiler_3.4.4
